pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

failing to compile with GPU=1 #200

Closed zacbayhan closed 7 years ago

zacbayhan commented 7 years ago

Started playing with your project after catching the TED talk, very cool project. I've gotten it to compile with GPU=0; however, when setting GPU=1 I get the following error.

/usr/include/string.h: In function ‘void* __mempcpy_inline(void*, const void*, size_t)’:
/usr/include/string.h:652:42: error: ‘memcpy’ was not declared in this scope
   return (char *) memcpy (__dest, __src, __n) + __n;
                                          ^
compilation terminated due to -Wfatal-errors.
Makefile:88: recipe for target 'obj/convolutional_kernels.o' failed
make: *** [obj/convolutional_kernels.o] Error 1

I'll keep tinkering and see if I can't figure it out just thought it might be worth reporting.

25b3nk commented 7 years ago

First, check whether the CUDA path is correct in the Makefile.

ifeq ($(GPU), 1) 
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif

In the block above, /usr/local/cuda/ is the CUDA path. A newer CUDA 8.0 install will be at /usr/local/cuda-8.0/ instead, so check which one you have.

If you still get the error message after that, run export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}} before make.
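One way to check which CUDA path actually exists is sketched below. It is only an illustration: the function name find_cuda_roots and the default search patterns are assumptions, and it treats a directory as a CUDA install only if it contains include/cuda_runtime.h.

```python
import glob
import os


def find_cuda_roots(patterns=("/usr/local/cuda", "/usr/local/cuda-*")):
    """Return directories matching the patterns that contain include/cuda_runtime.h."""
    roots = []
    for pattern in patterns:
        # glob also accepts a plain path and returns it only if it exists
        for path in sorted(glob.glob(pattern)):
            header = os.path.join(path, "include", "cuda_runtime.h")
            if os.path.isfile(header):
                roots.append(path)
    return roots


if __name__ == "__main__":
    # Prints every candidate CUDA root found under /usr/local (assumed location)
    for root in find_cuda_roots():
        print(root)
```

Whatever path this prints is the one the -I and -L flags in the Makefile would need to point at.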

zacbayhan commented 7 years ago

Thanks, I'll have a look Monday morning. I kind of assumed it was an error on my part

deepakcrk commented 7 years ago

I was getting the same errors. Installing the proper NVIDIA drivers fixed it for me.

loretoparisi commented 7 years ago

@zacbayhan so did you solve this? You can close it then :)

zacbayhan commented 7 years ago

Sorry, I got distracted. I'll look into it tomorrow and check all the paths. I'll close it and reopen it if I run into any issues. Thanks for the reminder.

jazoom commented 6 years ago

I'm trying to install YOLO on Google's Colaboratory and I can't get it to compile with CUDA. I'm not sure what path to add because there isn't a /cuda directory anywhere (checked using find).

Pytorch seems to just work.

How do I get YOLO to work?

zacbayhan commented 6 years ago

Can you run nvidia-smi (Linux) and see whether it shows info on your GPU?
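In a notebook environment the same check can be done from Python. This is a minimal sketch, assuming nvidia-smi is on PATH when a driver is installed; the helper name gpu_summary is made up for illustration, and it returns None when the tool is missing or fails.

```python
import shutil
import subprocess


def gpu_summary(binary="nvidia-smi"):
    """Return nvidia-smi's 'name, driver_version' lines, or None if unavailable."""
    exe = shutil.which(binary)
    if exe is None:
        # The tool is not on PATH, which usually means no NVIDIA driver is visible
        return None
    result = subprocess.run(
        [exe, "--query-gpu=name,driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip().splitlines()


if __name__ == "__main__":
    print(gpu_summary())
```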

jazoom commented 6 years ago

Thanks for the reply. Is this what you were asking for? https://colab.research.google.com/drive/1V65EmJc7h5cf1A_fJsdZ__-B9pW6o3fh

Edit: I figured seeing as I shared this notebook with you I may as well show you the exact implementation for trying to get YOLO working. Feel free to mess around with the commands and run them yourself. This is just a playground to demonstrate this issue.

zacbayhan commented 6 years ago

Driving at the moment so I can't take a good look, but it looks like it's not finding the needed libraries, or they are in a different location. I'll take a better look in a few hours. If you know they are installed, you might try creating a symbolic link at /usr/local/cuda.
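The symbolic link idea can be sketched as follows. The helper name link_cuda is hypothetical, and the real target directory depends on where CUDA is actually installed on the machine, which this thread does not establish for Colab. Creating a link under /usr/local normally requires root.

```python
import os


def link_cuda(target, link_name):
    """Create link_name as a symlink pointing at target, unless link_name exists."""
    if os.path.lexists(link_name):
        # Do not overwrite an existing file, directory, or link
        return False
    # Argument order: the existing directory first, the new link second
    os.symlink(target, link_name)
    return True


# Example call (paths are assumptions, not taken from a real Colab machine):
# link_cuda("/usr/local/cuda-8.0", "/usr/local/cuda")
```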

jazoom commented 6 years ago

I'm not sure what path I would need to create a symbolic link to.

There's no rush. Though, it'll be good to get this sorted out because I'm sure more and more people are going to want to run YOLO on Colaboratory in future.

youngxiao commented 6 years ago

@csbenk Thanks a lot. It really works.

danishnazir commented 6 years ago

Hi @jazoom, it would be great if you could share the notebook that contains the instructions for installing darknet on Colab.

jazoom commented 6 years ago

@danishnazir I never got it working and moved on to other things. I just dug up the relevant code from the notebook. I'll paste it below for anyone who wants it.

# install darknet with YOLOv3
!git clone https://github.com/pjreddie/darknet

import fileinput

PATH_TO_CUDA = '/usr/lib/x86_64-linux-gnu/'

STRINGS_TO_REPLACE = [
    # change the first line of the Makefile to say "GPU=1" instead of "GPU=0" so it will be compiled to use CUDA
    ('GPU=0', 'GPU=1'),
    # point the Makefile's CUDA paths at PATH_TO_CUDA
    ('/usr/local/cuda/', PATH_TO_CUDA),
]

# edit the darknet Makefile in place; with inplace=True, print() writes to the file
for line in fileinput.input('./darknet/Makefile', inplace=True):
    replaced = False
    for search, replace in STRINGS_TO_REPLACE:
        if search in line:
            print(line.rstrip().replace(search, replace))
            replaced = True
            break
    if not replaced:
        print(line.rstrip())
fileinput.close()

# #change the first line of the Makefile to say "GPU=1" instead of "GPU=0" so it will be compiled to use CUDA
# FIRST_LINE = 'GPU=1'
# with open('./darknet/Makefile') as f:
#     lines = f.readlines()
# lines[0] = FIRST_LINE + '\n'
# with open('./darknet/Makefile', 'w') as f:
#     f.writelines(lines)

!cd darknet && make
!echo DONE

danishnazir commented 6 years ago

@jazoom I got it working, thanks for your help though :)

jazoom commented 6 years ago

@danishnazir perhaps you can share how you got it working?

danishnazir commented 6 years ago

Sure: 1) follow the instructions here to install CUDA: https://colab.research.google.com/drive/14OyDrmxzBmkJ8H51iodPE2aXHzCduKJP#scrollTo=bOHa-Sj8ywxn 2) then simply change the darknet Makefile to set GPU=1 and compile it :)

jazoom commented 6 years ago

Oh wow. Install CUDA? I thought CUDA was already installed on Colaboratory. I guess this is to install a second one that isn't tied to TF?

Thanks for sharing.

danishnazir commented 6 years ago

No, CUDA is not installed by default; you have to install it in order to compile darknet. You're welcome :)

jazoom commented 6 years ago

That explains it. But it already is installed somewhere since TF has access to it.

cadip92 commented 6 years ago

First, check whether the CUDA path is correct in the Makefile.

ifeq ($(GPU), 1) 
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif

In the block above, /usr/local/cuda/ is the CUDA path. A newer CUDA 8.0 install will be at /usr/local/cuda-8.0/ instead, so check which one you have.

If you still get the error message after that, run export PATH=/usr/local/cuda-8.0/bin${PATH:+:${PATH}} before make.

Hello! I have installed CUDA 10.0 and cuDNN on Ubuntu 18.04 and also updated my PATH and LD_LIBRARY_PATH as per the paths specified in the Installation Guide. However, when I run make to compile darknet for the GPU, it gives me the error below. I am unable to understand where the exact issue is.

/usr/bin/x86_64-linux-gnu-ld: cannot find -lcuda
collect2: error: ld returned 1 exit status
Makefile:83: recipe for target 'libdarknet.so' failed
make: *** [libdarknet.so] Error 1

Have I done something wrong with the installation?
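For context, the linker satisfies -lcuda by looking for libcuda.so in the -L directories and the system library directories. That library normally comes with the NVIDIA driver rather than with the CUDA toolkit, which fits the earlier comment about installing the proper drivers. A minimal sketch for checking where it is present follows; the directory list is an assumption about common locations, not something confirmed in this thread.

```python
import os


def find_library(filename, directories):
    """Return the directories from the list that contain a file named filename."""
    return [d for d in directories if os.path.isfile(os.path.join(d, filename))]


# Assumed candidate locations; adjust for the machine at hand.
CANDIDATE_DIRS = [
    "/usr/local/cuda/lib64",
    "/usr/local/cuda/lib64/stubs",
    "/usr/lib/x86_64-linux-gnu",
]

if __name__ == "__main__":
    print(find_library("libcuda.so", CANDIDATE_DIRS))
```

If the list comes back empty, the driver library is probably not installed, or it lives somewhere the Makefile's -L flag does not cover.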

Khumayun commented 6 years ago

I have exactly the same problem as @cadip92. Please suggest at least something. I really got stuck with it.

joshinakul commented 5 years ago

I have successfully compiled it with CUDA 10. Things to remember: 1) after cloning darknet, run make as root (sudo) 2) set NVCC = /usr/local/cuda/bin/nvcc

ThegreatShible commented 5 years ago

I had an issue in Google Colab with DarkNet makefile when enabling GPU. I solved it by

  1. Executing everything as root: git clone, make, ... (I don't think this point is relevant, but you never know)

  2. Using this ARCH variable : ARCH= -gencode arch=compute_37,code=sm_37

  3. specifying nvcc path in makefile : NVCC=/usr/local/cuda/bin/nvcc

hope it helps
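The two Makefile changes above (ARCH and NVCC) can also be applied from a script, which is handy in a notebook. The sketch below is an assumption-laden illustration: set_make_var is a made-up helper, and it assumes each variable is assigned with a plain = at the start of a line, possibly continued over several lines with trailing backslashes.

```python
import re


def set_make_var(makefile_text, name, value):
    """Replace the first assignment of `name`, including backslash-continued lines."""
    pattern = re.compile(
        r"^" + re.escape(name) + r"[ \t]*=(?:.*\\\n)*.*$",
        re.MULTILINE,
    )
    # A function replacement avoids backslash-escape processing of `value`
    new_text, count = pattern.subn(lambda m: name + "=" + value, makefile_text, count=1)
    if count == 0:
        raise KeyError(name + " is not assigned in this Makefile text")
    return new_text


if __name__ == "__main__":
    with open("darknet/Makefile") as f:  # path assumes a clone in ./darknet
        text = f.read()
    text = set_make_var(text, "ARCH", " -gencode arch=compute_37,code=sm_37")
    text = set_make_var(text, "NVCC", "/usr/local/cuda/bin/nvcc")
    with open("darknet/Makefile", "w") as f:
        f.write(text)
```

The compute_37 value is the one quoted in the comment; other GPUs need a different compute capability.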

forzagreen commented 5 years ago

$ ln -s /usr/local/cuda-8.0 /usr/local/cuda

This is because the Makefile is referencing /usr/local/cuda:

ifeq ($(GPU), 1)
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif

barresoft commented 5 years ago

My solution using CUDA 10.1:

In the Makefile, replace the GPU block with this:

ifeq ($(GPU), 1)
    COMMON+= -DGPU -I/usr/local/cuda/include/
    CFLAGS+= -DGPU
    #ifeq ($(OS),Darwin) #MAC
    #LDFLAGS+= -L/usr/local/cuda/lib -lcuda -lcudart -lcublas -lcurand
    #else
    LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
    #endif
endif

Run this before make: export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}

andeyeluguo commented 5 years ago

CUDA 9.0 may work. You may need to change /usr/local/cuda to /usr/local/cuda-9.0.

domhel commented 4 years ago

i have successfully compiled it with cuda-10 things to remember:

1. after cloning darknet, make it with root (sudo)

2. set NVCC = /usr/local/cuda/bin/nvcc

this worked for me, thanks

azzageee commented 4 years ago

My solution using CUDA 10.1:

In the Makefile, replace the GPU block with this:

ifeq ($(GPU), 1)
  COMMON+= -DGPU -I/usr/local/cuda/include/
  CFLAGS+= -DGPU
  #ifeq ($(OS),Darwin) #MAC
  #LDFLAGS+= -L/usr/local/cuda/lib -lcuda -lcudart -lcublas -lcurand
  #else
  LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
  #endif
endif

Run this before make: export PATH=/usr/local/cuda/bin${PATH:+:${PATH}}

This worked for CUDA 10.2. Thanks heaps @barresoft!

vokhidovhusan commented 4 years ago

On Ubuntu 20.04 with CUDA 10.1, cuDNN 7.6.5, and TensorFlow 2.3.0: first check where your CUDA is installed.

$whereis cuda
>cuda: /usr/lib/cuda /usr/include/cuda.h

In my case cuda was here: /usr/lib/cuda/.
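Another way to guess the CUDA root is to start from the nvcc binary. The sketch below is a heuristic only: cuda_root_from_nvcc is a hypothetical helper that assumes the layout <root>/bin/nvcc, which holds for the usual /usr/local/cuda installs but may not hold for distribution packages that put nvcc in /usr/bin.

```python
import os
import shutil


def cuda_root_from_nvcc(nvcc_path):
    """Given .../<root>/bin/nvcc, return <root> after resolving symlinks."""
    real = os.path.realpath(nvcc_path)
    return os.path.dirname(os.path.dirname(real))


if __name__ == "__main__":
    nvcc = shutil.which("nvcc")
    print(cuda_root_from_nvcc(nvcc) if nvcc else "nvcc is not on PATH")
```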

So I changed the CUDA paths in the Makefile as follows,

from

ifeq ($(GPU), 1) 
COMMON+= -DGPU -I/usr/local/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/local/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif

to

ifeq ($(GPU), 1) 
COMMON+= -DGPU -I/usr/lib/cuda/include/
CFLAGS+= -DGPU
LDFLAGS+= -L/usr/lib/cuda/lib64 -lcuda -lcudart -lcublas -lcurand
endif

sainisatish commented 3 years ago

I had an issue in Google Colab with DarkNet makefile when enabling GPU. I solved it by

  1. Executing everything as root: git clone, make, ... (I don't think this point is relevant, but you never know)
  2. Using this ARCH variable : ARCH= -gencode arch=compute_37,code=sm_37
  3. specifying nvcc path in makefile : NVCC=/usr/local/cuda/bin/nvcc

hope it helps

Working fine

JDParker714 commented 3 years ago

Ayo, hi there, I'm JD. Here's how you solve this issue.

1 - Run whereis cuda, whereis nvcc, and whereis cudnn to figure out your local paths.
2 - Edit the Makefile and replace the existing paths with these correct paths.
3 - Go to the src folder and locate network_kernels.cu. Find the one mention of cudaStreamCaptureModeGlobal and delete it.
4 - Run sudo make.

This worked for me. If it still doesn't work I can post some screenshots. I'm using CUDA 10.0.

priyankaakre27 commented 3 years ago

gcc -Iinclude/ -I3rdparty/stb/include -DOPENCV pkg-config --cflags opencv4 2> /dev/null || pkg-config --cflags opencv -DGPU -I/usr/local/cuda/include/ -DCUDNN -DCUDNN_HALF -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -fPIC -Ofast -DOPENCV -DGPU -DCUDNN -I/usr/local/cudnn/include -DCUDNN_HALF -fPIC -c ./src/convolutional_layer.c -o obj/convolutional_layer.o
./src/convolutional_layer.c: In function ‘cudnn_convolutional_setup’:
./src/convolutional_layer.c:286:24: error: ‘CUDNN_CONVOLUTION_FWD_PREFER_FASTEST’ undeclared (first use in this function); did you mean ‘CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3’?
     int forward_algo = CUDNN_CONVOLUTION_FWD_PREFER_FASTEST;
                        ^~~~~~~~
                        CUDNN_CONVOLUTION_BWD_FILTER_ALGO_3
compilation terminated due to -Wfatal-errors.
Makefile:162: recipe for target 'obj/convolutional_layer.o' failed
make: *** [obj/convolutional_layer.o] Error 1
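An undeclared CUDNN_CONVOLUTION_FWD_PREFER_FASTEST usually points to a cuDNN version mismatch: that enum belongs to an older cuDNN API and, to the best of current knowledge, is no longer present in cuDNN 8 and later. A quick way to see which major version is installed is to read the CUDNN_MAJOR define from the cuDNN headers. The sketch below assumes the header lives under the /usr/local/cudnn/include path shown in the log, in cudnn_version.h (newer releases) or cudnn.h (older ones); the helper name cudnn_major is made up.

```python
import os
import re


def cudnn_major(header_text):
    """Return the integer after '#define CUDNN_MAJOR', or None if it is absent."""
    match = re.search(
        r"^\s*#\s*define\s+CUDNN_MAJOR\s+(\d+)", header_text, re.MULTILINE
    )
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    include_dir = "/usr/local/cudnn/include"  # taken from the compile line above
    for name in ("cudnn_version.h", "cudnn.h"):
        path = os.path.join(include_dir, name)
        if os.path.isfile(path):
            with open(path) as f:
                print(name, cudnn_major(f.read()))
```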

varungupta31 commented 1 year ago

To everyone still subscribed to this thread, please help out with my issue #2605, TIA.