AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet )
http://pjreddie.com/darknet/

Darknet does not compile on Kaggle #6163

Closed by ccuetoh 4 years ago

ccuetoh commented 4 years ago

Hi all,

I'm trying to compile (with make) Darknet on a Kaggle Notebook, but I keep running into this problem:

/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libcuda.so: file not recognized: File truncated
collect2: error: ld returned 1 exit status
Makefile:161: recipe for target 'darknet' failed
make: *** [darknet] Error 1

CUDA version 10.1 is installed, and the host system is Ubuntu Linux. I'm running this script for the installation:

git clone https://github.com/AlexeyAB/darknet

cd darknet

sed -i 's/OPENCV=0/OPENCV=1/' Makefile
sed -i 's/GPU=0/GPU=1/' Makefile
sed -i 's/CUDNN=0/CUDNN=1/' Makefile
sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile

make

As per this example mentioned in the readme: https://colab.research.google.com/drive/12QusaaRj_lUwCGDvQNfICpa7kA7_a2dE#scrollTo=xym8_m8CIyXK

Any help is welcome!

AlexeyAB commented 4 years ago

It seems that the file libcuda.so is broken. Can you reinstall CUDA?

Show output of commands:

nvcc --version
gcc --version
nvidia-smi
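
Alongside those commands, a coarse check of the file named in the linker error can show whether it is empty or not an ELF object at all. The following is only a sketch in Python (the notebook's language), not part of Darknet; the helper name describe_shared_object is made up for illustration, and a file can pass this check and still be truncated further in.

```python
import os

ELF_MAGIC = b"\x7fELF"  # first four bytes of any ELF shared object


def describe_shared_object(path):
    """Return a short status string for the file at `path` (illustrative helper)."""
    if not os.path.exists(path):  # also False for a dangling symlink
        return "missing"
    size = os.path.getsize(path)
    if size == 0:
        return "empty"
    with open(path, "rb") as f:
        header = f.read(4)
    if header != ELF_MAGIC:
        return "not an ELF file"
    return f"ELF header present, {size} bytes"


# Path copied from the linker error in the report; normpath collapses the "../" parts.
reported = "/usr/lib/gcc/x86_64-linux-gnu/7/../../../x86_64-linux-gnu/libcuda.so"
resolved = os.path.normpath(reported)
print(resolved, "->", describe_shared_object(resolved))
```

If this reports "empty" or "missing", the library the linker is picking up is unusable and would need to be replaced rather than relinked.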

dsbyprateekg commented 4 years ago

@CamiloHernandez I faced the same issue while building Darknet in a Kaggle notebook for the Wheat Head Detection challenge. [Please note that internet access is not allowed in that challenge.]

As a quick workaround, I built the darknet repo in Colab (using the link you already mentioned), downloaded the compiled Darknet folder from there, and uploaded it to my Kaggle notebook as input data.

But in this case we get a /bin/sh: 1: ./darknet: Permission denied error when running the !./darknet detector test.. command.

I tried using !chmod +x darknet, but it says chmod: changing permissions of 'darknet': Read-only file system.

Because of the above issue I am unable to run the !./darknet detector test... command there. It seems we need to wait for the next OpenCV release to use YOLOv4, but that release is not confirmed yet and this competition ends in 24 days :(

@AlexeyAB I have printed the versions: [nvcc, nvidia, and gcc output images]

AlexeyAB commented 4 years ago

> Because of the above issue I am unable to run the !./darknet detector test... command there. It seems we need to wait for the next OpenCV release to use YOLOv4, but that release is not confirmed yet and this competition ends in 24 days :(

make -j4
sudo make install
sudo ldconfig

or just run this script that will download and install the OpenCV master branch: https://raw.githubusercontent.com/ceccocats/tkDNN/master/scripts/install_OpenCV4.sh

ccuetoh commented 4 years ago

@dsbyprateekg Yup, I'm trying to do the same competition. Thanks for the heads up, I did not realize that no internet access was allowed.

@AlexeyAB Thanks for the suggestions. I will take another stab at it during the coming days and report back, though it's looking rather grim because of the limitations of the provided environment.

dsbyprateekg commented 4 years ago

@CamiloHernandez Let me know too if you succeed in making the submission file. I tried everything but failed to make it. I was even frustrated to see how easily people are making submissions in PyTorch using ultralytics/yolov5 :(

AlexeyAB commented 4 years ago

@WongKinYiu Hi, which YOLOv4 models for ultralytics/yolov5 or yolov3 can we share publicly? Or for another PyTorch implementation?

WongKinYiu commented 4 years ago

@AlexeyAB

I will share the yolov4-leaky and yolov4-csp-leaky models, which are mentioned in the model zoo. The code is not yet fully sorted out, but okay, I can release the preview version today.

ccuetoh commented 4 years ago

Sadly, after giving it some thought and experimenting a bit, I'm fairly sure there is no way of running Darknet on a Kaggle Notebook under the conditions of this competition:

  1. If you want to compile Darknet inside the notebook, then you'll need to download CUDA and OpenCV to fix the issues mentioned earlier, plus, obviously, clone this repo. This is disallowed, since a requirement is that no internet connection is used.
  2. If you try to circumvent this problem by passing either a compiled version or any of the required dependencies as a dataset (which is allowed), then they'll be read-only and can't be executed.

The only option that I can come up with is a Python script (cp won't work) that reads the files one by one and dumps them in the working directory so they can be executed.
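
A minimal sketch of what such a script could look like, assuming the prebuilt folder was uploaded as a dataset. The helper name copy_and_make_executable and the dataset name darknet-compiled are hypothetical, and whether the copied binary actually runs on Kaggle is not confirmed anywhere in this thread.

```python
import os
import shutil
import stat


def copy_and_make_executable(src_dir, dst_dir, executables=("darknet",)):
    """Copy src_dir to dst_dir file by file; add the owner-execute bit to named files."""
    for root, _dirs, files in os.walk(src_dir):
        rel = os.path.relpath(root, src_dir)
        target_root = os.path.normpath(os.path.join(dst_dir, rel))
        os.makedirs(target_root, exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            dst = os.path.join(target_root, name)
            # copyfile copies only the bytes, not the source's permission bits,
            # so the copy is an ordinary new file in the writable destination.
            shutil.copyfile(src, dst)
            if name in executables:
                os.chmod(dst, os.stat(dst).st_mode | stat.S_IXUSR)
    return dst_dir


# Hypothetical locations: a dataset named "darknet-compiled" mounted read-only
# under /kaggle/input, and the notebook's writable working directory.
SRC = "/kaggle/input/darknet-compiled/darknet"
DST = "/kaggle/working/darknet"
if os.path.isdir(SRC):
    copy_and_make_executable(SRC, DST)
```

Only the files listed in executables get the execute bit; configuration and weight files are copied unchanged.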

Thanks either way for the help!

I will close this issue for now.

WongKinYiu commented 4 years ago

@AlexeyAB

YOLOv4: ultralytics/yolov3 based PyTorch implementation

Model           Test Size   APval   AP50val   AP75val   APSval   APMval   APLval
YOLOv4pacsp-s   736         36.0%   54.2%     39.4%     18.7%    41.2%    48.0%
YOLOv4pacsp     736         46.4%   64.8%     51.0%     28.5%    51.9%    59.5%

YOLOv4: ultralytics/yolov5 based PyTorch implementation

Model           Test Size   APval   AP50val   AP75val   APSval   APMval   APLval
YOLOv4pacsp-s   736         38.9%   58.0%     42.1%     22.3%    44.0%    49.3%
YOLOv4pacsp     736         46.9%   66.0%     51.2%     29.7%    52.7%    59.6%
YOLOv4pacsp-x   736         48.6%   67.3%     53.2%     32.1%    54.0%    62.2%

AlexeyAB commented 4 years ago

@WongKinYiu Thanks! So YOLOv4 for ultralytics/yolov5 has both higher AP (+2) and higher AP50 (+3)?

WongKinYiu commented 4 years ago

@AlexeyAB

No, AP (+0.5) and AP50 (+1.2). I think you compared the results with the x version.
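
The two readings can be checked against the tables posted above. This is just the arithmetic, with the APval and AP50val values copied from the YOLOv4pacsp and YOLOv4pacsp-x rows; the variable and function names are illustrative.

```python
# APval and AP50val values copied from the tables above (test size 736).
yolov3_based_pacsp = {"AP": 46.4, "AP50": 64.8}
yolov5_based_pacsp = {"AP": 46.9, "AP50": 66.0}
yolov5_based_pacsp_x = {"AP": 48.6, "AP50": 67.3}


def delta(baseline, other):
    """Difference in percentage points, rounded to one decimal place."""
    return {k: round(other[k] - baseline[k], 1) for k in baseline}


# Same model (YOLOv4pacsp) on the two code bases.
print(delta(yolov3_based_pacsp, yolov5_based_pacsp))    # {'AP': 0.5, 'AP50': 1.2}
# yolov3-based YOLOv4pacsp against the larger yolov5-based x model.
print(delta(yolov3_based_pacsp, yolov5_based_pacsp_x))  # {'AP': 2.2, 'AP50': 2.5}
```

The second comparison gives roughly the +2 and +3 from the question, which fits the explanation that the x row was used.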

AlexeyAB commented 4 years ago

@CamiloHernandez I think that you are doing something wrong.

If you can only use PyTorch, then you can use these YOLOv4 models in a PyTorch implementation: https://github.com/WongKinYiu/PyTorch_YOLOv4 https://github.com/WongKinYiu/PyTorch_YOLOv4/tree/u5_preview

> If you want to compile Darknet inside the notebook, then you'll need to download CUDA and OpenCV to fix the issues mentioned earlier, plus, obviously, clone this repo. This is disallowed, since a requirement is that no internet connection is used.

Most neural network frameworks require CUDA: Darknet, PyTorch, TensorFlow...

> If you try to circumvent this problem by passing either a compiled version or any of the required dependencies as a dataset (which is allowed), then they'll be read-only and can't be executed.

I use Darknet in read-only mode successfully.

AlexeyAB commented 4 years ago

@WongKinYiu

I just see AP and AP50 in your posted table. It seems Test Size should be different for each row.

WongKinYiu commented 4 years ago

@AlexeyAB

Results are obtained by different models with the same test size. [image]

ccuetoh commented 4 years ago

@AlexeyAB

> I think that you are doing something wrong.
>
> If you can only use PyTorch, then you can use these YOLOv4 models in a PyTorch implementation: https://github.com/WongKinYiu/PyTorch_YOLOv4 https://github.com/WongKinYiu/PyTorch_YOLOv4/tree/u5_preview

I can't use the internet at any step of the process. Thus pip (or git clone) is not an option. You can only work with the pre-installed libraries and dependencies. I will take a look at it anyway, using the second path I described earlier.

> I use Darknet in read-only mode successfully.

As per the response from @dsbyprateekg:

> But in this case we get a /bin/sh: 1: ./darknet: Permission denied error when running the !./darknet detector test.. command.
>
> I tried using !chmod +x darknet, but it says chmod: changing permissions of 'darknet': Read-only file system.

I guess the problem stems from the fact that not only is darknet read-only, but also the whole directory.

AlexeyAB commented 4 years ago

@WongKinYiu Oh, it’s so small, I didn’t notice even after you noticed it) Thanks)

AlexeyAB commented 4 years ago

@CamiloHernandez

The reason is that you can copy any library, but you don't have permission even to read it. In that case you can't use any code, framework, or library from the internet, which is very strange.

> But in this case we get a /bin/sh: 1: ./darknet: Permission denied error when running the !./darknet detector test.. command.

dsbyprateekg commented 4 years ago

@AlexeyAB I searched on Kaggle and found that many challenges restrict internet usage for the following reasons:

  1. Kaggle kernels are run against a hidden test set; it would no longer be hidden if internet access were allowed.
  2. Reproducibility is much better without internet (you don't suffer from random version upgrades or github/pypi going down).
  3. To avoid any malfunction of the competition website.

I am sharing a notebook submitted with ultralytics/yolov5 + PyTorch that currently has one of the top scores (0.7488) on the leaderboard: https://www.kaggle.com/orkatz2/yolov5-fake-or-real-single-model-l-b-0-753/notebook

Maybe you can get an idea from the above notebook and share some insights with us. I am very eager to make a submission using AlexeyAB/darknet/YOLOv4, since nobody is using it in the challenge. I have already trained on the dataset with it. I only need to make the predictions on the test images and prepare the submission file as the challenge requires.

AlexeyAB commented 4 years ago

@dsbyprateekg

> I am sharing a notebook submitted with ultralytics/yolov5 + PyTorch that currently has one of the top scores (0.7488) on the leaderboard: https://www.kaggle.com/orkatz2/yolov5-fake-or-real-single-model-l-b-0-753/notebook

> I searched on Kaggle and found that many challenges restrict internet usage for the following reasons:

> Kaggle kernels are run against a hidden test set; it would no longer be hidden if internet access were allowed.

I don’t have time to study the intricacies of Kaggle competitions and explain step by step how to load your program so that it has permission to run. In any case, there is a way to upload your source or binary code to the server and run it there, at least in read-only mode, without the Permission denied error; otherwise you would not be able to run anything there.

dsbyprateekg commented 4 years ago

@AlexeyAB Thanks, I understand your point completely, which is why I am sticking with your repo whether or not I am able to submit :) I have posted the link you mentioned in the Kaggle discussion thread so that more people will know about this and hopefully use this repo. https://www.kaggle.com/c/global-wheat-detection/discussion/163433

guitarmind commented 4 years ago

Hi @CamiloHernandez @AlexeyAB, FYI, I found a working solution for running darknet on Kaggle: https://www.kaggle.com/markpeng/darknet-gpu-on-kaggle

The Dockerfile that I used to prebuild darknet: https://github.com/guitarmind/dockerfiles/blob/master/deepo/18.04/Dockerfile https://hub.docker.com/repository/docker/guitarmind/deepo