pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

STB Reason: can't fopen #174

Open saipraneethd-zz opened 7 years ago

saipraneethd-zz commented 7 years ago

I am training Darknet YOLO on an Amazon EC2 p2.xlarge instance. Kindly help me with this error.

My Makefile:

GPU=1
CUDNN=1
OPENCV=0
OPENMP=0
DEBUG=0

Command used:

./darknet detector train data/obj.data yolo-obj.cfg darknet19_448.conv.23

[image: capture]

jinyu121 commented 7 years ago

  1. Do these image files exist?
  2. Try an absolute image path.
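
The first check can be scripted. The following is a minimal Python sketch, not part of Darknet: it assumes the image list (for example the train.txt referenced from obj.data) holds one path per line, and the function name missing_images is made up for illustration.

```python
import os

def missing_images(list_file):
    """Return the paths listed in list_file that do not exist on disk.

    missing_images is a hypothetical helper, not a Darknet function.
    """
    missing = []
    # newline="" turns off newline translation, so a Windows '\r' stays
    # attached to the path and shows up when the result is printed with repr().
    with open(list_file, "r", newline="") as fh:
        for line in fh:
            path = line.rstrip("\n")
            if path and not os.path.isfile(path):
                missing.append(path)
    return missing
```

Calling missing_images on the list file and printing each entry with repr() shows which paths cannot be found, and whether a path carries an invisible trailing character.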

saipraneethd-zz commented 7 years ago

I found the solution. The .txt label file for the corresponding images contained "Infinite Infinite Infinite Infinite", so it wasn't working.
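
A label file like that can be caught before training. Here is a rough Python sketch, under the assumption that each label line has the usual Darknet form "class x_center y_center width height" with the four box values normalized to the range 0 to 1; bad_label_lines is a hypothetical helper, not a Darknet function.

```python
import math

def bad_label_lines(label_file):
    """Return (line_number, text) pairs for lines that do not look like valid labels.

    Assumed format per line: class x_center y_center width height,
    with the four box values finite and inside [0, 1].
    """
    bad = []
    with open(label_file) as fh:
        for number, line in enumerate(fh, start=1):
            text = line.strip()
            if not text:
                continue  # ignore blank lines
            parts = text.split()
            try:
                class_id = int(parts[0])
                box = [float(value) for value in parts[1:]]
            except ValueError:
                # e.g. the word "Infinite", which is neither an int nor a float
                bad.append((number, text))
                continue
            valid = (
                class_id >= 0
                and len(box) == 4
                and all(math.isfinite(v) and 0.0 <= v <= 1.0 for v in box)
                and box[2] > 0
                and box[3] > 0
            )
            if not valid:
                bad.append((number, text))
    return bad
```

An empty result means every non-blank line passed these checks; it does not prove the boxes are correct, only that they are well-formed.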

feiyunzhang commented 7 years ago

You can also compile Darknet with OpenCV enabled (OPENCV=1 in the Makefile); then this problem will be solved.

deepkshikha commented 6 years ago

Hi, I am getting a similar error:

ubuntu@ip-172-30-6-221:~/darknet$ ./darknet yolo train cfg/yolo.cfg extraction.conv.weights
Cannot load image "data/labels/crater.png"
STB Reason: can't fopen

I don't have any PNG file in the labels folder. Please suggest a fix. Thanks in advance.

[image: capture9]

TheMikeyR commented 6 years ago

@deepkshikha when training you should use this format

./darknet detector train path/to/datafile.data path/to/network.cfg path/to/weights 

Example:

./darknet detector train cfg/voc.data cfg/yolo.cfg darknet19_448.conv.23

The example.data file should have this format:

classes= 20
train  = <path-to-voc>/train.txt
valid  = <path-to-voc>/2007_test.txt
names = data/voc.names
backup = backup

Since you are not pointing to any data file, it is using some default test scenario from the source code, and that seems to cause your error. More info on the website https://pjreddie.com/darknet/yolo/
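
To rule out a missing or mistyped entry in the data file, something like the Python sketch below could be used. It assumes the simple key = value layout shown above; read_data_file and missing_entries are illustrative names, and skipping lines that start with # is an assumption of this sketch.

```python
import os

def read_data_file(path):
    """Parse 'key = value' lines into a dict (hypothetical helper)."""
    options = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            # Assumption: blank lines, '#' comments and lines without '=' are skipped.
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            options[key.strip()] = value.strip()
    return options

def missing_entries(options, keys=("train", "valid", "names")):
    """Return the keys that are absent or do not point to an existing file."""
    return [key for key in keys if not os.path.isfile(options.get(key, ""))]
```

Relative paths are resolved against the directory the script is run from, so it makes sense to run it from the same directory as ./darknet.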

TheMikeyR commented 6 years ago

@deepkshikha I'm not sure what you are trying to do exactly or what the new error you have is? But if you follow the guide on https://pjreddie.com/darknet/yolo/ you will most definitely be able to get training running. If you want to use your own dataset I can recommend this guide https://timebutt.github.io/static/how-to-train-yolov2-to-detect-custom-objects/

deepkshikha commented 6 years ago

Sorry for the comment above; I posted the same comment again by mistake. Here is the error that I am getting:

ubuntu@ip-172-30-6-221:~/darknet$ ./darknet detector train cfg/crater.data cfg/yolo-voc.cfg darknet_448.conv.22
Not an option: detector

deepkshikha commented 6 years ago

I am trying on my own dataset only

TheMikeyR commented 6 years ago

Can you try to run the examples from the "Detection Using A Pre-Trained Model" section of https://pjreddie.com/darknet/yolo/? If that works, try training on VOC, also following the guide on the website. If that works too, then there is something wrong with your custom data. Report back if you get stuck.

deepkshikha commented 6 years ago

Thanks for response ...start training with our own data http://guanghan.info/blog/en/my-works/train-yolo/ ... below is the error I am getting

TheMikeyR commented 6 years ago

You didn't post any error @deepkshikha

deepkshikha commented 6 years ago

Hey, I have added the image file. I am trying this: http://guanghan.info/blog/en/my-works/train-yolo/

I ran "./darknet train cfg/crater.data cfg/yolo.cfg extraction.conv.weights" and got a segmentation fault (core dumped). This is the configuration I am using:

GPU=1
CUDNN=1
OPENCV=0
DEBUG=1

TheMikeyR commented 6 years ago

If you can't run the example or the guide above then there is something wrong with your libraries, try to reinstall cuda. Also I see you are missing CUDA=1 in your configuration, you need this to be able to use CUDNN flag.

deepkshikha commented 6 years ago

Training has finished, but after training it is not predicting any bounding box for the test image. It appears as below:

SteveIb commented 6 years ago

It might be useful to share my experience: I prepared my data on Windows and then ran it on macOS.

I ran the command cat -v train.txt and got the following:

/Images/000/ZT135_15_A_4_75001000.jpg^M
/Images/000/ZT135_15_A_4_710001000.jpg^M
/Images/000/ZT76_17_A_1_800.jpg^M

I removed the ^M with the following command:

tr -d '\r' < input.file > output.file

So, when you are moving from Windows to Mac or vice versa, take care of the carriage return and newline characters.
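
Where tr is not at hand, the same clean-up can be sketched in Python. This is only an illustration of the idea: strip_carriage_returns is a made-up name, and it removes every carriage-return byte, just as tr -d '\r' does.

```python
def strip_carriage_returns(src, dst):
    """Copy src to dst without carriage-return bytes; return how many were removed.

    Hypothetical helper mirroring: tr -d '\r' < src > dst
    """
    with open(src, "rb") as fh:
        data = fh.read()
    cleaned = data.replace(b"\r", b"")
    with open(dst, "wb") as fh:
        fh.write(cleaned)
    return len(data) - len(cleaned)
```

Because the whole file is read before anything is written, src and dst may be the same path here. That differs from a shell redirection of a file onto itself, which truncates the file before it is read.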

Here is the comment which helped me:

"Shai, 3 months 14 days ago: The UTF-8 didn’t help, but your post gave me the idea to check for similar things. I was editing the text file in notepad++ on windows, so it used the Windows CR LF system, changed it to Unix LF and it worked! thanks!!"

I thought that would be helpful since it consumed my time!!

enriqueav commented 6 years ago

I had the same issue commented by @SteveIb but the tr command didn't work for me on Mac OS. This is what did the trick in vi

:set fileformat=unix

And save the file.

varenaggarwal commented 6 years ago

Can I train on the dataset using the CPU alone?

deepkshikha commented 6 years ago

Yes @varen27, set GPU and CUDNN to 0 in the Makefile and run; it will take longer, though.

varenaggarwal commented 6 years ago

@deepkshikha they were already 0 but still I am facing this error

deepkshikha commented 6 years ago

What error are you getting? I have trained on both successfully.

varenaggarwal commented 6 years ago

@deepkshikha

Loading weights from darknet19_448.conv.23...Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing 384
Cannot load image "/data/obj/pic18.JPG"
STB Reason: can't fopen

It shows this and then the program terminates.

deepkshikha commented 6 years ago

Please check the image directory paths in train.txt and test.txt. This is not a CPU error; it occurs because the image cannot be loaded. Please check the path, and check whether OpenCV is installed properly and can be imported. If it still doesn't run, check image.c.

varenaggarwal commented 6 years ago

Since I'm training on the CPU, do I need to install CUDA?

TheMikeyR commented 6 years ago

@varen27 no, also if you don't have an nvidia gpu you can't install cuda.

deepkshikha commented 6 years ago

No need to install CUDA; only OpenCV needs to be installed.

varenaggarwal commented 6 years ago

@deepkshikha thanks, I was able to start training but I can't make sense of the output. Could you please help me out:

50: nan, nan avg loss, 0.000000 rate, 518.781077 seconds, 3200 images
Resizing 480 x 480
Loaded: 14.141070 seconds
Region 82 Avg IOU: nan, Class: -nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 29
Region 94 Avg IOU: nan, Class: nan, Obj: -nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 34
Region 106 Avg IOU: -nan(ind), Class: -nan(ind), Obj: -nan(ind), No Obj: nan, .5R: -nan(ind), .75R: -nan(ind), co
Region 82 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 9
Region 94 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 60
Region 106 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 81
Region 82 Avg IOU: nan, Class: -nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 26
Region 94 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 32
Region 106 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 9
Region 82 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 57
Region 94 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 69
Region 106 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 4
Region 82 Avg IOU: nan, Class: nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 47
Region 94 Avg IOU: nan, Class: nan, Obj: -nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 48
Region 106 Avg IOU: nan, Class: nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 7
Region 82 Avg IOU: nan, Class: -nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 35
Region 94 Avg IOU: nan, Class: -nan, Obj: nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 52
Region 106 Avg IOU: nan, Class: -nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 1
Region 82 Avg IOU: nan, Class: nan, Obj: nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 52
Region 94 Avg IOU: nan, Class: nan, Obj: -nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 37
Region 106 Avg IOU: nan, Class: nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 90
Region 82 Avg IOU: nan, Class: -nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 14
Region 94 Avg IOU: -nan, Class: -nan, Obj: nan, No Obj: -nan, .5R: 0.000000, .75R: 0.000000, count: 26
Region 106 Avg IOU: nan, Class: nan, Obj: -nan, No Obj: nan, .5R: 0.000000, .75R: 0.000000, count: 1

deepkshikha commented 6 years ago

What do the image folder and annotations look like? Make sure they are correct. A few nan values are expected, but since you are getting nan in every case, there is some error in the dataset.
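
To see at which iteration the loss first turned nan, a saved training log can be scanned. A minimal Python sketch, assuming one status message per line and the "<iteration>: <loss>, <avg loss> avg ..." layout seen in the output above; first_nan_iteration is a hypothetical helper.

```python
import re

# Matches status lines such as:
#   50: nan, nan avg loss, 0.000000 rate, 518.781077 seconds, 3200 images
STATUS_LINE = re.compile(r"^\s*(\d+):\s*(\S+),\s*(\S+)\s+avg")

def first_nan_iteration(log_lines):
    """Return the first iteration whose loss or average loss is nan, else None."""
    for line in log_lines:
        match = STATUS_LINE.match(line)
        if match is None:
            continue  # skip "Region ..." and other non-status lines
        loss_text = (match.group(2) + " " + match.group(3)).lower()
        if "nan" in loss_text:
            return int(match.group(1))
    return None
```

The function accepts any iterable of lines, so an open log file can be passed to it directly.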

snadgauda commented 6 years ago

The first thing you should check is that you have permission to edit/read the files necessary.

To check the permissions of files in the current directory use the command ls -l. Any files that have ---------- next to them are not currently accessible.

To allow access to these files, use the command chmod 777 <file_name>. For example, if you cannot access obj.data, use chmod 777 obj.data. To change permissions for every file in your directory, use chmod 777 -R . (the trailing dot means the current directory). Then use ls -l again to ensure that you now have access.
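
Before changing any modes, it may help to confirm that permissions are actually the problem. The Python sketch below is one way to do that, assuming it is run as the same user that runs ./darknet; unreadable_files is an illustrative name.

```python
import os

def unreadable_files(paths):
    """Return the paths that exist but cannot be read by the current user.

    Hypothetical helper; paths that do not exist are not reported here,
    since that is a different problem from a permission problem.
    """
    return [p for p in paths if os.path.exists(p) and not os.access(p, os.R_OK)]
```

If the result is empty for the data file, the list files and a few images, the fopen failure most likely has another cause, such as a wrong path or stray line endings.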

Changing the files to include the absolute path worked for me on a Mac. However, it did not solve the issue on Windows.

On Windows, I was running into the same error and the issue turned out to be the end-of-line sequences. Make sure that the end-of-line sequences are "\n" and not "\r\n". In a text editor (like Visual Studio Code), make sure you have LF and not CRLF, and also that the file encoding is UTF-8.

Side Note: I also switched to this version of dark net https://github.com/pengdada/darknet-win-linux as it worked better on Linux.

LolikaPadmanbhan commented 6 years ago

Hey, I want to use Darknet for image classification and am compiling Darknet with GPU support. When I try to run it with one test image, it gives me the error below:

GNKO-Train:~/Darket/darknet$ ./darknet -i 0 test kite.jpg cfg/alexnet.cfg alexnet.weights
Cannot load image "kite.jpg"
STB Reason: can't fopen

Kindly help me with this. Thanks!

deepkshikha commented 6 years ago

@LolikaPadmanbhan check the path of kite.jpg in your system

LolikaPadmanbhan commented 6 years ago

@deepkshikha hey solved it.. it has to be data/kite.jpg i was missing that data/. now am able to run.. anyways thanks for the response.

deepkshikha commented 6 years ago

@LolikaPadmanbhan :)

TingtingAlice commented 6 years ago

Hey, I ran into the following problem when running make:

/usr/bin/ld: skipping incompatible /usr/local/cuda/lib64/libcudnn.so when searching for -lcudnn
libdarknet.a(convolutional_layer.o): In function `cudnn_convolutional_setup':
convolutional_layer.c:(.text+0xcbc): undefined reference to `cudnnSetConvolutionGroupCount'
collect2: error: ld returned 1 exit status
Makefile:76: recipe for target 'darknet' failed
make: *** [darknet] Error 1

How do I fix that? Help me!

TingtingAlice commented 6 years ago

When running on the GPU:

102 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
103 conv 128 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 128 0.177 BFLOPs
104 conv 256 3 x 3 / 1 52 x 52 x 128 -> 52 x 52 x 256 1.595 BFLOPs
105 conv 255 1 x 1 / 1 52 x 52 x 256 -> 52 x 52 x 255 0.353 BFLOPs
106 detection
Loading weights from darknet53.conv.74...Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing 608
Floating point exception (core dumped)

TingtingAlice commented 6 years ago

For the GPU run, I modified the Makefile like this:

GPU=1
CUDNN=0
OPENCV=1
OPENMP=0
DEBUG=0

There were no errors when building YOLOv3. But when I run ./darknet detector train cfg/Det.data cfg/yolov3-det.cfg darknet53.conv.74, it fails:

Loading weights from darknet53.conv.74...Done!
Learning Rate: 0.001, Momentum: 0.9, Decay: 0.0005
Resizing 352
"annot load image "/home/dataset/Det_datasets/yolo/train/images/15_109.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/14_110.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/120_124.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/5_24.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/26_18.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/9_240.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/180_51.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/27_23.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/100_131.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/140_41.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/35_87.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/24_244.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/191_92.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/43_64.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/120_108.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/145_121.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/72_330.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/12_4.jpg
"annot load image "/home/dataset/Det_datasets/yolo/train/images/113_110.jpg
Couldn't open file: /home/dataset/Det_datasets/yolo/train/labels/35_87.txt
Couldn't open file: /home/dataset/Det_datasets/yolo/train/labels/180_51.txt
Couldn't open file: /home/dataset/Det_datasets/yolo/train/labels/140_41.txt
Couldn't open file: /home/dataset/Det_datasets/yolo/train/labels/24_244.txt
Error in `./darknet': double free or corruption (fasttop): 0x0000000000e00270
======= Backtrace: =========
Couldn't open file: /home/dataset/Det_datasets/yolo/train/labels/43_64.txt
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f497ccac37a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f497ccb053c]
/usr/lib/x86_64-linux-gnu/libopencv_highgui.so.2.4(_ZN9ImplMutex7destroyEv+0x14)[0x7f4984e06e14]
/lib/x86_64-linux-gnu/libc.so.6(+0x39ff8)[0x7f497cc65ff8]
/lib/x86_64-linux-gnu/libc.so.6(+0x3a045)[0x7f497cc66045]
./darknet[0x42376b]
./darknet[0x45c5e1]
./darknet[0x45da36]
./darknet[0x46054a]
./darknet[0x461d86]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f497cffd6ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f497cd3341d]
======= Memory map: ========
00400000-005c6000 r-xp 00000000 fd:01 1988524 /home/workspace/darknet-master/darknet
007c5000-007c6000 r--p 001c5000 fd:01 1988524 /home/workspace/darknet-master/darknet
007c6000-007c7000 rw-p 001c6000 fd:01 1988524 /home/workspace/darknet-master/darknet
00dcc000-0a5e8000 rw-p 00000000 00:00 0 [heap]
200000000-200100000 rw-s 00000000 00:06 464 /dev/nvidiactl
200100000-200104000 rw-s 00000000 00:06 464 /dev/nvidiactl
200104000-200120000 ---p 00000000 00:00 0
200120000-200520000 rw-s 00000000 00:06 464 /dev/nvidiactl
200520000-200524000 rw-s 00000000 00:06 464 /dev/nvidiactl
200524000-200540000 ---p 00000000 00:00 0
200540000-200940000 rw-s 00000000 00:06 464 /dev/nvidiactl
200940000-200944000 rw-s 00000000 00:06 464 /dev/nvidiactl
200944000-200960000 ---p 00000000 00:00 0
200960000-200d60000 rw-s 00000000 00:06 464 /dev/nvidiactl
200d60000-200d64000 rw-s 00000000 00:06 464 /dev/nvidiactl
200d64000-200d80000 ---p 00000000 00:00 0
200d80000-201180000 rw-s 00000000 00:06 464 /dev/nvidiactl
201180000-201184000 rw-s 00000000 00:06 464 /dev/nvidiactl
201184000-2011a0000 ---p 00000000 00:00 0
2011a0000-2015a0000 rw-s 00000000 00:06 464 /dev/nvidiactl
2015a0000-2015a4000 rw-s 00000000 00:06 464 /dev/nvidiactl
2015a4000-2015c0000 ---p 00000000 00:00 0
2015c0000-2019c0000 rw-s 00000000 00:06 464 /dev/nvidiactl
2019c0000-2019c4000 rw-s 00000000 00:06 464 /dev/nvidiactl
2019c4000-2019e0000 ---p 00000000 00:00 0
2019e0000-201de0000 rw-s 00000000 00:06 464 /dev/nvidiactl
201de0000-201de4000 rw-s 00000000 00:06 464 /dev/nvidiactl
201de4000-201e00000 ---p 00000000 00:00 0
201e00000-202200000 rw-s 00000000 00:06 464 /dev/nvidiactl
202200000-202204000 rw-s 00000000 00:06 464 /dev/nvidiactl
202204000-202220000 ---p 00000000 00:00 0
202220000-202620000 rw-s 00000000 00:06 464 /dev/nvidiactl
202620000-202624000 rw-s 00000000 00:06 464 /dev/nvidiactl
202624000-202640000 ---p 00000000 00:00 0
202640000-202a40000 rw-s 00000000 00:06 464 /dev/nvidiactl
202a40000-202a44000 rw-s 00000000 00:06 464 /dev/nvidiactl
202a44000-202a60000 ---p 00000000 00:00 0
202a60000-202e60000 rw-s 00000000 00:06 464 /dev/nvidiactl
202e60000-202e64000 rw-s 00000000 00:06 464 /dev/nvidiactl
202e64000-202e80000 ---p 00000000 00:00 0
202e80000-203280000 rw-s 000Segmentation fault (core dumped)

HELP ME!
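
In the log above, each image under .../train/images/ is paired with a label file of the same name under .../train/labels/. The Python sketch below reproduces only that observed pairing so missing label files can be listed up front; it is a simplification and not Darknet's own lookup code, and both function names are made up. The odd-looking "annot load image lines would also be consistent with a carriage return at the end of each listed path (the cursor returns to the start of the line before the closing quote is printed), though that is a guess from the log alone, so the sketch strips trailing whitespace first.

```python
import os

def expected_label_path(image_path):
    """Map .../images/NAME.jpg to .../labels/NAME.txt (pairing inferred from the log)."""
    path = image_path.strip()  # drops a trailing carriage return or newline
    folder, name = os.path.split(path)
    parent, last = os.path.split(folder)
    if last == "images":
        folder = os.path.join(parent, "labels")
    stem, _extension = os.path.splitext(name)
    return os.path.join(folder, stem + ".txt")

def images_without_labels(image_paths):
    """Return the image paths whose expected label file does not exist."""
    return [p for p in image_paths if not os.path.isfile(expected_label_path(p))]
```

Passing the lines of the training list to images_without_labels gives the images for which a "Couldn't open file" message would be expected.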

TingtingAlice commented 6 years ago

@deepkshikha

romass12 commented 6 years ago

I am getting "Cant open label file(This can be normal only if you use MSCOCO)"

[image: screen shot 2018-07-21 at 9 28 24 am]

deepkshikha commented 6 years ago

@TingtingAlice You forgot to add the data/labels folder (https://github.com/pjreddie/darknet/tree/master/data/labels), which is why that error occurs.

deepkshikha commented 6 years ago

@romass12 You also forgot to add the data/labels folder (https://github.com/pjreddie/darknet/tree/master/data/labels).

romass12 commented 6 years ago

Now I am trying to calculate anchors:

./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -heigh 416

But I am getting:

k-means++ can't be used without OpenCV, because there is used cvKMeans2 implementation

However, I have already installed OpenCV (2.4) via brew, with Python 2.7:

Python 2.7.15 (default)
import cv2
print(cv2.__version__)
2.4.13.6
exit()

When editing the Makefile to set OPENCV=1 and running make:

g++ -DOPENCV `pkg-config --cflags opencv` -Wall -Wfatal-errors -Wno-unused-result -Wno-unknown-pragmas -Ofast -DOPENCV obj/http_stream.o obj/gemm.o obj/utils.o obj/cuda.o obj/convolutional_layer.o obj/list.o obj/image.o obj/activations.o obj/im2col.o obj/col2im.o obj/blas.o obj/crop_layer.o obj/dropout_layer.o obj/maxpool_layer.o obj/softmax_layer.o obj/data.o obj/matrix.o obj/network.o obj/connected_layer.o obj/cost_layer.o obj/parser.o obj/option_list.o obj/darknet.o obj/detection_layer.o obj/captcha.o obj/route_layer.o obj/writing.o obj/box.o obj/nightmare.o obj/normalization_layer.o obj/avgpool_layer.o obj/coco.o obj/dice.o obj/yolo.o obj/detector.o obj/layer.o obj/compare.o obj/classifier.o obj/local_layer.o obj/swag.o obj/shortcut_layer.o obj/activation_layer.o obj/rnn_layer.o obj/gru_layer.o obj/rnn.o obj/rnn_vid.o obj/crnn_layer.o obj/demo.o obj/tag.o obj/cifar.o obj/go.o obj/batchnorm_layer.o obj/art.o obj/region_layer.o obj/reorg_layer.o obj/reorg_old_layer.o obj/super.o obj/voxel.o obj/tree.o obj/yolo_layer.o obj/upsample_layer.o -o darknet -lm -pthread `pkg-config --libs opencv`
Undefined symbols for architecture x86_64:
  "cv::cvarrToMat(void const, bool, bool, int, cv::AutoBuffer<double, 136ul>)", referenced from:
      _send_mjpeg in http_stream.o
      _image_data_augmentation in http_stream.o
  "cv::VideoCapture::VideoCapture(cv::String const&)", referenced from:
      _get_capture_video_stream in http_stream.o
  "cv::String::deallocate()", referenced from:
      MJPGWriter::write(cv::Mat const&) in http_stream.o
      _get_capture_video_stream in http_stream.o
      cvflann::anyimpl::big_any_policy::static_delete(void) in http_stream.o
      cvflann::anyimpl::big_any_policy::move(void const, void) in http_stream.o
  "cv::String::allocate(unsigned long)", referenced from:
      MJPGWriter::write(cv::Mat const&) in http_stream.o
      _get_capture_video_stream in http_stream.o
  "cv::imencode(cv::String const&, cv::_InputArray const&, std::__1::vector<unsigned char, std::__1::allocator >&, std::__1::vector<int, std::__1::allocator > const&)", referenced from:
      MJPGWriter::write(cv::Mat const&) in http_stream.o
  "_IplImage::_IplImage(cv::Mat const&)", referenced from:
      _get_webcam_frame in http_stream.o
      _image_data_augmentation in http_stream.o
ld: symbol(s) not found for architecture x86_64
clang: fatal error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [darknet] Error 1

fatal error : ld: symbol(s) not found for architecture x86_64

destinyzs commented 6 years ago

I had the same problem as @SteveIb; his comment really helped me.

Lvious commented 6 years ago

@SteveIb helped me a lot, thanks. When moving from Windows to Ubuntu or another Unix-like system, please don't forget to convert '\r\n' to '\n'!

NisargKarun commented 5 years ago

Can anyone post the output obtained on training?

jmaity commented 5 years ago

Hi ... I am using YOLO to train on my custom data set. I followed the steps that are mentioned. I put the files in the paths below:

cfg/obj.data

cfg/obj.names

cfg/yolo-obj.cfg

data/train.txt

data/test.txt

data/obj/all_images

But during training, after a few iterations, it looks for a file (data/obj/labels.txt) which is not in the list. There is no image file named labels.jpg and no labels.txt either. Sometimes it looks for labels(2).txt or labels(3).txt. I have no idea where it is getting this file name from.

Please help....I have attached the screen shot of my console.

[image: darknet_error]

@TheMikeyR @deepkshikha Please have a look.

sachindesh commented 5 years ago

1) The 'cannot load images', 'segmentation fault (core dump)', 'cannot fopen' and 'cannot open label file' errors occur when files edited on Windows, or on any operating system that does not use Unix-style line endings (so the files contain '\r'), are transferred to Unix boxes (Ubuntu 16 in my case).
2) I used dos2unix and "tr -d '\r' < file > file" on Ubuntu, on the txt files as well as the JPG files, but even that didn't work.

Solution: any editing/saving of image files, txt files or other files, including the marking of objects (yolo_mark tool), should be done only on Ubuntu or similar desktops and not on Windows or other non-Unix-style operating systems. Cheers!!

srhtyldz commented 5 years ago
  1. The issue with 'cannot load images', 'segmentation fault (core dump)', 'cannot fopen', 'cannot open label file', is that the files edited in Windows or any operating system that doesn't support Unix style file formats ('\r' line ending) are transferred to Unix boxes (Ubuntu 16 in my case).
  2. dos2unix, "tr -d '\r' < file > file" tools used on Ubuntu on txt as well as JPG files, but it doesn't work even. Solution Whatever editing/saving of image files, txt files or any other files, including the marking of objects (yolo_mark tool) should be done only using the Ubuntu or like desktops and not on Windows or non-Unix style operating systems. Cheer!!

I'm using Ubuntu but still get the same error.

sbanerj2 commented 5 years ago

I am getting the following error for some images (custom dataset) as a result of which the training stops. (I am using /path/to/ as an abbreviation of my absolute path. All the files and images exist.)

Loading weights from ./darknet53.conv.74...Done!
Cannot load image "/path/to/Object_Detection/Ground/JPEGImages/Training/6/um_50_na_sunny_sony_0_384.jpg"
STB Reason: can't fopen
Cannot load image "/path/to/Object_Detection/Ground/JPEGImages/Training/6/um_50_na_sunny_sony_0_41.jpg"
STB Reason: unknown image type
Cannot load image "/path/to/Object_Detection/Ground/JPEGImages/Training/3/golf_40_na_cloudy_sony_0_273.jpg"
STB Reason: unknown image type

The images are present at those locations, and their corresponding text files are in:
/path/to/Object_Detection/Ground/labels/Training/...

The folder layout follows Pascal VOC.

Here's part of my Makefile: GPU=1 CUDNN=0 OPENCV=1 OPENMP=0 DEBUG=0

And here's my ground.data:

classes= 19
train  = /path/to/Object_Detection/ground_train.txt
valid  = /path/to/Object_Detection/ground_val.txt
names = /path/to/darknet/data/ground.names
backup = /path/to/darknet/backup/ground/

Here's the command I am using:

./darknet detector train /path/to/darknet/cfg/ground.data /path/to/darknet/cfg/yolov3-ground.cfg ./darknet53.conv.74 > /path/to/darknet/scripts/ground-train.log

Here's what my ground_train.txt looks like (it contains absolute paths for all images; only a portion of the file is shown):

/path/to/Object_Detection/Ground/JPEGImages/Training/0/analog_50_180_sunny_sony_0_0.jpg
/path/to/Object_Detection/Ground/JPEGImages/Training/0/analog_50_180_sunny_sony_0_1.jpg
/path/to/Object_Detection/Ground/JPEGImages/Training/0/analog_50_180_sunny_sony_0_2.jpg
...

All the files exist and are valid. I am not sure what's wrong. Training works for the other files in the dataset and only stops on these. I checked that the images are in the correct format, and they seem OK. Can someone help?
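
A quick way to double-check the claim that every listed file exists and is a valid image is to walk the list and test each entry. The sketch below is only a rough check under stated assumptions: it verifies that each path is a regular, non-empty file that can be opened, and that its first bytes match the standard JPEG or PNG signature. It does not decode the image, so a truncated file can still pass. `sniff_image_type` and `check_image_list` are hypothetical helper names.

```python
import os

# Standard file signatures (magic numbers) for the two common formats.
SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
}


def sniff_image_type(path):
    """Return 'jpeg', 'png', or None based on the first bytes of the file."""
    with open(path, "rb") as handle:
        header = handle.read(8)
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return None


def check_image_list(list_file):
    """Return (path, problem) pairs for entries that look unreadable."""
    problems = []
    with open(list_file, "r") as handle:
        for line in handle:
            path = line.strip()
            if not path:
                continue  # skip blank lines
            if not os.path.isfile(path):
                problems.append((path, "missing"))
            elif os.path.getsize(path) == 0:
                problems.append((path, "empty file"))
            else:
                try:
                    kind = sniff_image_type(path)
                except OSError as error:
                    problems.append((path, "cannot open: %s" % error))
                    continue
                if kind is None:
                    problems.append((path, "unrecognised header"))
    return problems
```

Calling `check_image_list` on the train list from the .data file and printing the result should show whether the files named in the error messages differ from the ones that load.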

sbanerj2 commented 5 years ago

Here's the output after 14 iterations; my training stops after this:

14: 573.572327, 927.539124 avg, 0.000000 rate, 1636.650012 seconds, 896 images
Loaded: 0.000064 seconds
: 0.219096, Class: 0.386432, Obj: 0.530981, No Obj: 0.511523, .5R: 0.000000, .75R: 0.000000,  count: 2

Region 106 Avg IOU: 0.011220, Class: 0.282754, Obj: 0.311355, No Obj: 0.468680, .5R: 0.000000, .75R: 0.000000,  count: 1

Region 82 Avg IOU: 0.279148, Class: 0.148486, Obj: 0.629275, No Obj: 0.458073, .5R: 0.000000, .75R: 0.000000,  count: 1

Region 94 Avg IOU: 0.075961, Class: 0.271695, Obj: 0.082233, No Obj: 0.514065, .5R: 0.000000, .75R: 0.000000,  count: 2

Region 106 Avg IOU: 0.352088, Class: 0.543684, Obj: 0.793743, No Obj: 0.471246, .5R: 0.000000, .75R: 0.000000,  count: 1

Region 82 Avg IOU: 0.165622, Class: 0.815933, Obj: 0.489756, No Obj: 0.457557, .5R: 0.000000, .75R: 0.000000,  count: 1

Region 94 Avg IOU: 0.053709, Class: 0.664887, Obj: 0.541732, No Obj: 0.512764, .5R: 0.000000, .75R: 0.000000,  count: 2

Region 106 Avg IOU: 0.073817, Class: 0.247052, Obj: 0.413561, No Obj: 0.469873, .5R: 0.000000, .75R: 0.000000,  count: 1

14: 573.572327, 927.539124 avg, 0.000000 rate, 1636.650012 seconds, 896 images
Loaded: 0.000064 seconds

HamzahNizami commented 5 years ago

any luck?

Mps24-7uk commented 5 years ago

I am getting the same issue (https://github.com/pjreddie/darknet/issues/1532). Please help me with this. @saipraneethd @jinyu121 @feiyunzhang @deepkshikha @TheMikeyR