Open Jped opened 6 years ago
By lowering batch (16) and subdivisions (8) I managed to get it to run for a while, but it still crashes after a few thousand steps. The loss and the average loss never move from nan. According to @AlexeyAB's comment here, this might be an issue with the labels. I have created a label txt file for each image, containing just that image's class. I still get the nan issue.
I also tried training on cifar-10 based on the tutorial here. I get the same issue.
On his website, pjreddie shows how to train on imagenet here. It is not too clear what his train and valid files look like, because he has them created automatically. I assumed that, as he suggested, they are just lists of the paths where the training and validation images are located, and that the labels and names files contain the different labels and names.
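To make that assumption concrete, here is a minimal Python sketch of how I understand the files. It assumes the train/valid files are plain text with one image path per line, and that the classifier picks the class by looking for a label string inside each image path (my reading of the tutorial, not something I have confirmed in the source). The helper names `write_list`, `labels_in_path` and `find_ambiguous` are my own, not part of darknet.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def write_list(image_dir, list_path):
    """Write one absolute image path per line to list_path; return the paths."""
    paths = sorted(
        str(p.resolve())
        for p in Path(image_dir).rglob("*")
        if p.suffix.lower() in IMAGE_SUFFIXES
    )
    Path(list_path).write_text("\n".join(paths) + "\n")
    return paths


def labels_in_path(path, labels):
    """Return every label that occurs as a substring of the image path.

    Assumption: the class is inferred from the path by substring match,
    so directory names count as well as the file name.
    """
    return [label for label in labels if label in path]


def find_ambiguous(paths, labels):
    """Return the paths that match zero labels or more than one label."""
    return [p for p in paths if len(labels_in_path(p, labels)) != 1]


# Example usage (hypothetical paths):
#   paths = write_list("data/cifar/train", "data/cifar/train.list")
#   labels = Path("data/cifar/labels.txt").read_text().split()
#   print(find_ambiguous(paths, labels))
```

If `find_ambiguous` returns anything, those images would not get exactly one class under this assumption, which could be one place to look when the loss goes to nan.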
However, when I try to train the classifier using
darknet.exe classifier train data/imagenet.data cococfg/imagenetpretrain.cfg
the program starts to run and then crashes. Here is the error:
Any idea why this is happening?