Open eweill opened 7 years ago
@eweill
You need to change the .prototxt file to detect multiple classes. Refer to the links below:
https://github.com/NVIDIA/caffe/blob/caffe-0.15/examples/kitti/detectnet_network.prototxt
https://github.com/NVIDIA/caffe/blob/caffe-0.15/examples/kitti/detectnet_network-2classes.prototxt
It's notoriously difficult to train multi-class DetectNet. However, in case this helps, I found it easier to first train a single-class network (cars) and then fine-tune on the 2-class network (cars + pedestrians).
@gheinrich Since I have trained both a single-class car network and a single-class pedestrian network, I would like to try to fine-tune on a 2-class network. However, I am unable to use the weights directly due to the dimensionality of the classifier layer.
I saved my 1-class car trained network as a "pre-trained model" in DIGITS. I then selected the 1-class model, and customized it by pasting in the 2-class prototxt. After selecting all my other desired parameters, I started the training and received the following error:
ERROR: Cannot copy param 0 weights from layer 'cvg/classifier'; shape mismatch. Source param shape is 1 1024 1 1 (1024); target param shape is 2 1024 1 1 (2048). To learn this layer's parameters from scratch rather than copying from a saved net, rename the layer.
Is there an easy way to convert either the model file or the weights file to "fine-tune" in the manner that you were suggesting or am I thinking about it entirely incorrectly? Thanks.
Why don't you rename cvg/classifier as suggested in the error message?
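For anyone hitting the same shape mismatch, the rename suggested by the error message can be sketched as a small text edit on the prototxt. This is only a sketch: the helper `rename_layer` and the new name `cvg/classifier-2classes` are made up for illustration, and the layer fragment in the comment is not copied from the real network file. It relies on Caffe copying saved weights by layer name, so only the `name:` field is changed and the `top:` blob keeps its old name for downstream layers.

```python
import re


def rename_layer(prototxt_text, old_name, new_name):
    """Rename a layer by rewriting only its `name:` field.

    `top:` and `bottom:` blob references are left untouched, so layers
    that consume the blob still find it under the old blob name.
    """
    pattern = re.compile(r'(\bname:\s*)"' + re.escape(old_name) + '"')
    # A function is used as the replacement so that backslashes in
    # new_name are never interpreted as regex escapes.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + '"' + new_name + '"', prototxt_text
    )
    if count == 0:
        raise ValueError("layer %r not found" % old_name)
    return new_text


# Illustrative use on a hypothetical fragment:
#   layer {
#     name: "cvg/classifier"
#     top: "cvg/classifier"
#   }
# rename_layer(text, "cvg/classifier", "cvg/classifier-2classes")
# changes the `name:` line only.
```

With the layer renamed, Caffe has no saved weights under the new name, so that layer is initialized from scratch while every other layer is still copied from the 1-class model.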
@gheinrich Thanks. I realized that as soon as I posted the error message and got that fixed. I am still unable to train the 2-class network. I simply use a single-class network that is trained on cars (with an mAP of 62, so it performs quite well) and try to "fine-tune" it on pedestrians as the second class.
I replaced the custom network with a 2-class DetectNet, performed mean subtraction, and set the other hyperparameters (learning rate, learning rate policy, batch size/accumulation, etc.). However I set the hyperparameters, class 2 remains at 0 mAP regardless of how many iterations I train the network for. Am I missing an important piece to make the network train on the second class as well? Below are some of the current results I am getting.
@eweill Hi there, did you manage to resolve the problem you had mentioned? I am facing the same problem...
@ShervinAr I haven't resolved the problem yet. I have tried many different approaches, and no matter how I create the LMDB dataset or train the model (different hyperparameters), I can't seem to get the second class to ever be detected (mAP stays at 0 for 60 or 120 epochs).
@eweill My guess is that the problem might have its roots in:
1. different learning rates being required for learning cars vs. pedestrians
2. too few training/val images for pedestrian detection: not all images in the training set include pedestrians, and I am not sure how the network would react to this...
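One way to check the second guess is to count how often each class appears in the label files. The sketch below assumes KITTI-style labels (one `.txt` file per image, one object per line, class name as the first whitespace-separated field); the function name `count_kitti_classes` and the directory layout are illustrative, not something DIGITS provides.

```python
from collections import Counter
from pathlib import Path


def count_kitti_classes(label_dir):
    """Return (objects per class, images containing each class, file count).

    Class names are lower-cased so that "Car" and "car" are counted together.
    """
    objects = Counter()
    images = Counter()
    n_files = 0
    for path in sorted(Path(label_dir).glob("*.txt")):
        n_files += 1
        seen = set()
        for line in path.read_text().splitlines():
            fields = line.split()
            if not fields:
                continue  # skip blank lines
            cls = fields[0].lower()
            objects[cls] += 1
            seen.add(cls)
        # Each class counts at most once per image here.
        images.update(seen)
    return objects, images, n_files


# Example use (the path is a placeholder):
# objects, images, n = count_kitti_classes("train/labels")
# for cls, num in objects.most_common():
#     print(cls, num, "objects in", images[cls], "of", n, "images")
```

If pedestrians show up in only a small fraction of the images, that would at least be consistent with the imbalance suspected above, though it does not prove it is the cause of the 0 mAP.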
@eweill Hello. I have a similar problem and have a basic question. For multi-class detection, do I have to change the dataset first? I mean, if the dataset label files have 10 class types (car, pedestrian, van, pickup, ...), do I have to delete the lines for van, pickup, ..., or is just setting dontcare, car in the dataset creation in DIGITS enough?
Hello All,
I used the "detectnet_network-2classes.prototxt" file to train my network on KITTI data for cars and pedestrians, but I am getting the following error while training:
<< error code -11
Train net output #0: loss_bbox = 1.12773 (* 2 = 2.25547 loss)
Train net output #1: loss_coverage = 4.84529 (* 1 = 4.84529 loss)
Iteration 1264, lr = 0.0001
Snapshotting to binary proto file snapshot_iter_1276.caffemodel
Snapshotting solver state to binary proto file snapshot_iter_1276.solverstate
Iteration 1276, Testing net (#0)
Ignoring source layer train_data
Ignoring source layer train_label
Ignoring source layer train_transform
Any suggestions?
Thanks Deepika
Training a multi-class network based on DetectNet is not very easy, but I recently got some better results. I will share some tips on how to train this type of network.
@lbin have you published your tips?
I am also wondering the same. I am running into similar problems, with the other class having 0 mAP throughout the whole training procedure.
@varunvv @lbin @gheinrich
I noticed that in the 2-classes prototxt there are these lines:
object_class: { src: 1 dst: 0} # cars -> 0
object_class: { src: 8 dst: 1} # pedestrians -> 1
1. Could you explain the "src" field? The "dst" field seems to be the class index, as the comment implies.
2. Since we identify classes by index (0 and 1, for example), how does that match the label names in the KITTI format (cars as 0, pedestrians as 1)?
Ty so much.
Hello, the src field needs to be set to the index of your class in the class mappings (see https://github.com/NVIDIA/DIGITS/blob/digits-5.0/digits/extensions/data/objectDetection/README.md#custom-class-mappings).
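To make the relationship between the class mappings and `src` concrete, here is a small sketch. It assumes, following the linked README as I understand it, that a class's ID is its position in the comma-separated class-mappings list, with position 0 used for dontcare. The helper names `class_ids` and `object_class_lines` are made up for this example and are not part of DIGITS.

```python
def class_ids(custom_class_mappings):
    """Map each class name to its position in the comma-separated list."""
    names = [n.strip().lower() for n in custom_class_mappings.split(",")]
    return {name: idx for idx, name in enumerate(names)}


def object_class_lines(custom_class_mappings, wanted):
    """Build one `object_class` line per wanted class.

    src is the class ID taken from the mappings list (assumption: ID equals
    list position); dst is the 0-based index of the class among the
    network's outputs, in the order given by `wanted`.
    """
    ids = class_ids(custom_class_mappings)
    return [
        "object_class: { src: %d dst: %d} # %s -> %d" % (ids[name], dst, name, dst)
        for dst, name in enumerate(wanted)
    ]


# With the mappings string "dontcare,car,pedestrian":
# object_class_lines("dontcare,car,pedestrian", ["car", "pedestrian"])
# gives src: 1 for car and src: 2 for pedestrian.
```

Under that assumption, the `src: 8` in the stock 2-class prototxt would only be right for a dataset built with the default mappings. A dataset built with `dontcare,car,pedestrian` would need `src: 2` for pedestrians.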
Hello,
has anyone successfully trained a 2-class detectnet? Can anyone share some tips regarding this topic?
Thank you. AP
I am getting only a single class detected. Could someone please help me understand why 2-class detection is not working with the KITTI dataset, even when the above network is used?
@lbin Could you share your results? I have the same problem with only one class learning correctly.
@lbin Please share the tips on 2-class training. I tried different approaches but the network failed to learn pedestrians. It is learning cars only.
@lbin Please, could someone help me make 2-class detection possible? I have tried many approaches but failed.
Anyone?
@elaith9 I managed to train two classes on the MS COCO data set. This requires a large amount of time and pre-trained weights. After 300 epochs, one of the classes did not rise above 1 mAP and the second class did not rise above 5 mAP. It looks like tiny-yolo will work much more reliably. SSD for TensorRT has not been implemented yet, so there is a chance of succeeding with networks such as MobileNet or SqueezeNet.
@eweill How did you fix this ERROR: Cannot copy param 0 weights from layer 'cvg/classifier'; shape mismatch. Source param shape is 1 1024 1 1 (1024); target param shape is 2 1024 1 1 (2048)?
@adithya-p @sulth @linuzlover @shinaushin OK, I see a lot of you are struggling with the same problem as I did. I've decided to write two blog posts about it and explain in detail how to do custom multi-class object detection with DIGITS. Here you go:
https://labs.coria.com/blog/computer-vision/PreparingDataForCustomObjectDetectionUsingNvidiaDigits?sc_camp=33AFA8630062426190B5760C8FDF17CF
https://labs.coria.com/en/blog/computer-vision/TrainingACustomMulticlassObjectDetectionModel?sc_camp=33AFA8630062426190B5760C8FDF17CF
If you have any questions, feel free to ask.
@elaith9 Thank you so much, but the links are dead. Can someone share with us how to train a multi-class DetectNet on DIGITS (maybe with the KITTI dataset or another dataset)?
@dqthebt24 links should be working now. Sorry.
Thank you so much @elaith9
Can DetectNet be used to train models for detecting more than 2 classes? If yes, are the changes to be made similar to those in the links you posted, @elaith9?
@Adithyak1998 Yes, DetectNet can be used to train models for detecting more than 2 classes. The changes are similar to those in the links posted above. Use diffchecker to compare the prototxt files and get an insight into what changes.
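If you prefer to do the comparison locally instead of with an online diff tool, a minimal sketch using Python's standard `difflib` module could look like this. The function name `prototxt_diff` is made up here, and the default file labels are just the names of the two prototxt files linked at the top of the thread.

```python
import difflib


def prototxt_diff(a_text, b_text,
                  a_name="detectnet_network.prototxt",
                  b_name="detectnet_network-2classes.prototxt"):
    """Return a unified diff of two prototxt texts as a single string."""
    diff = difflib.unified_diff(
        a_text.splitlines(keepends=True),
        b_text.splitlines(keepends=True),
        fromfile=a_name,
        tofile=b_name,
        n=1,  # one line of context around each change
    )
    return "".join(diff)


# Example use, after downloading both files next to the script:
# with open("detectnet_network.prototxt") as a, \
#         open("detectnet_network-2classes.prototxt") as b:
#     print(prototxt_diff(a.read(), b.read()))
```

Lines starting with `-` exist only in the first file and lines starting with `+` only in the second, which makes it easy to see which layers and parameters differ between the 1-class and 2-class definitions.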
@elaith9 If you don't mind, can you update your links again? They don't work for me.
I wrote a small article on how to create a dataset and train a two-class DetectNet on DIGITS. You can read my post here
Hope it helps Cheers Marco
I have been working on multi-class detection for cars and pedestrians using DetectNet, as mentioned in MCOD. I can successfully train a single-class DetectNet model for cars and for pedestrians (separately); however, I am unable to train the 2-class network correctly. Note that when I try to train the pedestrian network on KITTI, I am achieving great classification but the mAP value does not reflect that (it hovers around 10, while I can get upwards of 55 for cars).
With that said, when I try to use the 2-class DetectNet with a dataset created from KITTI using dontcare,car,pedestrian, it only seems to detect cars and no pedestrians at all (i.e. the mAP for class 1 remains 0 for the entire training process). I would like to perform inference on this; however, with an mAP of 0, pedestrians aren't detected during inference.
Can anyone point me in the right direction when training the 2-class network?