CanPeng123 / Faster-ILOD

Error occurs in training file #4

Closed eddie000000 closed 2 years ago

eddie000000 commented 3 years ago

Here is my environment: I am running on Colab with torch 1.1.0 and torchvision 0.3.0.
I ran python /content/drive/MyDrive/FA/Faster-ILOD/tools/train_first_step.py --config-file "/content/drive/MyDrive/FA/Faster-ILOD/configs/e2e_faster_rcnn_R_50_C4_1x.yaml" and then ran into this problem 23

I hope you can help me. Thanks.

eddie000000 commented 3 years ago
NUM_CLASSES
NAME_OLD_CLASSES
NAME_NEW_CLASSES
NAME_EXCLUDED_CLASSES

I have already modified these files

CanPeng123 commented 3 years ago

Hi,

I remember that some other people ran into this problem before, and it seems to be caused by the environment version. Could you please change loss_dict to loss_dict[0] and try again? I hope this helps.

Warm regards, Can
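
The suggested change can be written defensively so that the same training code runs in both environments. The error itself is not shown in this thread, so this is only a minimal sketch under the assumption that in the affected environment the model returns a tuple or list whose first element is the loss dictionary. The helper names `unwrap_loss_dict` and `total_loss` are hypothetical and are not part of the repository.

```python
def unwrap_loss_dict(model_output):
    """Return the loss dictionary from a model's training output.

    Assumption: some environments return the dict directly, others
    wrap it in a tuple/list with the dict as the first element.
    """
    if isinstance(model_output, (tuple, list)):
        return model_output[0]  # equivalent to the suggested loss_dict[0]
    return model_output


def total_loss(model_output):
    """Sum all individual loss terms into a single value."""
    loss_dict = unwrap_loss_dict(model_output)
    return sum(loss for loss in loss_dict.values())
```

In a training loop this would replace a direct `sum(loss for loss in loss_dict.values())` on the raw model output.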

eddie000000 commented 3 years ago

Thanks a lot. It works.

eddie000000 commented 3 years ago

@CanPeng123 here is my e2e_faster_rcnn_R_50_C4_1x_Source_model.yaml image

and here is my e2e_faster_rcnn_R_50_C4_1x_Target_model.yaml image

First, I trained on 15 classes. I want to train 15+1+1+1+1+1 classes, but I got this result in an incremental step after adding a new class (pottedplant). image

Could you explain how to solve this problem?

eddie000000 commented 3 years ago

After running Faster-ILOD/tools/train_first_step.py, there are only these files in the directory: image

Therefore, I changed the weight in e2e_faster_rcnn_R_50_C4_1x_Target_model.yaml: image

CanPeng123 commented 3 years ago

Hi,

Have you removed the optimizer and iteration information from the pth file of the pre-trained model? Besides the model parameters, the optimizer state and the iteration count are also stored in the pth file. If you load the previously trained source model pth directly into the target model, its iteration count is loaded as well. The target model is then not trained at all, because the required number of iterations has already been reached.

eddie000000 commented 3 years ago

After I ran train_first_step.py, I only got model_final.pth. I did not remove any files in the RPN_15_classes_40k_steps folder.

I can't find any pth file named model_trim_optimizer_iteration.pth.

CanPeng123 commented 3 years ago

Hi,

You need to remove the optimizer and iteration information from model_final.pth before loading it into the target model. Please take a look at the trim_detectron_model.py file.
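
The trimming step described above might look roughly like the following. This is a sketch, not the contents of trim_detectron_model.py: it assumes the checkpoint is a dictionary with a `"model"` entry next to training-state entries such as `"optimizer"`, `"scheduler"`, and `"iteration"`, and the function names are hypothetical.

```python
def trim_checkpoint(checkpoint, keep=("model",)):
    """Return a copy of the checkpoint containing only the kept keys.

    Assumption: the model weights live under "model", while
    "optimizer", "scheduler", and "iteration" hold training state.
    """
    return {key: value for key, value in checkpoint.items() if key in keep}


def trim_checkpoint_file(src_path, dst_path):
    """Load a .pth file, drop the training state, and save the result."""
    import torch  # imported lazily so trim_checkpoint works without torch

    checkpoint = torch.load(src_path, map_location="cpu")
    torch.save(trim_checkpoint(checkpoint), dst_path)
```

For example, `trim_checkpoint_file("model_final.pth", "model_trim_optimizer_iteration.pth")` would write a weights-only file, and the weight entry in the target model yaml would then point at that trimmed file, so training starts from iteration 0 instead of resuming at the final iteration.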

eddie000000 commented 3 years ago

Thanks for your help. I'll give it a try.

YuQianzi commented 1 year ago

@CanPeng123 Could you please give the complete training steps, like https://github.com/JosephKJ/iOD/blob/main/run.sh#L1-L8?

YuQianzi commented 1 year ago

@CanPeng123 I am still confused about how to train the model after executing python tools/train_first_step.py --config-file ./configs/e2e_faster_rcnn_R_50_C4_1x.yaml