cwmok / LapIRN

Large Deformation Diffeomorphic Image Registration with Laplacian Pyramid Networks

A question about using Train_LapIRN_diff.py #5

Closed MoXiaoYing96 closed 3 years ago

MoXiaoYing96 commented 3 years ago

Hi Tony! I am trying to solve a multi-modal registration problem. I followed the 2020 Learn2Reg challenge and am really interested in your method, because it achieved state-of-the-art results on multi-modal registration.

So I am trying to train a model using Train_LapIRN_diff.py. Training breaks at train_lvl3, where a model is loaded: the files ../Model/LDR_OASIS_NCC_unit_add_reg_35_stagelvl230000.pth and ../Model/lossLDR_OASIS_NCC_unit_add_reg_3_anti_1_stagelvl310000.npy were not shared with us. I replaced them with the model that came from train_lvl2 of my own training, but it does not work because some keys are unexpected for the state_dict. I am really confused.

Will you share these files? Or how can I solve this problem? Your kind answer would help a lot!

MoXiaoYing96 commented 3 years ago

Sorry, I entered the wrong file names. The file name in your code is "../Model/LDR_OASIS_NCC_unit_add_reg_3_anti_1_stagelvl3_10000.pth", and I replaced it with my own training file "../Model/LDR_OASIS_NCC_unit_add_reg_35_stagelvl230000.pth".

Hope it does not affect your reading and understanding.

cwmok commented 3 years ago

@MoXiaoYing96

This is actually a fatal bug in the progressive training scheme. I commented out some code for testing and forgot to add it back to my GitHub repository. Basically, I missed the following code, which should be inserted between model_lvl2 and model at line 279:

```python
# Load the latest lvl2 checkpoint into model_lvl2 for progressive training
model_path = sorted(glob.glob("../Model/Stage/" + model_name + "stagelvl2_?????.pth"))[-1]
model_lvl2.load_state_dict(torch.load(model_path))
print("Loading weight for model_lvl2...", model_path)

# Freeze model_lvl2 weights
for param in model_lvl2.parameters():
    param.requires_grad = False
```

The same code is in Train_LapIRN_disp.py as well.

The model path you are modifying is actually for pausing and resuming training. By default, load_model = False, so it will not require the model weight file "LDR_OASIS_NCC_unit_add_reg_3_anti_1_stagelvl3_10000.pth".
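
For intuition, here is a minimal sketch of what a resume guard driven by such a flag could look like; the helper name, default path, and structure are illustrative only, not the exact code in the repository:

```python
import torch

def maybe_resume(model, load_model=False,
                 checkpoint_path="../Model/LDR_OASIS_NCC_unit_add_reg_3_anti_1_stagelvl3_10000.pth"):
    """Hypothetical helper mirroring the role of the load_model flag.

    When load_model is False (the default for training from scratch), nothing
    is loaded; when True, training resumes from the saved checkpoint.
    """
    if load_model:
        model.load_state_dict(torch.load(checkpoint_path))
        print("Resuming training from", checkpoint_path)
    return model
```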

I have fixed this issue and updated the GitHub code. Thanks for reporting the issue. :D

cwmok commented 3 years ago

@MoXiaoYing96

Also, if you have finished the training on lvl1 and lvl2, you can just comment out the train_lvl1() and train_lvl2() calls at the bottom of Train_LapIRN_diff.py so that training skips lvl1 and lvl2. It will automatically load your pre-trained weights for lvl1 and lvl2 and save you some training time.
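
A minimal sketch of what the bottom of Train_LapIRN_diff.py would look like after this change, assuming the script ends with the three per-level training calls:

```python
# Bottom of Train_LapIRN_diff.py: with the lvl1/lvl2 checkpoints already saved
# under ../Model/Stage/, comment out the first two calls so only lvl3 trains.
# train_lvl1()
# train_lvl2()
train_lvl3()
```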

MoXiaoYing96 commented 3 years ago

Thanks a lot! Your answer solved my problem perfectly! Wish you a good day~ :)

tphankr commented 2 years ago

> @MoXiaoYing96
>
> Also, if you have finished the training on lvl1 and lvl2, you can just comment out the train_lvl1() and train_lvl2() calls at the bottom of Train_LapIRN_diff.py so that training skips lvl1 and lvl2. It will automatically load your pre-trained weights for lvl1 and lvl2 and save you some training time.

Thank you very much, @MoXiaoYing96, @cwmok and @wingwing518. Can you explain more about what happens if we set load_model=True so that the weights are used again, please?

Question: What does it mean? Are there any differences when we reuse weights? For example, is reusing weights good or bad for performance on the image registration problem? Thank you very much.

cwmok commented 2 years ago

@tphankr The load_model at line 320 in Train_LapIRN.py is primarily for resuming training only, and by default you should always set it to False when training your model from scratch. The "weight reuse" you mentioned would be better contextualized as transfer learning/fine-tuning in deep learning, which may not apply here.

While fine-tuning from a pre-trained model may help alleviate over-fitting in many computer vision tasks, the pre-trained model should be trained on a large-scale dataset.

Yet, whether fine-tuning improves the performance of the image registration network or not is still an ongoing discussion. Perhaps you can try that out and let me know.

Furthermore, the model loading at line 281 is a totally different case. It is for progressive training and has nothing to do with "weight reuse" or "fine-tuning".

tphankr commented 2 years ago

Thank you very much, @cwmok. I understand clearly now (I will set load_model=False everywhere).

"Yet, whether fine tuning improves the performance of the image registration network or not, is still an ongoing discussion. Perhaps you can try that out and let me know."

I will try it and report back. Thank you for your help, @cwmok.