HUANGLIZI / LViT

[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"
MIT License

Question about Pre-training #33

Closed Studentpengyu closed 11 months ago

Studentpengyu commented 11 months ago

Dear zihan,

Thank you for your impressive work!

I have been following your work and have successfully run the code with the setting 'model_type="LViT"', which does not require a pretrained model. However, I am facing some challenges in understanding how to obtain the pretrained model for the setting 'model_type="LViT_pretrain"'.
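For reference, switching between the two settings amounts to a one-line change in the repo's Config.py. Only the variable name `model_type` is confirmed by this thread; the comments below are my own assumption about what each value requires:

```python
# Config.py (sketch; only the model_type name is taken from this thread)
model_type = "LViT"             # trains from scratch, no pretrained model needed
# model_type = "LViT_pretrain"  # assumed to require a pretrained UNet checkpoint
```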

As discussed in Section 2.1 of your instructions: [screenshot]

It appears that a U-Net model might be a prerequisite for the LViT model. Could you please clarify if this is the case? If so, does this imply I need to train a U-Net model first before proceeding with 'LViT_pretrain'? Also, in such a scenario, should I change the model type and write the corresponding code here? [screenshot]

Furthermore, I am also curious about how to load the pretrained U-Net model once it is obtained. Is this U-Net model directly applicable to LViT_pretrain, or are there additional steps or modifications required?

Your guidance on these matters would be greatly appreciated, as it would greatly assist me in understanding and utilizing your work more effectively.

Thank you for your time and consideration. I am looking forward to your valuable insights.

Best regards, Pengyu Zhao

HUANGLIZI commented 11 months ago

Hi Pengyu,

You can use the UNet network code we provide to train a UNet model and use it as a pre-trained model. Don't worry, the UNet pre-trained model can be directly applied to the pretraining of LViT.
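For intuition, here is a minimal, self-contained sketch of why a UNet checkpoint can seed LViT directly: parameters are copied wherever the names (and, in the real code, tensor shapes) overlap, while LViT-only layers keep their initialization. The key names below are illustrative plain-dict stand-ins, not the repo's actual state_dict keys:

```python
# Hypothetical sketch: copy a UNet checkpoint's weights into an LViT model
# wherever the parameter names match (illustrated with plain dicts; in the
# real code this would operate on torch state_dicts and also check shapes).

def transfer_matching_weights(lvit_state, unet_state):
    """Copy entries from unet_state into lvit_state when the keys overlap."""
    transferred = []
    for key, value in unet_state.items():
        if key in lvit_state:
            lvit_state[key] = value
            transferred.append(key)
    return transferred

# Toy example: shared encoder/decoder layers receive UNet weights,
# while an LViT-only layer (e.g. a text branch) keeps its init value.
lvit_state = {"inc.conv.weight": 0.0, "down1.conv.weight": 0.0, "text_module.weight": 0.5}
unet_state = {"inc.conv.weight": 1.0, "down1.conv.weight": 2.0}
moved = transfer_matching_weights(lvit_state, unet_state)
print(moved)       # the keys that were overwritten by UNet weights
print(lvit_state)  # text_module.weight is untouched
```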

Studentpengyu commented 11 months ago

Hi Zihan,

Thank you for your patient guidance!

Now, I have successfully trained the UNet network on the 'MoNuSeg' task using the code you provided, and I want to use it for the pretraining of LViT to reproduce the results you show here: [screenshot]

Should I rewrite the pretrained_UNet_model_path here to import the pretrained weights? [screenshot]
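In case it helps others, the change being discussed amounts to a single assignment in Config.py. The variable name `pretrained_UNet_model_path` is the one used in this thread; the path itself is a placeholder and should point at your own checkpoint:

```python
# Config.py (sketch; the path below is illustrative, replace it with the
# location of the UNet checkpoint you trained yourself)
pretrained_UNet_model_path = "./checkpoints/MoNuSeg/best_model-UNet.pth.tar"
```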

Please don't close the issue for now. I will close it promptly when it's resolved. Thank you again for your professional guidance!

Best regards, Pengyu

HUANGLIZI commented 11 months ago

Yes, you should change the load path to your own path.

Studentpengyu commented 11 months ago

Hi Zihan,

I have changed the load path to my own path as shown here: [screenshot]

The model, best_model-UNet.pth.tar, was obtained by training the UNet model on the MoNuSeg dataset. I used this pretrained UNet model for the pretraining of LViT. Could you please confirm if this approach is appropriate?

Now, I have got the test results of LViT and LViT w/o pretrain on the MoNuSeg task as shown here: [screenshot]

Upon evaluating the results of my experiments, I observed a decrease in the performance of the LViT model after loading the UNet pretrained model. This was an unexpected outcome. To understand this issue better, I delved into the training logs. Interestingly, I found that while the convergence speed of LViT was indeed faster with the pretrained model (achieving optimal performance at epoch 89) compared to LViT without pretraining (achieving optimal performance at epoch 130), the validation results told a different story. Specifically, the mean dice score for the LViT model without pretraining was 0.8062, while the pretrained LViT model only achieved a score of 0.8052, indicating a decrease in validation performance.

LViT w/o pretrain training log: [screenshot]

LViT training log: [screenshot]

This outcome is puzzling to me, and I'm currently exploring potential reasons for this discrepancy. I am considering whether adjusting the learning rate from 1e-3 to 3e-4 and increasing the early_stopping_patience might lead to better results. I would greatly value your opinion on this approach. Do you think these changes could potentially address the issues I'm encountering? Additionally, are there other modifications you would recommend? Any guidance you can provide would be immensely helpful.

Best regards, Pengyu

HUANGLIZI commented 11 months ago

The optimal performance may depend on the GPU used, so you should search for the optimal hyperparameters on your own platform. Also, you can decrease the learning rate and increase the early-stopping patience to get a better result.
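The patience suggestion can be made concrete with a small, self-contained sketch of the usual early-stopping rule (the function name, history values, and patience numbers are illustrative, not the repo's actual implementation):

```python
# Minimal sketch of early stopping on a validation metric: stop when the
# best score is more than `patience` epochs in the past.

def should_stop(val_dice_history, patience):
    """Return True when no improvement over the best dice for `patience` epochs."""
    best_epoch = max(range(len(val_dice_history)), key=val_dice_history.__getitem__)
    epochs_since_best = len(val_dice_history) - 1 - best_epoch
    return epochs_since_best >= patience

history = [0.78, 0.80, 0.805, 0.804, 0.803, 0.802]  # best at epoch 2
print(should_stop(history, patience=3))  # True: best was 3 epochs ago
print(should_stop(history, patience=5))  # False: still within the patience window
```

With a larger patience, training continues past short plateaus like the one above, which is the effect the suggestion aims for.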

Studentpengyu commented 11 months ago

Ok, thank you! Your advice has been incredibly helpful, and I believe I now have a good understanding of how to effectively utilize the UNet pretrained model in my experiments.

I am ready to close this issue. Thank you once again for your invaluable support and insights.