Studentpengyu closed this issue 11 months ago.
Hi Pengyu,
You can use the UNet network code we provide to train a UNet model and use it as a pretrained model. Don't worry, the UNet pretrained model can be applied directly to the pretraining of LViT.
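For reference, saving the best UNet checkpoint could look roughly like this (a sketch only; the file name and checkpoint keys here are assumptions, not the exact code in the repository):

```python
import torch

# Hypothetical sketch: after training the UNet on MoNuSeg, save the best
# weights so that pretrained_UNet_model_path can later point at them.
def save_checkpoint(model, optimizer, epoch, path='best_model-UNet.pth.tar'):
    state = {
        'epoch': epoch,                        # epoch at which the best dice was reached
        'state_dict': model.state_dict(),      # UNet encoder/decoder weights
        'optimizer': optimizer.state_dict(),   # optional, useful for resuming
    }
    torch.save(state, path)
```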
Hi Zihan,
Thank you for your patient guidance!
Now, I have successfully trained the UNet network on the 'MoNuSeg' task using the code you provided. If I want to use it for the pretraining of LViT to reproduce the results you show here, should I rewrite the pretrained_UNet_model_path here to import the pretrained weights?
Please don't close the issue for now. I will close it promptly when it's resolved. Thank you again for your professional guidance!
Best regards, Pengyu
Yes, you should change the load path to your own path.
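For example (pretrained_UNet_model_path is the variable you mentioned; the path below is only a placeholder, replace it with your own checkpoint location):

```python
# Config.py (or wherever pretrained_UNet_model_path is defined) -- sketch only;
# point it at the checkpoint produced by your own UNet run on MoNuSeg.
pretrained_UNet_model_path = './checkpoints/MoNuSeg/best_model-UNet.pth.tar'
```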
Hi Zihan,
I have changed the load path to my own path as shown here:
The model, best_model-UNet.pth.tar, was obtained by training the UNet model on the MoNuSeg dataset. I used this pretrained UNet model for the pretraining of LViT. Could you please confirm whether this approach is appropriate?
Now, I have got the test results of LViT and LViT w/o pretrain on the MoNuSeg task as shown here:
Upon evaluating the results of my experiments, I observed a decrease in the performance of the LViT model after loading the UNet pretrained model. This was an unexpected outcome. To understand this issue better, I delved into the training logs. Interestingly, I found that while the convergence speed of LViT was indeed faster with the pretrained model (achieving optimal performance at epoch 89) compared to LViT without pretraining (achieving optimal performance at epoch 130), the validation results told a different story. Specifically, the mean dice score for the LViT model without pretraining was 0.8062, while the pretrained LViT model only achieved a score of 0.8052, indicating a decrease in validation performance.
LViT w/o pretrain training log:
LViT training log:
This outcome is puzzling to me, and I'm currently exploring potential reasons for this discrepancy. I am considering whether adjusting the learning rate from 1e-3 to 3e-4 and increasing the early_stopping_patience might lead to better results. I would greatly value your opinion on this approach. Do you think these changes could potentially address the issues I'm encountering? Additionally, are there other modifications you would recommend? Any guidance you can provide would be immensely helpful.
Best regards, Pengyu
The optimal performance may depend on the GPU used, so you should search for the optimal hyperparameters on the platform you are using. You can also decrease the learning rate and increase the early-stopping patience to get a better result.
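For instance, assuming the hyperparameters are defined in Config.py under the names learning_rate and early_stopping_patience (please check your own copy, as the exact names and location may differ), the change could look like:

```python
# Config.py -- hypothetical excerpt, variable names taken from this thread
learning_rate = 3e-4            # lowered from 1e-3 for finer convergence with pretrained weights
early_stopping_patience = 100   # allow more epochs without improvement before stopping
```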
Ok, thank you! Your advice has been incredibly helpful, and I believe I now have a good understanding of how to effectively utilize the UNet pretrained model in my experiments.
I am ready to close this issue. Thank you once again for your invaluable support and insights.
Dear Zihan,
Thank you for your impressive work!
I have been following your work and I have successfully run the code with the setting 'model_type="LViT"', which does not require a pretrained model. However, I am facing some challenges in understanding how to obtain the pretrained model required for the 'LViT_pretrain' setting.
As discussed in Section 2.1 of your instructions:
It appears that a U-Net model might be a prerequisite for the LViT model. Could you please clarify if this is the case? If so, does this imply that I need to train a U-Net model first before proceeding with 'LViT_pretrain'? Also, in such a scenario, should I change the model type and write the corresponding code here?
Furthermore, I am also curious about how to load the pretrained U-Net model once it is obtained. Is this U-Net model directly applicable to LViT_pretrain, or are there additional steps or modifications required?
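For context, this is the kind of loading scheme I had in mind: copy only the parameters whose names and shapes match, since LViT has extra ViT-branch modules that a plain U-Net does not. A rough sketch, assuming standard PyTorch state dicts (the 'state_dict' checkpoint key is a guess on my part):

```python
import torch

def load_unet_pretrain(lvit_model, ckpt_path):
    """Copy matching UNet weights into LViT; leave the ViT-branch parameters untouched."""
    checkpoint = torch.load(ckpt_path, map_location='cpu')
    unet_state = checkpoint.get('state_dict', checkpoint)  # assumes a 'state_dict' key, else a raw state dict
    lvit_state = lvit_model.state_dict()
    # Keep only entries that also exist in LViT with identical tensor shapes.
    matched = {k: v for k, v in unet_state.items()
               if k in lvit_state and v.shape == lvit_state[k].shape}
    lvit_state.update(matched)
    lvit_model.load_state_dict(lvit_state)
    print(f'Loaded {len(matched)}/{len(lvit_state)} tensors from the UNet checkpoint')
    return lvit_model
```

Please let me know if this matches how the weights are meant to be reused, or if the repository handles the loading differently.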
Your guidance on these matters would be greatly appreciated, as it would greatly assist me in understanding and utilizing your work more effectively.
Thank you for your time and consideration. I am looking forward to your valuable insights.
Best regards, Pengyu Zhao