TIO-IKIM / CellViT

CellViT: Vision Transformers for Precise Cell Segmentation and Classification
https://doi.org/10.1016/j.media.2024.103143

Train with one nuclei type #50

Closed · vadori closed this issue 4 months ago

vadori commented 4 months ago

Hi!

Thanks for sharing. I am trying to train the model on a dataset with only one nuclei type. If I set num_nuclei_classes = 2, does the branch for nuclei type prediction still get constructed? I will dig into the code, but it would be nice to get an answer/confirmation if you already know :) Should I delete the branch and everything related to it manually, as suggested in another issue for getting predictions on a single tissue type?

Thank you!!

FabianHoerst commented 4 months ago

Hi!

I think the most straightforward way would be to remove everything related to the classification head (type prediction). The head is not removed automatically; you need to adapt the model by removing the branch: https://github.com/TIO-IKIM/CellViT/blob/ebcc23c77c84b7d104c1b124ce9f6b71113ae434/models/segmentation/cell_segmentation/cellvit.py#L149

You also need to modify the following files: `cell_segmentation/trainer/trainer_cellvit.py` and `cell_segmentation/experiments/experiment_cellvit_pannuke.py`.
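To illustrate the idea (this is a toy sketch, not the actual CellViT code: `TinyCellSegHead`, `with_type_branch`, and the output-map names are made up for this example), making the type-prediction branch optional so it is never constructed for single-class training could look roughly like this:

```python
import torch
import torch.nn as nn

class TinyCellSegHead(nn.Module):
    """Toy stand-in for the CellViT decoder heads.
    with_type_branch=False skips building the nuclei-type branch entirely."""

    def __init__(self, in_ch: int, num_nuclei_classes: int, with_type_branch: bool = True):
        super().__init__()
        self.binary_branch = nn.Conv2d(in_ch, 2, kernel_size=1)  # foreground/background
        self.hv_branch = nn.Conv2d(in_ch, 2, kernel_size=1)      # horizontal/vertical maps
        self.type_branch = (
            nn.Conv2d(in_ch, num_nuclei_classes, kernel_size=1)
            if with_type_branch
            else None
        )

    def forward(self, x: torch.Tensor) -> dict:
        out = {
            "nuclei_binary_map": self.binary_branch(x),
            "hv_map": self.hv_branch(x),
        }
        if self.type_branch is not None:
            out["nuclei_type_map"] = self.type_branch(x)
        return out

head = TinyCellSegHead(in_ch=8, num_nuclei_classes=2, with_type_branch=False)
out = head(torch.randn(1, 8, 16, 16))
print(sorted(out.keys()))  # ['hv_map', 'nuclei_binary_map']
```

With the branch gone, the corresponding loss terms and metrics in the trainer and experiment files would also have to be dropped, which is why those two files need editing as well.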

vadori commented 4 months ago

Dear Fabian,

thank you for your response, much appreciated!! I actually modified the scripts to handle the single-tissue-type case and it runs perfectly. I am not sure I will have time to adapt the code to the single-nuclei-type case, but the modifications appear similar, possibly a little more involved. Again, thanks for sharing!!

vadori commented 4 months ago

Hi again @FabianHoerst,

Training was running fine until epoch 25, when the encoder gets unfrozen. From that point on, the loss becomes NaN in certain epochs until, I guess, the model is somehow corrupted and the loss is NaN every time. Has this ever happened to you during training? It may be due to mixed precision, so I am now trying to set mixed precision to False. Any other suggestions? Any response, especially if you faced the same issue, would be greatly appreciated, thank you!! P.S. I am using a pre-trained SAM-H.
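Besides disabling mixed precision, one common mitigation for this failure mode is to guard the training loop so that a non-finite loss skips the optimizer step instead of corrupting the weights. A minimal sketch (the `safe_step` helper is hypothetical, not part of the CellViT trainer):

```python
import math

def safe_step(loss_value: float, apply_update) -> bool:
    """Apply the optimizer update only if the loss is finite.

    Returns True if the update was applied, False if the batch was skipped.
    """
    if not math.isfinite(loss_value):
        return False  # skip this batch; model weights stay intact
    apply_update()
    return True

# Usage: a NaN loss skips the step instead of poisoning the model.
ok = safe_step(float("nan"), lambda: None)  # → False
```

PyTorch's `GradScaler` already skips steps whose gradients contain inf/NaN, but a guard like this also catches the case where the loss itself is already non-finite before backpropagation.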

vadori commented 4 months ago

Hi again @FabianHoerst,

> Training was running fine until epoch 25, when the encoder gets unfrozen. […] P.S. I am using a pre-trained SAM-H.

I solved this issue by disabling mixed precision. However, I also noticed that after unfreezing the encoder (epoch 25), the performance decreased and then started increasing again. Has this happened to you? It happened to me with another model because of trainable batch normalization layers. Could the cause be the same here with CellViT? Thank you again.
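If the dip were caused by batch-norm running statistics (as in the other model mentioned above), a common mitigation is to unfreeze the encoder weights while keeping the normalization layers in eval mode. A rough sketch (the helper name is hypothetical; note also that a ViT/SAM encoder uses LayerNorm, which has no running statistics, so this applies mainly to BatchNorm-based backbones):

```python
import torch.nn as nn

def unfreeze_except_norm_stats(encoder: nn.Module) -> None:
    """Make encoder weights trainable, but keep BatchNorm running stats frozen."""
    for p in encoder.parameters():
        p.requires_grad = True
    for m in encoder.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.eval()  # stop updating running_mean / running_var

# Usage on a toy BatchNorm-based encoder:
encoder = nn.Sequential(nn.Conv2d(3, 4, 3), nn.BatchNorm2d(4))
for p in encoder.parameters():
    p.requires_grad = False  # pretend the encoder was frozen until epoch 25
unfreeze_except_norm_stats(encoder)
```

Caveat: `model.train()` at the start of each epoch flips the BN layers back to training mode, so this would have to be re-applied after every such call.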

FabianHoerst commented 4 months ago

Thanks for the update! Indeed, I experienced the same behavior, but this seems reasonable to me, as the domain shift between the pretraining task and segmentation is fairly large.

I will close the issue now. If you have any other questions, feel free to reach out again!

Best, Fabian