TIO-IKIM / CellViT

CellViT: Vision Transformers for Precise Cell Segmentation and Classification
https://doi.org/10.1016/j.media.2024.103143

Do you find the training unstable for ViT-S based encoders? #53

Closed swarajnanda2021 closed 3 months ago

swarajnanda2021 commented 3 months ago

I've run into NaN issues when training with ViT-S encoders, but when I change the DeConv2d params from:

        self.bottleneck_dim = 312

to:

        self.bottleneck_dim = 256

This issue is somewhat ameliorated.

Could there be an issue with dimensioning the decoder blocks for embedding dimensions less than 512?

FabianHoerst commented 3 months ago

Hello, I did not experience instabilities during training with the ViT-S encoders. Since I am not able to reproduce the problem, I can only guess: the error might arise from mixed-precision training, a wrong input scale, differing package versions, etc.

I am very sorry that I cannot help further. If changing the dimension works for you and makes the training more stable, I would recommend using your version.

swarajnanda2021 commented 3 months ago

Hello, switching to full precision stabilized the training.
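
For readers wondering why full precision can help: half-precision floats (fp16) top out at 65504, so a large intermediate value can overflow to inf and then turn into NaN in a later division or subtraction. The snippet below is only an illustrative sketch of that mechanism using the Python standard library. It is not CellViT code, the helper names `to_fp16` and `naive_softmax` are made up for this example, and the thread does not establish which operation actually overflowed in the reported runs.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def to_fp16(x: float) -> float:
    """Round a Python float to half precision, overflowing to +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:
        # struct rejects out-of-range values; fp16 arithmetic would give inf.
        return math.copysign(math.inf, x)


def naive_softmax(logits, half):
    """Softmax without max-subtraction (hypothetical helper).

    With half=True every intermediate is rounded to fp16; otherwise plain
    Python floats (float64) stand in for full precision.
    """
    cast = to_fp16 if half else float
    exps = [cast(math.exp(v)) for v in logits]
    total = cast(sum(exps))
    return [cast(e / total) for e in exps]


logits = [12.0, 1.0]  # exp(12) is about 162755, which exceeds FP16_MAX
print(naive_softmax(logits, half=True))   # first entry is inf / inf
print(naive_softmax(logits, half=False))  # finite probabilities
```

In the half-precision path the first exponential overflows to inf and the normalisation computes inf / inf, which is NaN; the same inputs stay finite in full precision. Whether this is what happened in the reported training runs is an assumption, not something confirmed in this thread.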
