med-air / Endo-FM

[MICCAI'23] Foundation Model for Endoscopy Video Analysis via Large-scale Self-supervised Pre-train
Apache License 2.0

Problems during training #13

Closed licaizi closed 2 months ago

licaizi commented 5 months ago

Thanks for your excellent work. I'm curious about the settings in the script "train_clips32k.sh": how should I set options such as "arch", "batch_size_per_gpu", and "opts" to reach performance comparable to that reported in the paper?

One more thing: the learning rate looks too small during training. What should I do to make it normal? Thanks again for your work!


Kyfafyd commented 5 months ago

Hi @licaizi, thanks for your interest! We use the ViT-B/16 architecture with batch_size 4 (for a 24 GB GPU). You should not need to modify anything here; just run train_clips32k.sh. The learning rate is small at the beginning because training starts with a warmup phase.
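To illustrate why a warmup makes the logged learning rate look tiny early on, here is a minimal sketch of a linear warmup followed by cosine decay. This is a common schedule, not necessarily the exact one used in this repository; the function name `lr_at_step` and all parameter values below are hypothetical and chosen only for illustration.

```python
import math


def lr_at_step(step, total_steps, warmup_steps, base_lr, start_lr=0.0, final_lr=0.0):
    """Illustrative schedule (an assumption, not the repo's code):
    linear warmup from start_lr to base_lr, then cosine decay to final_lr."""
    if step < warmup_steps:
        # Warmup: the learning rate grows linearly, so it is very small at first.
        return start_lr + (base_lr - start_lr) * step / warmup_steps
    # After warmup: cosine decay from base_lr down to final_lr.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return final_lr + 0.5 * (base_lr - final_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Hypothetical values: 1000 total steps, 100 warmup steps, peak lr of 1e-3.
    for step in (0, 1, 10, 50, 100, 500, 1000):
        print(step, lr_at_step(step, total_steps=1000, warmup_steps=100, base_lr=1e-3))
```

Under such a schedule, the values printed for the first few steps are a small fraction of the peak learning rate, which matches the behavior described above: the rate rises on its own as warmup progresses.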

licaizi commented 5 months ago

@Kyfafyd got it, very grateful for your reply!