Support for ViT-L and ViT-H Image Encoder Checkpoints

MathieuNlp / Sam_LoRA

Segment Your Ring (SYR) - Segment Anything model adapted with LoRA to segment rings.

MIT License

Support for ViT-L and ViT-H Image Encoder Checkpoints #7

Open Berlin000000 opened 4 months ago

Berlin000000 commented 4 months ago

I have noticed that the current implementation only supports the image encoder checkpoint for ViT-B. Could you please clarify why only the ViT-B checkpoint is supported? Additionally, if I want to use the corresponding checkpoints for larger models, such as ViT-L (Large) or ViT-H (Huge), what steps should I take to implement this support? Are there specific modifications or considerations required to extend compatibility to these larger models?

MathieuNlp commented 3 months ago

Hello,

We load the vit_b model in train.py with the builder from SAM: https://github.com/MathieuNlp/Sam_LoRA/blob/main/train.py#L30

If you want to load another model, use one of the other SAM builders defined in this file and pass it the matching checkpoint: https://github.com/MathieuNlp/Sam_LoRA/blob/main/src/segment_anything/build_sam.py
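As a rough illustration of swapping the builder, here is a minimal sketch. It assumes the vendored build_sam.py exposes the same builder names as the official segment_anything package (`build_sam_vit_b`, `build_sam_vit_l`, `build_sam_vit_h`), each accepting a `checkpoint` argument, and that the code runs from the repository root so the module is importable as `src.segment_anything.build_sam`. The helpers `builder_name_for` and `load_sam` are hypothetical names used only for this example; they are not part of the repository.

```python
import importlib

# Builder names as defined in the official segment_anything build_sam.py.
# Assumption: the copy under src/segment_anything/ exposes the same names.
SAM_BUILDERS = {
    "vit_b": "build_sam_vit_b",
    "vit_l": "build_sam_vit_l",
    "vit_h": "build_sam_vit_h",
}


def builder_name_for(model_type: str) -> str:
    """Return the builder function name for a model type (hypothetical helper)."""
    try:
        return SAM_BUILDERS[model_type]
    except KeyError:
        valid = ", ".join(sorted(SAM_BUILDERS))
        raise ValueError(
            f"Unknown model type {model_type!r}; expected one of: {valid}"
        ) from None


def load_sam(model_type: str, checkpoint: str,
             module_path: str = "src.segment_anything.build_sam"):
    """Build a SAM model of the requested size (hypothetical helper).

    The module is imported lazily so that the name lookup above can be used
    without the package installed. `module_path` is an assumption based on
    the file location linked in this thread.
    """
    module = importlib.import_module(module_path)
    builder = getattr(module, builder_name_for(model_type))
    # The checkpoint file must belong to the same encoder size as the builder.
    return builder(checkpoint=checkpoint)


# Example usage (the checkpoint path is a placeholder):
# sam = load_sam("vit_l", "path/to/vit_l_checkpoint.pth")
```

In train.py this would amount to replacing the `build_sam_vit_b` call with the builder for the size you want. LoRA weights trained on top of the ViT-B encoder will most likely not load into a larger encoder, since the layer shapes differ, so expect to retrain them.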

Hope it answers your question.