facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0

Plans for Parameter Efficient Fine Tuning (PEFT)? #358

Open 25benjaminli opened 1 month ago

25benjaminli commented 1 month ago

For instance, adding LoRA to the image encoder? Here is a repository I made that attempts to apply LoRA to the attention layers in the image encoder, although I didn't find significant performance gains. I would appreciate feedback on the validity of the approach.

25benjaminli commented 1 month ago

To be clear, my implementation for SAM 2 is similar to the version written for the original Segment Anything. Instead of targeting the query and value projections of the regular attention blocks, as was done for SAM 1, I targeted the q and v projections of the MultiScaleBlock. Are there any additional modifications that need to be made?
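
For readers following along, a minimal sketch of this kind of LoRA wrapper might look like the following. The attribute names used to reach the attention projections (`image_encoder.trunk.blocks`, `blk.attn.qkv`) are assumptions about the SAM 2 module tree and should be checked against the actual model; Hiera's attention may also use a fused q/k/v projection, so restricting the update to q and v only (as described above) would require slicing the low-rank update.

```python
import torch.nn as nn


class LoRALinear(nn.Module):
    """Wrap an existing nn.Linear: frozen base weight plus a trainable low-rank update."""

    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # freeze the pretrained projection
        self.lora_a = nn.Linear(base.in_features, r, bias=False)
        self.lora_b = nn.Linear(r, base.out_features, bias=False)
        nn.init.zeros_(self.lora_b.weight)  # start as a no-op so the wrapped model matches the original
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * self.lora_b(self.lora_a(x))


def add_lora_to_image_encoder(model, r: int = 4):
    """Freeze the image encoder and wrap its attention projections with LoRA.

    `model.image_encoder.trunk.blocks` and `blk.attn.qkv` are assumed names for the
    Hiera blocks and their fused q/k/v projection; verify them with model.named_modules().
    """
    for p in model.image_encoder.parameters():
        p.requires_grad = False
    for blk in model.image_encoder.trunk.blocks:
        blk.attn.qkv = LoRALinear(blk.attn.qkv, r=r)
    return model
```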

iamwangyabin commented 1 month ago

Hi, good idea. What is the memory usage when training with LoRA?

pcarre-lab commented 1 month ago

@25benjaminli
would you be able to share the LoRA code for SAM 2? Did you find significant gains when training on a custom dataset?

25benjaminli commented 1 month ago

@pcarre-lab here is the code: https://github.com/25benjaminli/sam2lora

To be completely honest, I didn't find much difference fine-tuning this version. The training speed seemed about the same, and so did the segmentation performance. I'm not sure whether this is the product of a bug in my implementation or whether LoRA simply doesn't work well with Hiera.

EricLina commented 1 week ago

I've tried LoRA on SAM 2 with peft; it reduces memory consumption by about 20% with the base-size model (from 15474M to 12070M).

EricLina commented 1 week ago

But I am not sure if the peft package is well suited to SAM 2.
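
For comparison, the peft route might look roughly like the sketch below. The config and checkpoint names follow the repo's README, while the `target_modules` entries ("qkv", "proj") are assumptions about how the attention projections are named inside the image encoder and should be verified against `model.named_modules()` before training.

```python
from peft import LoraConfig, get_peft_model
from sam2.build_sam import build_sam2

# Config and checkpoint names follow the README; adjust paths to your local setup.
model = build_sam2("sam2_hiera_b+.yaml", "checkpoints/sam2_hiera_base_plus.pt")

# "qkv" and "proj" are assumed suffixes of the attention projection modules;
# list the actual Linear module names with `model.named_modules()` first.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["qkv", "proj"],
    bias="none",
)

peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()  # sanity check: only the LoRA adapters should be trainable
```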