NielsRogge opened this issue 6 months ago
Hello @NielsRogge, thanks for the issue. Currently, there are difficulties in loading a state dict for SlimSAM, primarily due to the use of global pruning in the model compression process. This technique results in altered intermediate dimensions for each ViT block within the encoder. To facilitate easier state dict loading, we anticipate releasing SlimSAM versions that use local uniform pruning by next week. Additionally, the q, k, v weights will be recombined into a single matrix in the new SlimSAMs. We will notify you once they are released. Thanks!
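For readers unfamiliar with the recombination step, here is a minimal, self-contained sketch (with toy dimensions of my own choosing, not SlimSAM's actual shapes) of how separate q, k, v projection weights can be fused into one qkv matrix:

```python
import torch

# Toy embedding dimension for illustration only.
embed_dim = 8

# Separate per-head projection weights, as in the pruned checkpoints.
q_w = torch.randn(embed_dim, embed_dim)
k_w = torch.randn(embed_dim, embed_dim)
v_w = torch.randn(embed_dim, embed_dim)

# Concatenate along the output dimension: the fused projection maps
# embed_dim -> 3 * embed_dim, matching nn.Linear(embed_dim, 3 * embed_dim).
qkv_w = torch.cat([q_w, k_w, v_w], dim=0)

x = torch.randn(1, embed_dim)
fused = x @ qkv_w.T
q, k, v = fused.chunk(3, dim=-1)

# The fused projection reproduces the three separate projections.
assert torch.allclose(q, x @ q_w.T, atol=1e-6)
assert torch.allclose(v, x @ v_w.T, atol=1e-6)
```

The same concatenation would apply to the bias vectors, if present.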
Awesome, looking forward!
Hello @NielsRogge, here are the local uniform pruning SlimSAM models:

- SlimSAM-50-uniform: the SlimSAM-50 model.
- SlimSAM-77-uniform: the SlimSAM-77 model.

The above models can be instantiated by running:
```python
import torch
from segment_anything import sam_model_registry

model_type = 'vit_p50'
checkpoint = 'checkpoints/SlimSAM-50-uniform.pth'

# Pick a device before moving the model (this line was missing above).
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

SlimSAM_model = sam_model_registry[model_type](checkpoint=checkpoint)
SlimSAM_model.to(device)
SlimSAM_model.eval()
```
Very cool, just converted and pushed the checkpoints:
One can use them as explained in the Hugging Face docs.
Would you be interested in transferring these checkpoints to your account or the University of Singapore organization on the Hugging Face Hub? Currently they are part of my account (nielsr).
Also we can add some nice model cards (READMEs)
Sure, thanks for your help! https://huggingface.co/Zigeng/SlimSAM-uniform-50 https://huggingface.co/Zigeng/SlimSAM-uniform-77
Hi,
Can we plug the SlimSAM weights into SAM (by recombining the q, k, v weights into a single matrix per layer)?
If so, SlimSAM could be ported easily to the 🤗 Hub. Currently I'm getting errors like:
It seems that the dimensions differ per layer due to the pruning. Is there any way to load such a state dict in PyTorch?
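For reference, `strict=False` does not help here: it only tolerates missing or unexpected keys, while tensors whose shapes differ still raise an error. A minimal sketch of a shape-filtering workaround (the helper name is my own, and note it simply skips the mismatched tensors rather than loading them):

```python
import torch


def load_compatible(model, state_dict):
    """Load only the entries of state_dict whose keys exist in the model
    and whose tensor shapes match; skip everything else."""
    own = model.state_dict()
    compatible = {
        k: v for k, v in state_dict.items()
        if k in own and own[k].shape == v.shape
    }
    # strict=False tolerates the keys we filtered out.
    model.load_state_dict(compatible, strict=False)
    return compatible  # the entries that were actually loaded


# Toy demonstration: the 'weight' shape mismatches, so only 'bias' loads.
m = torch.nn.Linear(4, 4)
sd = {"weight": torch.randn(3, 4), "bias": torch.randn(4)}
loaded = load_compatible(m, sd)
assert set(loaded) == {"bias"}
```

This keeps the mismatched layers at their initialized values, so for pruned checkpoints the real fix remains a model definition whose per-layer dimensions match the checkpoint.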