hustvl / Vim

[ICML 2024] Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Apache License 2.0
Add integration with Hugging Face #7

Closed NielsRogge closed 4 months ago

NielsRogge commented 5 months ago

Hi,

Thanks for this nice work. I wrote a quick PoC to show that you can easily add an integration that lets you automatically load the various VisionMamba models using from_pretrained (and push them using push_to_hub), track download numbers for your models (similar to models in the Transformers library), and have nice model cards on a per-model basis. It leverages the PyTorchModelHubMixin class, which lets a model class inherit these methods.

Usage is as follows:

from vim.models_mamba import VisionMambaForImageClassification
from PIL import Image
from torchvision.transforms import Compose, Resize, ToTensor, Normalize

# Load the pretrained weights from the Hugging Face Hub
model = VisionMambaForImageClassification.from_pretrained("nielsr/mamba-vision-tiny")

# Preprocessing: resize, convert to a tensor, normalize with ImageNet statistics
transform = Compose([
    Resize(224),
    ToTensor(),
    Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open(...)
inputs = transform(image).unsqueeze(0)  # add a batch dimension
logits = model(inputs)

We could move all checkpoints to separate repos on the HUSTVL organization if you're interested.

I saw that download numbers currently aren't tracked, since the existing checkpoints don't use this integration.

Would you be interested in this integration?

Kind regards,

Niels

165412152 commented 5 months ago

Hello, could you please explain why the line 'self.mixer = mixer_cls(dim)' is causing a TypeError: Mamba.__init__() got an unexpected keyword argument 'bimamba_type'?

NielsRogge commented 5 months ago

Hi @LegendBC @xinggangw @JingfengYao would you be interested in this?

Johnny-Haytham commented 1 month ago

Answer to the question “why is the line 'self.mixer = mixer_cls(dim)' causing a TypeError: Mamba.__init__() got an unexpected keyword argument 'bimamba_type'”:

Vim rewrites the Mamba class in mamba_ssm, so replace your mamba_simple.py with the version from the official Vim code.
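
One way to confirm which variant of the class is installed is to inspect its constructor signature. A minimal sketch follows; the helper accepts_kwarg and both stand-in classes are hypothetical and only mimic the two situations described in this thread:

```python
import inspect


def accepts_kwarg(cls, name):
    """Return True if cls.__init__ can be called with `name` as a keyword."""
    params = inspect.signature(cls.__init__).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    # Either the parameter is declared explicitly...
    if name in params and params[name].kind in keyword_kinds:
        return True
    # ...or the constructor swallows arbitrary keywords via **kwargs
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class UpstreamLikeMamba:
    """Stand-in for a Mamba class without the extra argument (assumption)."""

    def __init__(self, d_model, d_state=16):
        self.d_model = d_model


class VimLikeMamba:
    """Stand-in for a Mamba class that takes bimamba_type (assumption)."""

    def __init__(self, d_model, d_state=16, bimamba_type="none"):
        self.d_model = d_model
        self.bimamba_type = bimamba_type


print(accepts_kwarg(UpstreamLikeMamba, "bimamba_type"))  # False
print(accepts_kwarg(VimLikeMamba, "bimamba_type"))       # True
```

In a real environment, the same helper could be pointed at the class imported from mamba_ssm; if it returns False, the installed package is presumably not the Vim version.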