OpenGVLab / VideoMamba

VideoMamba: State Space Model for Efficient Video Understanding
https://arxiv.org/abs/2403.06977
Apache License 2.0
660 stars · 47 forks

Integrate Hugging Face Hub capabilities with PyTorchModelHubMixin #16

Closed NielsRogge closed 2 months ago

NielsRogge commented 3 months ago

Dear authors,

I noticed your awesome repository trending on GitHub and wanted to introduce you to a new feature from Hugging Face that can enhance your model sharing and collaboration experience. Hugging Face's huggingface_hub library now offers the PyTorchModelHubMixin class, which allows you to easily add functionalities like from_pretrained, save_pretrained, and push_to_hub to your custom PyTorch models. This means you can seamlessly share your models on the Hugging Face Hub, making them readily available for others to use and experiment with.

Here's how you can integrate PyTorchModelHubMixin into your project:

Step 1: Install the huggingface_hub library:

pip install huggingface_hub

Step 2: Modify your model class:

Here's an example with your VisionMamba model:

import torch
import torch.nn as nn
from huggingface_hub import PyTorchModelHubMixin

class VisionMamba(
    nn.Module,
    PyTorchModelHubMixin,
    library_name="VideoMamba",  # Optional
    repo_url="https://github.com/OpenGVLab/VideoMamba",  # Optional
    docs_url="https://github.com/OpenGVLab/VideoMamba#readme",  # Optional
):
    # ... Your model code ...

# ... Rest of your code ...

Step 3: Save and share your model:

Use the newly available from_pretrained, save_pretrained, and push_to_hub methods.

This not only simplifies model sharing but also gives you download metrics, a model card, and discoverability for each model.

By integrating PyTorchModelHubMixin, you can leverage the Hugging Face Hub's vibrant community and resources to reach a wider audience and facilitate collaboration. Feel free to explore the documentation for more details.

I believe this addition can significantly benefit your project and encourage further engagement with your work.

Let me know if you have any questions or need assistance with the integration!

Andy1621 commented 3 months ago

Thanks for your great suggestion!

From your information, it seems that PyTorchModelHubMixin is a more elegant way to create a model card and upload the weights.

However, I'm not sure how I should use this feature now, since I have already uploaded the model weights manually here.

NielsRogge commented 3 months ago

Hi @Andy1621, thanks for your reply.

Actually, we recommend uploading each checkpoint to a separate repository, so that you get download metrics, a model card, and discoverability per model. You can then create a collection that groups the repositories together, like this one: https://huggingface.co/collections/llava-hf/llava-next-65f75c4afac77fd37dbbe6cf.

Hence, would it be possible to leverage PyTorchModelHubMixin and push the checkpoints to individual repos? Let me know if you need any help.
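Something along these lines could do it (a sketch; the builder names follow videomamba.py, but the repo ids and the repo_id_for helper are hypothetical):

```python
# Hypothetical list of checkpoint builders to split into individual repos
CHECKPOINTS = ["videomamba_tiny", "videomamba_small", "videomamba_middle"]

def repo_id_for(name: str) -> str:
    """Map a builder name like 'videomamba_middle' to its own Hub repo id."""
    return "OpenGVLab/" + name.replace("_", "-")

# For each builder, load its pretrained weights and push to a separate repo:
# for name in CHECKPOINTS:
#     model = getattr(models, name)(pretrained=True)  # models = the videomamba module
#     model.push_to_hub(repo_id_for(name))
```

The push loop is commented out since it requires the checkpoints and write access to the org.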

Kind regards,

Niels

NielsRogge commented 3 months ago

I've pushed some updates, as the mixin can also be added to the VisionTransformer class besides VisionMamba. The tags can differ depending on the model.

Also note that we recently added support for making arbitrary repositories official on the Hub. This can be done by opening a PR on this file. It means a "Use with VideoMamba" button would appear on the checkpoints, similar to the "Use with Transformers" button.

Andy1621 commented 3 months ago

Hi! I'm willing to merge the PR, but can you help me split the previous model weights into individual ones?

BTW, the video models we use are videomamba.py, videomamba_pretrain.py, and umt_videomamba.py, and the image models we use are videomamba.py and videomamba_distill.py.

NielsRogge commented 3 months ago

Hi,

I don't have write access to the OpenGVLab organization on the šŸ¤— hub, would you be interested in trying it out yourself?

Basically all that needs to be done is (to give one example):

from videomamba.video_sm.models.videomamba import videomamba_middle

# assuming the MODEL_PATH environment variable is set
model = videomamba_middle(pretrained=True)

model.push_to_hub("OpenGVLab/video-mamba-middle")

This will then push the weights of this checkpoint to the hub, making sure download metrics work.

Andy1621 commented 3 months ago

Hi! We can give you write access. What's your hf id :hugs:?

NielsRogge commented 3 months ago

It's nielsr

Andy1621 commented 3 months ago

Great! We have invited you at hf :hugs:!

NielsRogge commented 3 months ago

Thanks! Can this model be loaded on Google Colab? I'm installing the necessary requirements (using the A100 GPU runtime).

However it fails with ModuleNotFoundError: No module named 'mamba_ssm'.

Could you take a look? https://colab.research.google.com/drive/1Goifa_W6x9FZyAjJaFtDn8BBnh_SUDod?usp=sharing

Andy1621 commented 3 months ago

Hi! It seems that mamba-ssm is not installed correctly. In the hf space, I have installed it with pre-compiled packages, as in this file.
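In the meantime, a quick way to verify whether the compiled extensions are importable in the current runtime (a sketch; it only probes the two package names, it does not install anything):

```python
import importlib.util

# Both compiled-extension packages must resolve before VideoMamba models can load
status = {
    name: importlib.util.find_spec(name) is not None
    for name in ("mamba_ssm", "causal_conv1d")
}
print(status)
```

If either entry is False, the wheel install did not take effect in the running kernel (on Colab, a runtime restart after installation is sometimes needed).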

NielsRogge commented 2 months ago

Hi @Andy1621 when loading the VideoMamba models on Google Colab I'm getting this issue:

/usr/local/lib/python3.10/dist-packages/causal_conv1d/causal_conv1d_interface.py in <module>
----> 7 import causal_conv1d_cuda

ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory

Do you know how to resolve this issue?

Installing PyTorch 2.1 doesn't resolve it either:

----> 7 import causal_conv1d_cuda

ImportError: /usr/local/lib/python3.10/dist-packages/causal_conv1d_cuda.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN2at4_ops10zeros_like4callERKNS_6TensorESt8optionalIN3c1010ScalarTypeEES5_INS6_6LayoutEES5_INS6_6DeviceEES5_IbES5_INS6_12MemoryFormatEE

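Both errors point at a mismatch between the prebuilt causal_conv1d_cuda extension and the runtime's PyTorch/CUDA pair. A quick way to see what the runtime actually provides (a sketch):

```python
import torch

# The extension wheel must have been built against both of these values;
# e.g. a wheel built for CUDA 11 looks for libcudart.so.11.0,
# which a CUDA 12 runtime does not ship.
print("torch:", torch.__version__)
print("cuda :", torch.version.cuda)
```

Comparing these against the versions the wheel was built for usually explains the missing-library and undefined-symbol failures.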
Andy1621 commented 2 months ago

Hi! Since I did not change the code of causal-conv1d, it may be more helpful to ask its authors.

NielsRogge commented 2 months ago

I'm not installing the package from their repo; I've only installed the custom Mamba package included in the videomamba repository.

Could you clarify how to run the videomamba models on Google Colab as done in this notebook? This would allow me to push all the checkpoints to the hub.

I see that on the šŸ¤— Space, the custom wheels are installed, so maybe we need something similar on Colab.

Andy1621 commented 2 months ago

Hi! Since I'm not familiar with how Google Colab runs, it's hard for me to give a reasonable solution šŸ˜­.