We are trying to use the InternVideo2 multi_modality model, but right now it is quite painful: we have to git clone the repo, fix the imports, and use a custom pyproject.toml to pip install it.
This change makes it a little easier. Once merged, you can run:
to install it. For the two flash_attn modules, provide more pip workers: