farewellthree / STAN

Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"
Apache License 2.0

STAN-LLaVa #20

Closed by insafim 5 months ago

insafim commented 5 months ago

When will you release the code for STAN-LLaVa?

farewellthree commented 5 months ago

Please refer to BT-Adapter, where we release a similar but stronger model for video conversation.