How to train Visual Grounding only without including the function of audio?

magic-research / bubogpt

BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs

https://bubo-gpt.github.io/

BSD 3-Clause "New" or "Revised" License


How to train Visual Grounding only without including the function of audio? #15

Status: Open. Opened by becauseofAI 1 year ago.

becauseofAI commented 1 year ago

I want to skip audio training and only train and use the visual grounding capability within the BuboGPT framework. How should I go about this? Step-by-step guidance would be greatly appreciated.