-
Thanks for your great work!
In your project, the caption branch is trained only on VG data. Its captioning ability may be weaker than that of models trained on large caption datasets with a large language model. Have you…
-
### Discussion
### LLaVA-Med V1.6: Training a Large Language-and-Vision Assistant for Biomedicine in Two and Half Hours
#### Abstract
Large Language Models (LLMs) have revolutionized natural la…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
### Describe the bug
1. The session length is inconsistent,…
-
Hello!
This is a truly remarkable contribution to the open-source MLLM community! I have a couple of questions regarding Cambrian:
1. The paper suggests that unfreezing the vision encoder could …