-
🙂🙏 Thanks for open-sourcing this!
After training on my own data the results actually got worse; could you help me figure out what the problem is? Thanks in advance.
**1. Training data**
My data consists of single-line text images that are composited into a single page image with multiple lines (2–10 lines at random), for a total of 10,000 synthesized images; all images are grayscale.
![output_document_1](https://github.com/user-attachments/assets/a266c966-6476-449…
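For reference, a minimal sketch of how such multi-line pages could be composited from single-line grayscale crops; the directory names, margins, and counts below are placeholders, not the poster's actual pipeline.

```python
import os
import random
from PIL import Image

# Placeholder paths/parameters -- swap in your own line-image corpus.
LINE_DIR = "line_images"        # single-line grayscale crops
OUT_DIR = "synth_pages"
NUM_PAGES = 10_000              # the post mentions 10k synthesized images
LINES_PER_PAGE = (2, 10)        # 2-10 lines per page, chosen at random
MARGIN = 8                      # vertical padding between lines, in pixels

os.makedirs(OUT_DIR, exist_ok=True)
line_files = [
    os.path.join(LINE_DIR, f)
    for f in os.listdir(LINE_DIR)
    if f.lower().endswith((".png", ".jpg", ".jpeg"))
]

for i in range(NUM_PAGES):
    n_lines = random.randint(*LINES_PER_PAGE)
    lines = [Image.open(random.choice(line_files)).convert("L") for _ in range(n_lines)]

    # Page is as wide as the widest line and tall enough for all lines plus margins.
    page_w = max(img.width for img in lines)
    page_h = sum(img.height for img in lines) + MARGIN * (n_lines + 1)
    page = Image.new("L", (page_w, page_h), color=255)  # white background

    y = MARGIN
    for img in lines:
        page.paste(img, (0, y))
        y += img.height + MARGIN

    page.save(os.path.join(OUT_DIR, f"page_{i:05d}.png"))
```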
-
First, thanks for your great work!
Now we're trying to replace the vision encoder in LLaVA, i.e., CLIP-L-336, with RADIO. Under the default LLaVA 1.5 settings, we pretrain a multimodal projection MLP a…
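Not the authors' code, but for context: LLaVA 1.5's projector is a two-layer GELU MLP, so swapping in RADIO is mostly a matter of matching the input width. A minimal sketch, assuming `vision_dim` is whatever feature width your RADIO variant emits and `hidden_dim` is the LLM hidden size (both values below are illustrative).

```python
import torch
import torch.nn as nn

class MLPProjector(nn.Module):
    """Two-layer MLP that maps vision-encoder patch features into the LLM
    embedding space, in the style of LLaVA 1.5's mlp2x_gelu projector.
    vision_dim / hidden_dim are assumptions -- set them to the RADIO feature
    width and the LLM hidden size you actually use."""

    def __init__(self, vision_dim: int = 1280, hidden_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, hidden_dim),
        )

    def forward(self, patch_features: torch.Tensor) -> torch.Tensor:
        # patch_features: (batch, num_patches, vision_dim)
        return self.proj(patch_features)

# Example: project a dummy batch of RADIO-style patch features (shapes illustrative).
feats = torch.randn(2, 576, 1280)
tokens = MLPProjector()(feats)   # -> (2, 576, 4096), ready to prepend to text embeddings
```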
-
The original LLaMA max sequence length is 2048, so why does the finetuning.sh script use 512 as the max sequence length?
Is it for efficiency reasons, since the Alpaca dataset doesn't exceed…
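One way to sanity-check this is to tokenize the Alpaca data and look at the length distribution. A rough sketch, assuming the public `tatsu-lab/alpaca` dataset and any LLaMA-compatible tokenizer (the checkpoint name below is only an example):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Example tokenizer; any LLaMA-compatible tokenizer gives the same picture.
tok = AutoTokenizer.from_pretrained("huggyllama/llama-7b")
ds = load_dataset("tatsu-lab/alpaca", split="train")

def n_tokens(example):
    # Rough proxy for the fine-tuning prompt: instruction + input + output.
    text = example["instruction"] + "\n" + example["input"] + "\n" + example["output"]
    return {"len": len(tok(text).input_ids)}

lengths = ds.map(n_tokens)["len"]
over = sum(l > 512 for l in lengths)
print(f"max tokens: {max(lengths)}, examples over 512: {over}/{len(lengths)}")
```

If only a tiny fraction of examples exceed 512 tokens, capping the sequence length there mostly buys training speed and memory at negligible data loss.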
-
[paper](https://arxiv.org/abs/2311.04257)
## TL;DR
- **I read this because.. :** very recent VLM model
- **task :** VLM + LLM
- **problem :** multi-modal tasks keep the LLM frozen and, in practice, the effort goes into doing V+L well…
-
I wonder where the instruction/fine-tuning data is that can be used to tune the LLM?
-
Hey LAVIS team, thanks for all your work on the BLIP series and all your open source code. 🙌
I just wanted to share that I've created a small project to allow multimodal inference of InstructBLIP …
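(Not the linked project, but for anyone who just wants a quick smoke test: the stock Hugging Face transformers classes already cover basic InstructBLIP inference. A minimal sketch; the checkpoint, image path, and prompt are illustrative.)

```python
from PIL import Image
from transformers import InstructBlipProcessor, InstructBlipForConditionalGeneration

# Illustrative checkpoint; other InstructBLIP variants work the same way.
model_id = "Salesforce/instructblip-vicuna-7b"
processor = InstructBlipProcessor.from_pretrained(model_id)
model = InstructBlipForConditionalGeneration.from_pretrained(model_id)

image = Image.open("example.jpg").convert("RGB")   # placeholder image path
prompt = "Describe the image in detail."

inputs = processor(images=image, text=prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(out, skip_special_tokens=True)[0])
```

For anything beyond a smoke test you would move the model and inputs to GPU and load in half precision.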
-
Example false evaluation:
```
'question_id': '20128',
'q_type_id': 2,
'question': 'What is visible in the image besides the sea?',
'gt': 'Building',
'prediction': 'Trees',
'open_prediction': In t…
-
### Model description
LaVIN is a vision-language instruction-tuned model that is affordable to train (a few hours on 8 A100 GPUs) and performs well on ScienceQA.
I'd like to add …
-
Hi, thanks for the amazing code. When do you plan to release the code for fine-tuning V2? Also, do you plan to add Falcon fine-tuning?
Thanks
-
- [ ] [LLM-Agents-Papers/README.md at main · AGI-Edgerunners/LLM-Agents-Papers](https://github.com/AGI-Edgerunners/LLM-Agents-Papers/blob/main/README.md?plain=1)
# LLM-Agents-Papers
## :writing_hand…