-
I admire your work and am very interested in following up on it. Will you make the pre-training code and training dataset public?
-
**Describe the bug**
An image in the PDF was extracted as several separate sub-image files instead of the single figure it should be.
**Files**
[mattergen.pdf](https://github.com/user-attachments/files/16831945/mat…
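The report is truncated, but for illustration, here is a minimal sketch of how this symptom typically arises, assuming extraction with PyMuPDF (an assumption; the issue does not name the extractor): a figure that appears as one image on the page may be stored as several image XObjects, and dumping XObjects one by one yields multiple sub-image files.

```python
# Minimal sketch, assuming PyMuPDF is the extractor (an assumption; the issue does not say).
# One visual figure can be composed of several embedded image XObjects, so a per-XObject
# dump like this produces multiple sub-image files for what looks like a single figure.
import fitz  # PyMuPDF

doc = fitz.open("mattergen.pdf")
for page_index, page in enumerate(doc):
    images = page.get_images(full=True)
    print(f"page {page_index}: {len(images)} embedded image object(s)")
    for img_index, img in enumerate(images):
        xref = img[0]                    # XObject reference number
        info = doc.extract_image(xref)   # raw image bytes plus metadata
        with open(f"page{page_index}_img{img_index}.{info['ext']}", "wb") as f:
            f.write(info["image"])
```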
-
## Title: Efficient Multimodal Large Language Models: A Survey
## Link: https://arxiv.org/abs/2405.10739
## Summary:
Over the past year, multimodal large language models (MLLMs) have shown remarkable performance on tasks such as visual question answering, visual understanding, and reasoning. However, their large model size and the high cost of training and inference have, in both industry and academia, …
-
### What happened?
I’m experiencing an issue when using the LiteLLM proxy to communicate with the qwen-vl-plus model for multimodal interactions. When I send an image URL directly to qwen-vl-plus, it pr…
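For context, a minimal sketch of the request shape in question, using the OpenAI-compatible endpoint that the LiteLLM proxy exposes; the base URL, API key, model alias, and image URL below are placeholders, not values from the report:

```python
# Sketch only: base_url, api_key, the "qwen-vl-plus" alias, and the image URL are
# placeholders that depend on how the LiteLLM proxy is configured.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-placeholder")

response = client.chat.completions.create(
    model="qwen-vl-plus",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image."},
                # Image passed by URL in the OpenAI-style vision message format,
                # which the proxy translates for the underlying provider.
                {"type": "image_url", "image_url": {"url": "https://example.com/sample.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```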
-
### Reminder
- [X] I have read the README and searched the existing issues.
### System Info
None
### Reproduction
None
### Expected behavior
None
### Others
Hi, Thank you for the fantastic wo…
-
With all the growing activity and focus on multimodal models, is this library restricted to tuning text-only LLMs?
Are there plans to support tuning vision models or, more generally, multimodal models?
-
Hi,
I saved the LLaVA model in 4-bit using generate.py from:
https://github.com/intel-analytics/ipex-llm/tree/main/python/llm/example/CPU/PyTorch-Models/Model/llava
model = optimize_model(model) …
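For reference, a minimal sketch of the low-bit save/load flow around optimize_model, based on my reading of the ipex-llm examples; the stand-in model, the save directory, and the exact import paths are assumptions, not values from this report:

```python
# Sketch of the ipex-llm low-bit save/load flow; the stand-in OPT model, the
# "model_4bit/" directory, and low_bit="sym_int4" are assumptions for illustration.
from transformers import AutoModelForCausalLM
from ipex_llm import optimize_model
from ipex_llm.optimize import load_low_bit

# Stand-in model so the sketch runs on its own; in the issue the model object
# comes from the linked LLaVA generate.py instead.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

model = optimize_model(model, low_bit="sym_int4")  # 4-bit weight-only quantization
model.save_low_bit("model_4bit/")                  # write the quantized weights to disk

# Later: rebuild the architecture, then attach the saved 4-bit weights.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")
model = load_low_bit(model, "model_4bit/")
```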
-
# URL
- https://arxiv.org/abs/2306.17842
# Affiliations
- Lijun Yu, N/A
- Yong Cheng, N/A
- Zhiruo Wang, N/A
- Vivek Kumar, N/A
- Wolfgang Macherey, N/A
- Yanping Huang, N/A
- David A. R…
-
**What would you like to be added/modified**:
Based on existing datasets, this issue aims to build a benchmark for domain-specific large models on KubeEdge-Ianvs. Namely, it aims to help all Edge AI a…
-
Papers that don't fit somewhere else right now but may be relevant in the future:
https://huggingface.co/papers/2409.18943
https://arxiv.org/pdf/2409.16493