-
### Question Validation
- [X] I have searched both the documentation and Discord for an answer.
### Question
I want to build a multimodal chat app using Streamlit and LlamaIndex workflows, wherein user…
-
### Is your feature request related to a problem? Please describe.
In version 20.11.0, ALVR added multimodal tracking support, allowing fingers to be tracked while holding the controllers (both fing…
-
### 🚀 The feature, motivation and pitch
vllm >= 0.6 doesn't support out-of-tree defined multimodal models:
```
[rank0]: File "/usr/local/lib/python3.8/dist-packages/vllm/entrypoints/llm.py", line 177…
-
### Brief Description
Obviously the end-game here is multimodal LLMs instead of a cascaded approach, but we are not quite there yet.
There are, however, interesting options that are multimoda…
-
Currently it is only possible to pass text to the enforced model. Some tasks require multimodal models, for example processing images. It would be awesome if it were possible to enforce a multimodal model to use a…
-
### What do you need?
Add support for input of images, audio, video
-
# [24’ CVPR] AnyRef: Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception - Blog by rubatoyeong
Find Directions
[https://rubato-yeong.github.io/multimodal/anyref/](https://rubato-…
-
## Overview
When humans explain something, they draw on information from all five senses to form an abstract mental image of it and describe that.
With multimodal input, this becomes possible for a model as well.
The output, too, turns out to behave multimodally.
The demo illustrates all of this.
## Key points
- It can supplement an explanation with information that text alone cannot convey.
- Verbalized content can be turned into images, audio, and…
-
Hi, guys,
since you have already implemented vendor_multimodal_api_key and vendor_multimodal_model_name, would you please add a new parameter, vendor_multimodal_api_base?
This would be very useful for those who…
-