-
Hi, it looks like you updated the API a little bit in this commit:
https://github.com/guinmoon/llmfarm_core.swift/commit/e4e8aa7617e2e86af434677cc4196462a0005ea9
Would you mind giving an updated w…
-
### Brief Description
Obviously the end-game here is multimodal LLMs rather than a cascaded approach, but we are not quite there yet.
There are, however, interesting options that are multimoda…
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I am referring to this example: https://www.llamaindex.ai/blog/multimodal-rag-for-advanc…
-
Do you have any plans to support multimodal LLMs, such as MiniGPT-4/MiniGPT v2 (https://github.com/Vision-CAIR/MiniGPT-4/) and LLaVA (https://github.com/haotian-liu/LLaVA/)? That would be a significan…
-
### Feature request
Is it possible to run multimodal LLMs like Qwen VL or LLaVA 1.5 using openllm?
### Motivation
_No response_
### Other
_No response_
-
In articles with images, I do not want the LLM to recognize the images within; alternatively, I want an LLM without multimodal capabilities to read just the text of the notes directly, without triggering the vision mod…
-
For multimodal models, we usually need to combine the visual features with the text `input_embeds` to form the final `input_embeds`, which are then sent to the model for inference.
Currently, this combination method may be different …
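As a concrete illustration, here is a minimal PyTorch sketch of one common combination scheme (the LLaVA-style approach of scattering projected vision features into `<image>` placeholder positions); the function and tensor names here are hypothetical, and real implementations differ per model:

```python
import torch

def merge_vision_and_text(
    input_embeds: torch.Tensor,      # (seq_len, hidden) embedded text tokens
    image_features: torch.Tensor,    # (num_patches, hidden) projected vision features
    image_token_mask: torch.Tensor,  # (seq_len,) bool, True at <image> placeholder slots
) -> torch.Tensor:
    """Return the final input_embeds: text embeddings with the
    placeholder positions overwritten by the vision features."""
    assert image_token_mask.sum().item() == image_features.shape[0], \
        "number of <image> placeholders must match number of vision features"
    merged = input_embeds.clone()
    merged[image_token_mask] = image_features.to(merged.dtype)
    return merged
```

Models differ in where the placeholders sit and in how many feature vectors each image expands to, which is exactly why this combination step is hard to standardize across architectures.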
-
### Feature request
Adding the ability to pass many images per prompt to PaliGemma. This would mean, among other changes, changing the argument type of `images` on PaliGemmaProcessor to allow array[…
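For reference, a sketch of what this might look like from the caller's side; the checkpoint name and image paths are placeholders, and the multi-image call is purely hypothetical (the requested behavior, not the current API):

```python
from PIL import Image
from transformers import AutoProcessor

processor = AutoProcessor.from_pretrained("google/paligemma-3b-mix-224")
image_a = Image.open("photo_a.jpg")
image_b = Image.open("photo_b.jpg")

# Current behavior: one image per prompt.
inputs = processor(text="caption en", images=image_a, return_tensors="pt")

# Requested behavior (hypothetical, not supported at the time of the request):
# `images` would also accept a list of images belonging to a single prompt.
# inputs = processor(
#     text="compare en",
#     images=[image_a, image_b],
#     return_tensors="pt",
# )
```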
-
The following article might also be a great read on whether LLMs understand tabular data: ["Tables as Images? Exploring the Strengths and Limitations of LLMs on Multimodal Represe…
-
I am using the llava-onevision model (https://llava-vl.github.io/blog/2024-08-05-llava-onevision/), which can accept two images as input so that you can then ask questions about both of them. Does the current …
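If the question concerns the Hugging Face transformers implementation (an assumption; the original question may target a different serving stack), a two-image prompt roughly looks like the sketch below, with the checkpoint name and image paths as placeholders:

```python
from PIL import Image
from transformers import AutoProcessor, LlavaOnevisionForConditionalGeneration

model_id = "llava-hf/llava-onevision-qwen2-7b-ov-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaOnevisionForConditionalGeneration.from_pretrained(model_id, device_map="auto")

# One user turn containing two image slots followed by the question.
conversation = [{
    "role": "user",
    "content": [
        {"type": "image"},
        {"type": "image"},
        {"type": "text", "text": "What is different between these two images?"},
    ],
}]
prompt = processor.apply_chat_template(conversation, add_generation_prompt=True)

image_a = Image.open("left.jpg")
image_b = Image.open("right.jpg")
inputs = processor(images=[image_a, image_b], text=prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(output[0], skip_special_tokens=True))
```

Whether a given inference framework wires multiple images through this path is framework-dependent, which is presumably what the question is asking.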