multimodal Search Results

1000+ results for multimodal

Best match


derdide/telegram_multi_llm_bot #3

Multimodal

Multimodal support is not active. The logs show that when a file or an image is added to the prompt, the prompt (including the file) is not even passed to the APIs.

derdide updated 3 months ago
1
showlab/Show-o #38

About multimodal sequence input

Hello, I am very interested in your great work. I see in the code that the sequence of the image generation input is basically text tokens before image tokens, what about reversing the order when gene…

tulvgengenr updated 1 month ago
4
MaartenGr/BERTopic #1918

multimodal problem

I am a student from China, and I really appreciate your project. I am now trying to do some interesting work, but I have encountered some problems. My idea is to perform topic modeling using product i…

binaryinspace updated 1 month ago
10
showlab/Show-o #27

Does show-o support multimodal-in multimodal-out?

As the title asks: does it support multimodal-in, multimodal-out (with multiple images)?

URRealHero updated 2 months ago
6
facebookresearch/multimodal #530

Probable FLAVA multimodal encoder bug

### 🚀 The feature, motivation and pitch In flava multimodal encoder, why don't we pass an attention mask to mask out '[PAD]' embeddings coming from text encoder? Is this a bug or intentional? https…

rishabhm12 updated 1 month ago
1
guinmoon/llmfarm_core.swift #25

Multimodal (Visual) LLM

Hi it looks like you updated the API a little bit in this commit https://github.com/guinmoon/llmfarm_core.swift/commit/e4e8aa7617e2e86af434677cc4196462a0005ea9 Would you mind giving an updated w…

dlazares updated 2 weeks ago
2
run-llama/create-llama #371

Support Multimodal RAG from LlamaCloud

see https://www.llamaindex.ai/blog/multimodal-rag-in-llamacloud

marcusschiesser updated 1 month ago
1
jepler/chap #42

Add multimodal support, even if rudimentary

For instance, it would be nice if you could `chap ask --attach moon.jpg "What is in this photo"`.

jepler updated 1 month ago
2
embeddings-benchmark/mteb #1249

Massive Multimodal Extension of MTEB

This issue is an overview of tasks to add for a massive multimodal extension of MTEB. The modalities are: - T=Text - I=Image - A=Audio - V=Video without audio i.e. just multiple images Below is…

Muennighoff updated 1 month ago
1
All-Hands-AI/OpenHands #4569

[Eval]: Add SWE-Bench multimodal to OpenHands Eval Harness

**What problem or use case are you trying to solve?** https://www.swebench.com/multimodal.html **Describe the UX of the solution you'd like** **Do you have thoughts on the technical implement…

xingyaoww updated 5 days ago
1

