visual-language-models Search Results

modelscope/ms-swift #2226

Could you put attention_mask back for visual language models

**Describe the feature** Padding can be done inside class Template, but Qwen2VLTemplateMixin and InternLMXComposer2Template only have im_mask and no attention_mask for input_ids (in the padded case). Could the padding attention_mask be put back? http…

YerongLi updated 1 month ago

liuting20/MustDrop #1

can't wait to try!

Nice work! Can't wait to try your work, I wonder when the code will be released! By the way, I don't know if you know this paper, "BOOSTING MULTIMODAL LARGE LANGUAGE MODELS WITH VISUAL TOKENS WITH…

yuanrr updated 5 days ago

BradyFU/Awesome-Multimodal-Large-Language-Models #184

Add SALMONN, video-SALMONN, video-SALMONN 2

TCL606 updated 3 weeks ago

QuivrHQ/quivr #3452

Enable use of embeddings from Vision Language models

Currently, we use text embeddings. This is fine for textual documents, but it presents obvious drawbacks for documents containing non-textual content (images, graphs, schemes, …). An alternative is…

jacopo-chevallard updated 2 weeks ago

CopilotKit/CopilotKit #241

gpt-4-vision-preview integration

**Is your feature request related to a problem? Please describe.** No **Describe the solution you'd like** Integrate gpt-4-vision and more generally visual language models using LangChain, by fir…

TimeLordRaps updated 2 weeks ago

UppuluriKalyani/ML-Nexus #714

Feature Request: Model Evaluation and Benchmarking System

I propose adding a Model Evaluation and Benchmarking System to ML Nexus to help users assess their model performance on standardized datasets and compare it against benchmarked scores. This feature wo…

snehas-05 updated 3 weeks ago

yoheikikuta/paper-reading #57

[2103.00020] Learning Transferable Visual Models From Natura…

## Paper link https://arxiv.org/abs/2103.00020 ## Publication date (yyyy/mm/dd) 2021/01/05 ## Summary The paper on CLIP (Contrastive Language-Image Pre-training), released by OpenAI and also used for reranking in DALL·E. From text on the Web, special a…

yoheikikuta updated 3 months ago

microsoft/powerbi-client-react #116

Cant fit visual in container

I'm unable to fit a visual embed to 100% of the width of the parent container. These are the basic settings: ``` export const VISUAL_SETTINGS: models.ISettings = { localeSettings: { language: "en-…

alan9518 updated 1 month ago

huggingface/transformers #33905

Implement LlamaGen for Image Generation

### Feature request Add support for LlamaGen, an autoregressive image generation model, to the Transformers library. LlamaGen applies the next-token prediction paradigm of large language models to vi…

ighoshsubho updated 3 weeks ago

irthomasthomas/undecidability #892

Vespa 🤝 ColPali: Efficient Document Retrieval with Vision La…

- [ ] [Vespa 🤝 ColPali: Efficient Document Retrieval with Vision Language Models — pyvespa documentation](https://pyvespa.readthedocs.io/en/latest/examples/colpali-document-retrieval-vision-language-m…

ShellLM updated 3 months ago

1000+ results for visual-language-models
