-
**Project description**
Nexa SDK is a comprehensive toolkit for ONNX and GGML models. It supports text generation, image generation, vision-language models (VLM), auto-speech-recognition…
-
After resolving my LLM-as-assistant issue, I am now having trouble using LLM-vision. I have the models suggested on GitHub, as seen below, but every single one of them returns [ERROR] Model doe…
-
### Feature request
Enable PPOTrainer and DPOTrainer to work with audio-language models like Qwen2Audio. The architecture of this model is identical to that of vision-language models like LLaVA, consisting of…
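For context, a minimal sketch (assuming a transformers release that ships Qwen2Audio; the checkpoint name is taken from the public Qwen2-Audio release) that lists the model's top-level modules to show the LLaVA-like split into an audio encoder, a projector, and a language-model backbone:

```python
# Minimal sketch: print Qwen2Audio's top-level submodules to illustrate that its
# layout (audio encoder + projector + LM backbone) mirrors LLaVA-style VLMs.
from transformers import Qwen2AudioForConditionalGeneration

model = Qwen2AudioForConditionalGeneration.from_pretrained(
    "Qwen/Qwen2-Audio-7B-Instruct"  # public checkpoint; any Qwen2Audio weights work here
)
for name, module in model.named_children():
    print(f"{name}: {type(module).__name__}")
```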
-
# Paper Review: Unveiling Encoder-Free Vision-Language Models – Andrey Lukyanenko
My review of the paper Unveiling Encoder-Free Vision-Language Models
https://andlukyane.com/blog/paper-review-eve…
-
Hello, can you please open-source the code for "Unified Visual Relationship Detection with Vision and Language Models"? I am very interested in your training method for non-relationship data.
-
### Motivation
The model we use in our business (InternVL2-26B) outputs very few tokens (1-2) after prompt optimization, so inference consists almost entirely of the prefill stage. Therefore, we hope to use W8A8 qu…
-
### Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to [Discussions](https://github.com/langgenius/dify/discussions/categories/general).
- [X] I have s…
-
Instead of relying on `HF`, can we point to the `gguf` file directly? I tried converting the code using Claude, but it's failing. Any chance you can give some pointers on using a local file rather than going via HF?
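As a starting point, here is a minimal sketch of loading a local GGUF file with llama-cpp-python instead of going through HF; the model path and prompt are placeholders, and it assumes llama.cpp is an acceptable backend for the converted code:

```python
# Minimal sketch: run a local GGUF file with llama-cpp-python, bypassing HF downloads.
# The path and prompt are placeholders; tune n_ctx (and n_gpu_layers) to the hardware.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/model-q4_k_m.gguf",  # placeholder path to a local GGUF file
    n_ctx=4096,
)
result = llm("Summarize what a GGUF file is in one sentence.", max_tokens=64)
print(result["choices"][0]["text"])
```

If staying in the transformers ecosystem is preferable, recent versions can also read a GGUF directly via the `gguf_file` argument to `from_pretrained`.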
-
1. Read up on GPT-4o vision accuracy, precision, and recall benchmarks to use as ceiling baselines. Document the relevant research papers in our paper.
-
Hi friends!
I'd like to share our recent project embodied-agents: https://github.com/mbodiai/embodied-agents, which makes it easy to integrate large multi-modal models into existing robot stacks wi…