-
Paper : [https://arxiv.org/pdf/2406.16860](https://arxiv.org/pdf/2406.16860)
Website : [https://cambrian-mllm.github.io](https://cambrian-mllm.github.io)
Code : [https://github.com/cambrian-mllm/cam…
-
Automating GUI-based Test Oracles for Mobile Apps (MSR'24)
A Study of Using Multimodal LLMs for Non-Crash Functional Bug Detection in Android Apps (https://arxiv.org/pdf/2407.19053)
AUITestAgent: …
-
[This video tutorial](https://youtu.be/gLiCIek38t0) introduces beginners to multimodal data analysis with LLMs and Python.
Topics covered:
- Classifying text
- Analyzing images
- Transcribing au…
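To make the topics concrete, here is a minimal sketch of that kind of workflow using an OpenAI-compatible client; the model names, prompts, and file names are placeholders and not the tutorial's actual code.

```python
# Minimal sketch of LLM-based multimodal analysis; model names and files are placeholders.
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# 1) Classify text
review = "The battery died after two hours."
label = client.chat.completions.create(
    model="gpt-4o-mini",  # assumed model name
    messages=[{"role": "user", "content":
               f"Classify the sentiment of this review as positive, negative, or neutral:\n{review}"}],
).choices[0].message.content

# 2) Analyze an image (sent as a base64 data URL)
with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()
caption = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "Describe this image in one sentence."},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
    ]}],
).choices[0].message.content

# 3) Transcribe audio
with open("clip.mp3", "rb") as f:
    transcript = client.audio.transcriptions.create(model="whisper-1", file=f).text

print(label, caption, transcript, sep="\n")
```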
-
# URL
- https://arxiv.org/abs/2411.04890
# Authors
- Shuai Wang
- Weiwen Liu
- Jingxuan Chen
- Weinan Gan
- Xingshan Zeng
- Shuai Yu
- Xinlong Hao
- Kun Shao
- Yasheng Wang
- Ruimi…
-
Hi team!
Given the awesome new SpanQuestion feature recently released, I'm tempted to ask whether the same could be done for annotating regions of interest in **images**. It would be marvel…
-
### Documentation Issue Description
Why is there no information in the documentation about connecting VLM models via the SambaNova provider API?
* Code from the SambaNova page:
```python
import …
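# Hedged sketch (not the snippet from the SambaNova page): calling a vision-language
# model through SambaNova's OpenAI-compatible endpoint with the standard `openai`
# client. The base URL, environment variable, and model id below are assumptions
# and may not match the official documentation.
import os
import base64
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SAMBANOVA_API_KEY"],    # assumed env var name
    base_url="https://api.sambanova.ai/v1",     # assumed OpenAI-compatible endpoint
)

# Encode a local image as a base64 data URL, as the Chat Completions API expects.
with open("photo.jpg", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

response = client.chat.completions.create(
    model="Llama-3.2-11B-Vision-Instruct",      # assumed VLM id hosted on SambaNova Cloud
    messages=[{"role": "user", "content": [
        {"type": "text", "text": "What is shown in this image?"},
        {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
    ]}],
)
print(response.choices[0].message.content)
```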
-
# [24’ CVPR] AnyRef: Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception - Blog by rubatoyeong
Find Directions
[https://rubato-yeong.github.io/multimodal/anyref/](https://rubato-…
-
Dear Authors,
We'd like to add "GITA: Graph to Visual and Textual Integration for Vision-Language Graph Reasoning", which has been accepted at NeurIPS 2024, to this repository. [**Paper**](https:/…
-
https://huggingface.co/OpenGVLab/InternVL-Chat-V1-5
We introduce InternVL 1.5, an open-source multimodal large language model (MLLM) to bridge the capability gap between open-source and proprietary…
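For quick reference, here is a minimal inference sketch against this checkpoint, assuming the `chat()` interface shown on the model card; the single-tile preprocessing and generation settings are simplifications of the card's dynamic-tiling example rather than the official recipe.

```python
# Hedged sketch of running OpenGVLab/InternVL-Chat-V1-5 via transformers' trust_remote_code
# path. The chat() call mirrors the Hugging Face model card; the single 448x448 tile below
# replaces the card's dynamic-tiling load_image helper, so outputs may differ.
import torch
from PIL import Image
from torchvision import transforms as T
from transformers import AutoModel, AutoTokenizer

path = "OpenGVLab/InternVL-Chat-V1-5"
model = AutoModel.from_pretrained(
    path, torch_dtype=torch.bfloat16, low_cpu_mem_usage=True, trust_remote_code=True
).eval().cuda()
tokenizer = AutoTokenizer.from_pretrained(path, trust_remote_code=True, use_fast=False)

# One 448x448 tile, normalized with ImageNet statistics as in the model card.
preprocess = T.Compose([
    T.Resize((448, 448)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])
pixel_values = preprocess(Image.open("example.jpg").convert("RGB"))
pixel_values = pixel_values.unsqueeze(0).to(torch.bfloat16).cuda()

generation_config = dict(max_new_tokens=256, do_sample=False)
response = model.chat(tokenizer, pixel_values, "Describe this image in detail.", generation_config)
print(response)
```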
-
Hi,
Thanks for your efforts on such a valuable collection!
Could you please add the paper "Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate"?
M…