visual-language-models Search Results

1000+ results
for visual-language-models


yyf17/NavigationProject #8

CVPR 2022

CVPR 2022 # Format * **Paper Title** *Author(s)* CVPR, 2022. [[Paper]](link) [[Code]](link) [[Website]](link) To be filled in: 1) Paper Title 2) Author(s) 3) the three "link" entries 4) leave one blank line between two papers # agent Meta Ag…

yyf17 updated 2 years ago
1
LAION-AI/CLIP_benchmark #97

Add compositionality benchmarks

- CREPE: https://openaccess.thecvf.com/content/CVPR2023/papers/Ma_CREPE_Can_Vision-Language_Foundation_Models_Reason_Compositionally_CVPR_2023_paper.pdf, https://github.com/RAIVNLab/CREPE - ARO https…

mehdidc updated 4 weeks ago
8
withinmiaov/A-Survey-on-Mixture-of-Experts #5

New MoE-related work in application

GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation https://arxiv.org/abs/2410.11841

csyfjiang updated 3 weeks ago
1
huggingface/transformers #17224

ALBEF: Align Before Fuse

### Model description Align Before Fuse (ALBEF) is a vision-language (VL) model that showed competitive results in numerous VL tasks such as image-text retrieval, visual question answering, visual …

ggoggam updated 5 months ago
7
huggingface/transformers #32435

[i18n-ar] Translating docs to Arabic (العربية)

Hi! !مرحبا! السلام عليكم Let's bring the documentation to all the Arabic-speaking community 🌏 (currently 0 out of 267 complete) Would you want to translate? Please follow the 🤗 [TRANSLATING guid…

AhmedAlmaghz updated 1 month ago
2
huggingface/transformers #30638

Add Prismatic VLMs to Transformers

### Model description Hi! I'm the author of ["Prismatic VLMs"](https://github.com/TRI-ML/prismatic-vlms), our upcoming ICML paper that introduces and ablates design choices of visually-conditioned …

siddk updated 6 months ago
5
Cruiz102/Advesarial_Attacks_Tests #1

Add optimizations to the training loop.

To enable efficient training on GPUs and scale our repository for models with millions to billions of parameters—essential for working with large visual language models—we must implement optimization …

Cruiz102 updated 8 months ago
1
sign-language-processing/spoken-to-signed-translation #26

March 2024 Progress

We now use the improved [`pose-to-video`](https://github.com/sign-language-processing/pose-to-video) based on diffusion models. We start with a paragraph in German, translate it to German Sign Lan…

AmitMY updated 2 months ago
1
amir9979/reading_list #1392

clinical question answering "large language models" - new r…

*Sent by Google Scholar Alerts (scholaralerts-noreply@google.com). Created by [fire](https://fire.fundersclub.com/).* --- ### ### [PDF] [Evaluating **large language models** in **medical** applicat…

fire-bot updated 3 months ago
1
bigshanedogg/survey #21

[FROZEN] Multimodal Few-Shot Learning with Frozen Language M…

## Problem statement 1. Despite the impressive capabilities of large-scale language models, their potential for modalities other than text has not been fully demonstrated. 2. Aligning parameters of vi…

bigshanedogg updated 2 years ago
1
