TIGER-AI-Lab / VLM2Vec
This repo contains the code and data for "VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks"
https://tiger-ai-lab.github.io/VLM2Vec/
Apache License 2.0
80 stars · 1 fork
Issues
How to extend this to more modalities? #11 · by nikhilbyte · opened 1 day ago · 2 comments
Code does not support the training of Llama-3.2-Vision #10 · by haon-chen · opened 1 day ago · 1 comment
The dataset is missing images #9 · by WGowi · opened 4 days ago · 1 comment
How to train the model with my own data? #8 · by B-201 · closed 1 week ago · 6 comments
Locally load model VLM2Vec-Full #7 · by VincentVanNF · opened 1 week ago · 5 comments
Hidden size mismatch #6 · by marcobellagente93 · closed 2 weeks ago · 2 comments
Could this model provide embeddings for videos? #5 · by lly0571 · opened 2 weeks ago · 1 comment
Which layer's output is used for contrastive training? #4 · by VincentVanNF · closed 2 weeks ago · 6 comments
Question about the configuration #3 · by URRealHero · closed 1 month ago · 4 comments
The results are better than those in the paper #2 · by B-201 · closed 2 weeks ago · 3 comments
Use Qwen2-VL as backbone #1 · by VoVAllen · closed 1 month ago · 1 comment