AviSoori1x/seemore
From scratch implementation of a vision language model in pure PyTorch
MIT License
147 stars, 11 forks
issues
#4 start token for the inference (audreyeternal, opened 1 week ago, 0 comments)
#3 current_output gets positional embeddings added to it multiple times in LM generate()? (thuann2cats, opened 1 month ago, 0 comments)
#2 Load pretrained weight from CLIP or LLAVA (hongsamvo, closed 2 months ago, 3 comments)
#1 Loss function during pre-training the Projector (dame-cell, closed 4 months ago, 1 comment)