TRI-ML/prismatic-vlms
A flexible and efficient codebase for training visually-conditioned language models (VLMs)
MIT License
325 stars
86 forks
issues
One question about running inference
#40
Li-private
opened
1 day ago
0
Multinode training
#39
budzianowski
opened
2 weeks ago
0
move barrier to before saving checkpoint to reduce timeouts when saving
#38
jensen-gao
opened
2 weeks ago
0
Revert "OpenVLA Release"
#37
siddk
closed
2 weeks ago
0
OpenVLA Release
#36
siddk
closed
2 weeks ago
0
`clip_grad_norm_` in fsdp
#35
zeyuanyin
closed
1 month ago
1
`unpack_tuple()` is no longer correct with timm v1.0.3
#34
yukw777
opened
1 month ago
2
Questions about mixed-precision training for visual backbones
#33
tayton42
closed
1 month ago
2
The reproduction issue of DINOv2 + SigLIP 384px (Naive Resize)
#32
tayton42
closed
1 month ago
12
Improved Documentation for Launching Training Runs
#31
RylanSchaeffer
opened
1 month ago
0
question about training data
#30
fyting
closed
1 month ago
1
question about vit's fsdp wrapping policy
#29
lukaemon
closed
1 month ago
2
Adding more VLMs based on Llama 3 Instruct, Gemma 3 Instruct, Mistral v0.2 Instruct, Phi 3 Instruct
#28
RylanSchaeffer
closed
1 month ago
4
Add New LLM Backbones
#27
siddk
closed
1 month ago
0
Cleanup and Lint, add 224px Prism Models
#26
siddk
closed
1 month ago
0
Need help when incorporating LLama3
#25
Hannibal046
closed
1 month ago
1
Update README.md
#24
SamuelSchmidgall
closed
1 month ago
1
Fix for error: "File setup.py not found" when running "pip install -e ."
#23
SamuelSchmidgall
closed
2 months ago
0
Why second to last layer Vision Transformer features?
#22
gorjanradevski
closed
2 months ago
1
Can I training your code with V100 GPU
#21
lijiannuist
closed
2 months ago
1
How to finetune starting from a Prismatic VLM checkpoint
#20
djghosh13
closed
1 month ago
1
Feature request
#19
lucasjinreal
closed
1 month ago
1
Love your code!
#18
Hannibal046
closed
2 months ago
1
Multi-dataset support issue.
#17
tayton42
closed
2 months ago
1
Llava / Prismatic with LoRA
#16
gorjanradevski
opened
2 months ago
6
training error
#15
tayton42
closed
2 months ago
2
Inconsistent API for Vision Backbones?
#14
RylanSchaeffer
closed
2 months ago
13
Does this software support CogVLM?
#13
PhilipAmadasun
closed
2 months ago
1
Quantization support
#12
show981111
opened
2 months ago
2
Any plan to support "Dynamic High Resolution" proposed in LLaVA v.16?
#11
yushuinanrong
closed
2 months ago
1
Do you have plan to add SAM as a visual encoder?
#10
StarCycle
opened
3 months ago
5
Why do you use the dinov2 from timm, instead of the facebook version or the Huggingface transformers implementation?
#9
StarCycle
closed
3 months ago
1
Training a one-stage model
#8
shikhar-srivastava
closed
3 months ago
3
Incorrect type annotation & slightly unintuitive functionality of `available_model_names()`
#7
RylanSchaeffer
closed
1 month ago
1
Adding support for other base LLMs
#6
shikhar-srivastava
closed
1 month ago
7
evaluation scripts
#5
TobiasLee
closed
4 months ago
1
Have you tried only dinov2 as vision encoder?
#4
LinB203
closed
4 months ago
1
Does it support input format of multiple images + text?
#3
swj0419
closed
4 months ago
2
pretrain.py won't run with arguments
#2
rpgrainger-ai
closed
4 months ago
2
How to extract the last hidden outputs from the models?
#1
swj0419
closed
4 months ago
3