FoundationVision Groma issues - Githubissues

FoundationVision / Groma

[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization

https://groma-mllm.github.io/

Apache License 2.0

479 stars 56 forks

issues


Why is the number of tokens in the LLM dynamic?

#18 liuting20 opened 5 days ago
1
Testing fails with error: local variable 'sentencepiece_model_pb2' referenced before assignment

#17 ovjust opened 6 days ago
6
About pretrain checkpoint

#16 xuliu-cyber closed 3 weeks ago
1
Inference error with the 8-bit and 4-bit quantized versions

#15 zhangron013 closed 1 month ago
2
Batch size setting in the evaluation process

#14 rongfu-dsb closed 4 weeks ago
4
Tested some images and felt that the grounding ability was weakened a lot compared to the original DINO?

#13 TiantZhang closed 1 month ago
1
About grounding output

#12 nguyenquivinhquang closed 1 month ago
1
Clarify the bounding box format

#11 nguyenquivinhquang closed 3 weeks ago
2
Could you share the prompts used to instruct GPT-4V to create Groma Instruct?

#10 Yang-bug-star closed 1 month ago
1
4 bit model

#9 vcadillog closed 1 month ago
1
Is there a smaller model? One usable with 24 GB of VRAM

#8 traddo closed 1 month ago
5
Evaluation results significantly different

#7 xiaoyazhu closed 1 month ago
1
model weight problem

#6 liukc19 closed 1 month ago
7
No groma conversation template

#5 Gabesarch closed 1 month ago
1
Update README.md

#4 eltociear closed 2 months ago
0
unable to load local weight

#3 liukc19 closed 2 months ago
5
Finetuning and dataset formatting guidelines

#2 hinsonan closed 1 month ago
2
System requirements for running the model?

#1 learnermaxRL closed 2 months ago
1