FoundationVision
/
Groma
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization
https://groma-mllm.github.io/
Apache License 2.0
479
stars
56
forks
issues
Why is the number of tokens in the LLM dynamic?
#18
liuting20
opened
5 days ago
1
Error when testing: local variable 'sentencepiece_model_pb2' referenced before assignment
#17
ovjust
opened
6 days ago
6
About pretrain checkpoint
#16
xuliu-cyber
closed
3 weeks ago
1
Inference error with the 8-bit and 4-bit quantized versions
#15
zhangron013
closed
1 month ago
2
Batch size setting in the evaluation process
#14
rongfu-dsb
closed
4 weeks ago
4
Tested some images; grounding ability seems much weaker than the original DINO?
#13
TiantZhang
closed
1 month ago
1
About grounding output
#12
nguyenquivinhquang
closed
1 month ago
1
Clarify the bounding box format
#11
nguyenquivinhquang
closed
3 weeks ago
2
Could you share the prompts used to instruct GPT-4V to create Groma Instruct?
#10
Yang-bug-star
closed
1 month ago
1
4 bit model
#9
vcadillog
closed
1 month ago
1
Is there a smaller model that can run with 24 GB of VRAM?
#8
traddo
closed
1 month ago
5
evaluation results significantly different
#7
xiaoyazhu
closed
1 month ago
1
model weight problem
#6
liukc19
closed
1 month ago
7
No groma conversation template
#5
Gabesarch
closed
1 month ago
1
Update README.md
#4
eltociear
closed
2 months ago
0
unable to load local weight
#3
liukc19
closed
2 months ago
5
Finetuning and dataset formatting guidelines
#2
hinsonan
closed
1 month ago
2
System requirements for running the model?
#1
learnermaxRL
closed
2 months ago
1