liukc19 closed this issue 2 months ago
env:
I exported the pred_boxes and drew them on the images.
The red bounding boxes are the ground-truth boxes, and the blue boxes correspond to the prediction results and the model proposals respectively. It seems that the poor prediction results were caused by bad proposals.
As for why the loss still decreased during training, I believe it was because the ground-truth bounding boxes were injected.
Thanks for your help in advance.
I also tried the groma-7b-finetune weights, but got the same result. Is it possible that these errors come from this commit? #3
Maybe the real weights are at this path: "vis_encoder_path": "checkpoints/dinov2-large"?
Hi there, sorry for the late reply. I agree that the problem probably originates from model initialization. Could you please try downloading the DINOv2 checkpoint and changing lines 104-107 in groma/model/ddetr.py
from
    if pretrained_vis_encoder is not None:
        self.vis_encoder = Dinov2Model.from_pretrained(pretrained_vis_encoder)
    else:
        self.vis_encoder = Dinov2Model(config.vis_encoder_cfg)
to
    self.vis_encoder = Dinov2Model.from_pretrained({path_to_dinov2_ckpt})
which forces the model to load the pretrained DINOv2 weights for initialization.
Thank you for your suggestions, I'll try it later.
I tried the method you suggested but got the same result. Can you reproduce the result in your local environment (with the finetuned model weights)?
I found this error was caused by a misconfiguration of the hyperparameter nms_thres, which was set to 0.0 but should be 0.6. It is now fixed in the latest commit. Please feel free to have a try.
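To illustrate why a threshold of 0.0 can wipe out most proposals, here is a minimal sketch of greedy NMS in plain Python. It assumes nms_thres is an IoU threshold in the usual sense (a box is dropped when its IoU with an already kept, higher-scoring box exceeds the threshold); Groma's actual implementation may differ, and the helper names and boxes below are hypothetical.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, iou_thres):
    """Return indices of kept boxes, highest score first.

    A box is kept only if its IoU with every already kept box
    is at most iou_thres (assumed semantics, not Groma's code).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thres for j in keep):
            keep.append(i)
    return keep


# Hypothetical proposals: the second box barely touches the first,
# the third is completely disjoint from both.
boxes = [
    (10, 10, 110, 110),
    (100, 100, 200, 200),
    (300, 300, 400, 400),
]
scores = [0.9, 0.8, 0.7]

kept_strict = greedy_nms(boxes, scores, 0.0)   # any overlap is suppressed
kept_default = greedy_nms(boxes, scores, 0.6)  # only heavy overlaps are suppressed
print(kept_strict, kept_default)
```

Under these assumptions, a threshold of 0.0 discards every proposal that overlaps a higher-scoring one even slightly, so in a crowded image only a handful of disjoint proposals survive, which would be consistent with the bad proposals seen in the visualizations.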
I tried to finetune Groma on the REC datasets only (RefCOCO/+/g), but got bad results on refcoco_val (with iou@0.5 accu and m_iou about 0.5). I also tried to evaluate Groma on refcoco_val with the groma-7b-pretrain weights and got the following result. Is this result normal?
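For reference, the two metrics mentioned above can be sketched as follows. This is a minimal illustration that assumes one predicted box per referring expression, boxes in (x1, y1, x2, y2) format, and a hit counted when IoU is at least 0.5; the exact convention in Groma's evaluation script may differ, and the function names and sample boxes are hypothetical.

```python
def box_iou(pred, gt):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(pred[0], gt[0]), max(pred[1], gt[1])
    ix2, iy2 = min(pred[2], gt[2]), min(pred[3], gt[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_p = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_g = (gt[2] - gt[0]) * (gt[3] - gt[1])
    union = area_p + area_g - inter
    return inter / union if union > 0 else 0.0


def rec_metrics(preds, gts, thres=0.5):
    """Return (accuracy at the IoU threshold, mean IoU) over paired boxes."""
    ious = [box_iou(p, g) for p, g in zip(preds, gts)]
    acc = sum(v >= thres for v in ious) / len(ious)
    miou = sum(ious) / len(ious)
    return acc, miou


# Hypothetical sample: one perfect prediction, one shifted by half a box width.
preds = [(0, 0, 10, 10), (0, 0, 10, 10)]
gts = [(0, 0, 10, 10), (5, 0, 15, 10)]

acc, miou = rec_metrics(preds, gts)
print(f"acc@0.5={acc:.3f} mIoU={miou:.3f}")
```

Computing both numbers on a small set of exported pred_boxes is a quick way to check whether low scores come from the evaluation script or from the predictions themselves.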