Dmmm1997 / SimVG

[NeurIPS2024] - SimVG: A Simple Framework for Visual Grounding with Decoupled Multi-modal Fusion
https://arxiv.org/abs/2409.17531
MIT License

Errors occurred during training GrefCOCO dataset #4

Status: Closed (moqj closed this issue 1 week ago)

moqj commented 1 week ago

I encountered an issue similar to #3, and my solution was to modify the loading.py file: https://github.com/Dmmm1997/SimVG/blob/b7319ca5f24cd3f46cbe566010288f83b8c9268b/simvg/datasets/pipelines/loading.py#L157-L158

def _load_expression_tokenize_beit3(self, results):
    if "GRefCOCO" in self.dataset:
        expressions = [results["ann"]["expressions"]['raw']]
    else:
        expressions = results["ann"]["expressions"]

After that change, however, I ran into several other issues.

  File "/home/user/workspace/SimVG/simvg/datasets/pipelines/loading.py", line 237, in _load_bboxes
    copy.deepcopy(results["ann"]["bbox"])
KeyError: 'bbox'

I modified that as well: https://github.com/Dmmm1997/SimVG/blob/b7319ca5f24cd3f46cbe566010288f83b8c9268b/simvg/datasets/pipelines/loading.py#L224-L226

gt_bboxes = copy.deepcopy([item['bbox'] for item in results["ann"]["annotations"]][self.random_ind])

But the error persists:

  File "/home/user/workspace/SimVG/simvg/datasets/pipelines/loading.py", line 286, in __call__
    results = self._load_bboxes(results)
  File "/home/gpu18/workspace/SimVG/simvg/datasets/pipelines/loading.py", line 239, in _load_bboxes
    gt_bbox[2] = gt_bbox[0] + gt_bbox[2]
TypeError: 'float' object is not subscriptable

The cause might be that some images have no corresponding bboxes, which triggers the error. Printing the annotations gives:

result ann: [None]
result ann: [[219.85, 14.45, 319.01, 69.5]]
result ann: [[291.23, 88.74, 188.77, 269.45], [1.61, 88.74, 172.65, 266.23]]
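The three shapes printed above (no target, one box, several boxes) can be normalized before any coordinate conversion. The following is a minimal sketch, not SimVG's actual code: the helper name `to_xyxy_boxes` is hypothetical, and it assumes boxes are stored as `[x, y, w, h]`, which is what the `gt_bbox[2] = gt_bbox[0] + gt_bbox[2]` line in the traceback suggests. It also wraps a single flat box into a list, since iterating directly over a flat box yields floats and would produce the `'float' object is not subscriptable` error.

```python
from typing import List, Optional, Sequence

Box = List[float]


def to_xyxy_boxes(raw: Optional[Sequence]) -> List[Box]:
    """Normalize a raw bbox annotation into a list of [x1, y1, x2, y2] boxes.

    Hypothetical helper. Accepts None or [None] (no target), a single flat
    [x, y, w, h] box, or a list of [x, y, w, h] boxes.
    """
    if raw is None:
        return []
    # A flat box is four plain numbers; wrap it so the loop below always
    # iterates over boxes rather than over individual coordinates.
    if len(raw) == 4 and all(isinstance(v, (int, float)) for v in raw):
        raw = [raw]
    boxes: List[Box] = []
    for box in raw:
        if box is None:  # no-target entry, e.g. [None]
            continue
        x, y, w, h = box
        # Convert width/height to the bottom-right corner (assumes xywh input).
        boxes.append([x, y, x + w, y + h])
    return boxes


# The three annotation shapes from the debug output above.
for ann in ([None],
            [[219.85, 14.45, 319.01, 69.5]],
            [[291.23, 88.74, 188.77, 269.45], [1.61, 88.74, 172.65, 266.23]]):
    print(len(to_xyxy_boxes(ann)), "box(es)")
```

How an empty result should be treated downstream (skipping the sample, or training it as a no-target example) depends on the training code and is not covered by this sketch.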

@Dmmm1997 Could you please share the code for training on the GrefCOCO dataset?

Dmmm1997 commented 1 week ago

I have updated the GrefCOCO dataset link.

https://seunic-my.sharepoint.cn/:f:/g/personal/230238525_seu_edu_cn/EiX51qGWa9BBlt3fVRwgPhsBUxOIZ-yW3Hm7VXOQ3c2Ipw?e=M12MM6

Please re-download the GrefCOCO dataset and pull the newest version of the code.