Closed (Stanleyluuuu closed this issue 2 months ago)
This model has not been trained for grounding, so it cannot reliably output bounding boxes (bbx) for grounding tasks.
One option is to fine-tune the model on a labeled dataset that includes bbx annotations, like the one you mentioned, to improve its grounding capability. However, this process can be complex, and preparing the dataset in particular is a significant challenge.
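To give a rough idea of the dataset preparation step, here is a minimal sketch that turns one pixel-space annotation into a prompt/response pair. The record layout (`image`, `prompt`, `response`), the helper name `to_grounding_sample`, the answer wording, and the normalization of coordinates to a 0-1000 range are all assumptions made for illustration; they are not a format confirmed for GLM-4v-9B, so adapt them to whatever your fine-tuning script expects.

```python
import json


def to_grounding_sample(image_path, question, box, width, height, scale=1000):
    """Build one hypothetical training record from a pixel-space box.

    box is (x1, y1, x2, y2) in pixels; width/height are the image size.
    """
    x1, y1, x2, y2 = box
    # Reject annotations that are inverted or fall outside the image.
    if not (0 <= x1 < x2 <= width and 0 <= y1 < y2 <= height):
        raise ValueError(f"box {box} does not fit a {width}x{height} image")

    # Assumption: normalize to [0, scale] so the target text does not
    # depend on the resolution of each individual image.
    norm = [
        round(x1 / width * scale),
        round(y1 / height * scale),
        round(x2 / width * scale),
        round(y2 / height * scale),
    ]
    answer = "Yes. ({}, {}, {}, {})".format(*norm)
    return {"image": image_path, "prompt": question, "response": answer}


sample = to_grounding_sample(
    "fall_001.jpg",  # hypothetical file name
    "Is there any person fall down? Give me the bounding box in "
    "(x1, y1, x2, y2) format if exists.",
    (128, 240, 512, 432),
    width=640,
    height=480,
)
print(json.dumps(sample))
```

Keeping the prompt identical to the one used at inference time, and validating every box before it is written out, should catch many of the annotation errors that otherwise only show up after training.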
OK, I understand. Thanks for the clear explanation.
System Info
Hi,
I'm using GLM-4v-9B to develop a feature that allows users to input an image and receive the corresponding bounding box. For example, the prompt might be: "Is there any person fall down? Give me the bounding box in (x1, y1, x2, y2) format if exists."
However, I noticed that the bounding box does not fully enclose the person who has fallen. Could you provide any guidance or instructions regarding the bounding box output?
Who can help?
No response
Information
Reproduction
Expected behavior
I expect to understand how to guide the model to output bounding box coordinates in the format I want.
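Independently of how well the model localizes the person, the reply can at least be checked against the requested format on the client side. The sketch below is only an illustration: `parse_box` is a hypothetical helper, and it assumes the model answers with pixel coordinates written as `(x1, y1, x2, y2)` somewhere in free text. It does not improve the accuracy of the box itself.

```python
import re

# One number: optional sign, digits, optional decimal part.
_NUM = r"(-?\d+(?:\.\d+)?)"
# Four comma-separated numbers inside parentheses.
BOX_RE = re.compile(
    r"\(\s*" + r"\s*,\s*".join([_NUM] * 4) + r"\s*\)"
)


def parse_box(reply, width, height):
    """Return (x1, y1, x2, y2) clamped to the image, or None if absent."""
    match = BOX_RE.search(reply)
    if match is None:
        return None  # the model did not produce a box in this format

    x1, y1, x2, y2 = (float(g) for g in match.groups())
    # Put the corners in top-left / bottom-right order.
    x1, x2 = sorted((x1, x2))
    y1, y2 = sorted((y1, y2))
    # Clamp to the image so downstream drawing code never goes out of range.
    x1, x2 = max(0.0, min(x1, width)), max(0.0, min(x2, width))
    y1, y2 = max(0.0, min(y1, height)), max(0.0, min(y2, height))

    # A box with no area after clamping is treated as missing.
    if x2 <= x1 or y2 <= y1:
        return None
    return (x1, y1, x2, y2)


print(parse_box("Yes, at (600, 40, 100, 700).", 640, 480))
print(parse_box("No person has fallen.", 640, 480))
```

Replies that fail this check can be retried or logged, which also gives a quick way to measure how often the prompt produces a usable box.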