Open · zytx121 opened this issue 8 months ago
Hi @zytx121, we further finetuned our model on just the grounding part of the dataset for some more steps.
Hi, thanks for your great work. Would you mind also open-sourcing the specific grounding dataset that you used in stage 2? Thanks in advance.
You can filter the data from the geochat_instruct file using the [refer] and [grounding] keywords.
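Something along these lines should work (a minimal sketch; `geochat_instruct.json` and the output path are placeholder names, and it assumes each sample stores its turns under `conversations` with a `value` string, as in the instruct file):

```python
import json

# Rough sketch of the suggested filtering. Assumes the geochat_instruct file is a
# JSON list of samples, each with a 'conversations' list of turns that carry a
# 'value' string; the file names here are placeholders.
with open('geochat_instruct.json') as f:
    data = json.load(f)

def uses_grounding(sample):
    # Keep a sample if any turn contains the [refer] or [grounding] task token
    return any('[refer]' in turn['value'] or '[grounding]' in turn['value']
               for turn in sample['conversations'])

stage2_data = [s for s in data if uses_grounding(s)]

with open('geochat_stage2_grounding.json', 'w') as f:
    json.dump(stage2_data, f)

print(f'{len(stage2_data)} grounding/referring samples kept')
```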
Thank you for your answer. After filtering, the number of training iterations I get is 1472, which still does not match the more than 1600 reported in the paper.
Does stage 2 also have a batch size of 144? @KjAeRsTuIsK
May I ask how you filtered the data? I used this script to count how many samples contain the [refer] or [grounding] keywords:
```python
num_grounding = 0
for sample in data:  # data: the list loaded from the geochat_instruct file
    conversations = sample['conversations']
    if any('[grounding]' in conv['value'] or '[refer]' in conv['value'] for conv in conversations):
        num_grounding += 1
```
`num_grounding` comes out to 70464. If we set the batch size to 144, that only gives about 490 iterations. @zytx121
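For reference, the 490 is just the sample count divided by the global batch size (assuming a single epoch and keeping the last partial batch):

```python
import math

num_grounding = 70464
global_batch_size = 144
print(math.ceil(num_grounding / global_batch_size))  # 490 steps per epoch
```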
Thank you very much for your work!
I discovered that the quantity of the open-source training data does not match that mentioned in the paper. When using a global batch size of 144, the number of iterations I trained for is 2144, while the paper indicates 2400.
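For context, the arithmetic behind the mismatch (a rough sanity check, assuming a single epoch so that samples seen ≈ iterations × global batch size):

```python
global_batch_size = 144

# Samples implied by each iteration count, assuming one epoch
print(2144 * global_batch_size)           # 308,736 samples (what I observe)
print(2400 * global_batch_size)           # 345,600 samples (what the paper implies)
print((2400 - 2144) * global_batch_size)  # 36,864 samples unaccounted for
```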