The quantity of the open-source training data does not match that mentioned in the paper.

mbzuai-oryx / GeoChat

[CVPR 2024 🔥] GeoChat, the first grounded Large Vision Language Model for Remote Sensing

https://mbzuai-oryx.github.io/GeoChat


The quantity of the open-source training data does not match that mentioned in the paper. #21

Open zytx121 opened 8 months ago

zytx121 commented 8 months ago

Thank you very much for your work!

I found that the amount of open-source training data does not match the amount mentioned in the paper. With a global batch size of 144, training on the released data takes 2144 iterations, while the paper reports 2400.
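For reference, the gap between the two iteration counts can be expressed in samples. The following is a minimal sketch, assuming a single epoch and that the final partial batch is kept (so iterations = ceil(samples / batch size)); the function name `sample_range` is illustrative and not part of the GeoChat code.

```python
def sample_range(iterations, global_batch_size):
    """Smallest and largest sample counts that yield `iterations` steps in one epoch.

    Assumption: the last partial batch is kept, i.e. iterations = ceil(samples / batch).
    """
    smallest = (iterations - 1) * global_batch_size + 1
    largest = iterations * global_batch_size
    return smallest, largest


GLOBAL_BATCH_SIZE = 144

observed = sample_range(2144, GLOBAL_BATCH_SIZE)   # (308593, 308736)
reported = sample_range(2400, GLOBAL_BATCH_SIZE)   # (345457, 345600)

# Difference between the paper's iteration count and the observed one.
gap_iterations = 2400 - 2144                       # 256
gap_samples_max = gap_iterations * GLOBAL_BATCH_SIZE  # 36864

print(observed, reported, gap_iterations, gap_samples_max)
```

Under these assumptions, the 256 missing iterations correspond to at most 36864 samples.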

KjAeRsTuIsK commented 8 months ago

Hi @zytx121, we further fine-tuned our model on just the grounding part of the dataset for some more steps.

baichuanzhou commented 8 months ago

> Hi @zytx121, we further finetuned our model on just the grounding part of the dataset for some more steps.

Hi, thanks for your great work. Would you mind also open-sourcing the specific grounding dataset that you used in stage 2? Thanks in advance.

KjAeRsTuIsK commented 8 months ago

You can filter the data from the geochat_instruct file using the [refer] and [grounding] keywords.

zytx121 commented 8 months ago

Thank you for your answer. After filtering, my training sample iteration count is 1472, which still does not match the over 1600 in the paper.

baichuanzhou commented 8 months ago

> You can filter the data from the geochat_instruct file using the [refer] and [grounding] keywords.

Does stage 2 also have a batch size of 144? @KjAeRsTuIsK

baichuanzhou commented 8 months ago

> Thank you for your answer. After filtering, my training sample iteration count is 1472, which still does not match the over 1600 in the paper.

May I ask how you filtered the data? I used this script to count how many samples contain the [refer] or [grounding] keywords:

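num_grounding = 0  # `data` is the list of samples loaded from the geochat_instruct file
for sample in data:
    conversations = sample['conversations']
    # Count the sample if any turn contains a grounding or referring tag.
    if any('[grounding]' in conv['value'] or '[refer]' in conv['value'] for conv in conversations):
        num_grounding += 1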

num_grounding is 70464. With a batch size of 144, that gives only 490 iterations. @zytx121
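The 490 figure follows from dividing the sample count by the batch size and rounding up. A minimal sketch of that arithmetic, assuming one pass over the filtered data; the helper `iterations_per_epoch` and its `drop_last` flag are illustrative names, not taken from the GeoChat training scripts:

```python
import math


def iterations_per_epoch(num_samples, global_batch_size, drop_last=False):
    """Number of optimizer steps needed for one pass over `num_samples` samples."""
    if drop_last:
        # The final partial batch is discarded.
        return num_samples // global_batch_size
    # The final partial batch is kept as a smaller batch.
    return math.ceil(num_samples / global_batch_size)


num_grounding = 70464
print(iterations_per_epoch(num_grounding, 144))                  # 490
print(iterations_per_epoch(num_grounding, 144, drop_last=True))  # 489
```

Either way, a single epoch over 70464 samples stays far below both 1472 and 1600 iterations, so the difference would have to come from the number of epochs, the batch size, or the filtering criteria.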