yiye3 / GUICourse

GUICourse: From General Vision Language Models to Versatile GUI Agents
Issues with GUI-Env dataset #4

Open ZHC-6estates opened 3 months ago

ZHC-6estates commented 3 months ago

Hi, I ran into some problems while using the gui-env dataset. I randomly selected 100 images and drew the bounding boxes and text annotations from ocr_grounding_train_stage1_data.json and ocr_grounding_train_stage2_data.json onto them. When I checked the visualizations, I found that some images contain no visible text, yet the JSON files contain bbox and text annotations for these blank images.

For example, the image with image_id C4web50k-1_93837744-split-3 contains no text: [image: C4web50k-1_93837744-split-3]

However, annotations with the same image_id appear in the training JSON files. The image with those annotations drawn on it: [image: 36127_C4web50k-1_93837744-split-3]

I also found sampled images that do contain visible text but whose annotations in the JSON files do not match it, such as the image with image_id C4web50k-1_92410804-split-1.

Could you check whether these problems come from defects in the source data? I would greatly appreciate your attention to this. Thank you for making such a valuable dataset.

yiye3 commented 3 months ago

Thanks for your interest and for raising this issue. I have checked the image with image_id C4web50k-1_93837744-split-3, and it is a blank image, as you have shown here. Yes, these problems are due to defects in the source data. We acknowledge that our gui-env dataset contains some bad samples, such as biased annotations and blank images, even though we designed many rules to filter them out.

ZHC-6estates commented 3 months ago

Do you plan to fix these data problems in future releases?

yiye3 commented 3 months ago

In my experience, a small number of erroneous samples in gui-env does not affect training. We have no immediate plans to fix these problems.
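
For readers who would rather drop the blank-image samples discussed in this thread before training, here is a minimal sketch of one possible filter. It is not the authors' filtering rule. It assumes that each annotation record is a dict with an `image_id` key and that grayscale pixel values (0 to 255) have already been loaded per image; the names `is_blank`, `drop_blank_samples`, `pixels_by_id`, and the `tolerance` default are all hypothetical.

```python
from typing import Dict, List, Sequence


def is_blank(gray_pixels: Sequence[int], tolerance: int = 8) -> bool:
    """Treat an image as blank when its grayscale values barely vary.

    `tolerance` is an assumed threshold on (max - min); tune it on real data.
    """
    if not gray_pixels:
        return True
    return max(gray_pixels) - min(gray_pixels) <= tolerance


def drop_blank_samples(
    records: List[dict],
    pixels_by_id: Dict[str, Sequence[int]],
    tolerance: int = 8,
) -> List[dict]:
    """Return the records whose image is not confirmed blank.

    Assumes every record carries an "image_id" key (hypothetical schema).
    Records whose pixels are unavailable are kept rather than dropped.
    """
    kept = []
    for record in records:
        pixels = pixels_by_id.get(record["image_id"])
        if pixels is not None and is_blank(pixels, tolerance):
            continue  # confirmed blank image: skip its annotation
        kept.append(record)
    return kept


if __name__ == "__main__":
    # Toy data: "blank" is uniformly white, "page" has dark and light pixels.
    records = [{"image_id": "blank"}, {"image_id": "page"}]
    pixels_by_id = {"blank": [255] * 16, "page": [0, 255, 128, 30]}
    print(drop_blank_samples(records, pixels_by_id))
```

The pixel values can come from any image library; with Pillow, for instance, `Image.convert("L")` followed by `getextrema()` yields the minimum and maximum directly. A check like this only catches blank images. It cannot detect annotations that mismatch visible text, such as the C4web50k-1_92410804-split-1 case.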