Open yinglang opened 4 days ago
Thank you for your interest in our research.
I realize that it might have been a bit inconvenient as we have not yet released the code for data preprocessing, which I am still organizing.
As mentioned in our paper, we conducted our research by translating Chinese into English. Additionally, when examining the CGL-v2 dataset, out of 60,548 images, only 39,931 images have text annotations. In our research, we aimed to utilize annotations for all text categories.
For elements without provided text annotations, we used an OCR module to extract the text, translated it, and then used it in the study. I am attaching pseudo-code for this process.
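A minimal sketch of the process described above, not the released preprocessing code. The element layout (`category`, `bbox`, `text` keys), the `text_categories` set, and the `ocr` and `translate` callables are all assumptions made for illustration; any OCR or translation backend could be plugged in.

```python
from typing import Any, Callable, Dict, Iterable, List

Element = Dict[str, Any]


def fill_text_annotations(
    elements: Iterable[Element],
    image: Any,
    ocr: Callable[[Any, Any], str],
    translate: Callable[[str], str],
    text_categories: Iterable[str] = ("text",),
) -> List[Element]:
    """Return copies of the elements where every text element has English text.

    Assumed (hypothetical) layout: each element is a dict with a "category",
    a "bbox", and optionally a Chinese "text" annotation.
    """
    categories = set(text_categories)
    filled = []
    for element in elements:
        element = dict(element)  # copy so the input is left untouched
        if element.get("category") in categories:
            # Use the provided annotation if present, otherwise fall back to OCR.
            raw_text = element.get("text") or ocr(image, element["bbox"])
            # Translate Chinese to English, as described in the paper.
            element["text"] = translate(raw_text)
        filled.append(element)
    return filled
```

The result for every image could then be dumped to a single JSON file, which is what the next step treats as train.json.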
We treated the resulting JSON file as train.json and processed it using the convertHTML/build_code_jj2.py code for training. Since many intermediate steps have been omitted, I am sharing the final processed file, html_format_img_instruct_mask_all_condition_text.zip, so that you can use it for training until I upload the finalized code.
Changing only config.train_json and val_json in training_config should be enough for training to run. I will release the entire code as soon as possible.
If you encounter any issues, please feel free to reach out. I'd be happy to assist!
I am very pleased to receive your reply. This code and data are very helpful to me!!!
If it's not too much trouble, I would like to kindly inquire whether it would be possible to release the weights obtained from the training of the stage_1 model ("log_dir/train_stage1_dino_code_llama/checkpoints/checkpoint-18/pytorch_model.bin")? I would greatly appreciate your assistance in this matter.
Thank you for your interest!
Many thanks!!! Could you tell me how much GPU memory stage 2 training requires?
When I run build_code_jj2.py, I get a KeyError for 'text':
Then, after carefully reading the paper, I found the following related information:
If my understanding is accurate, it seems that the 'text' originates from the translated content within the CGL-V2 dataset. I apologize for any inconvenience, but I would be incredibly grateful if it might be possible for you to share the train.json file with the 'text' attribute included. Thank you very much for your assistance.
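For anyone hitting the same KeyError, a quick check like the one below can confirm which records lack the 'text' attribute before running build_code_jj2.py. This is only a sketch: it assumes train.json is a flat list of dicts, which may not match the actual file layout, and `records_missing_key` is a hypothetical helper, not part of the repository.

```python
import json
from typing import Any, Dict, List


def records_missing_key(records: List[Dict[str, Any]], key: str = "text") -> List[int]:
    """Return the indices of records that do not contain the given key."""
    return [i for i, record in enumerate(records) if key not in record]


if __name__ == "__main__":
    # Assumed layout: train.json holds a flat list of annotation dicts.
    with open("train.json", encoding="utf-8") as f:
        records = json.load(f)
    missing = records_missing_key(records)
    print(f"{len(missing)} of {len(records)} records have no 'text' key")
```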