How can I get the annotation file with the "text" attribute?

yinglang commented 4 days ago

When I run build_code_jj2.py, I get a KeyError for 'text':

python convertHTML/build_code_jj2.py   --model_path_or_name models/meta-llama/Llama-2-7b-chat-hf  --dataset_name cgl --dataset_path data/cgl_dataset/for_posternuwa  --save_path data/cgl_dataset/for_posternuwa/html_format_img_instruct_mask_all_condition  --bbox_quantization code  --consistency_num 15  --add_task_instruction;

Processing...
Traceback (most recent call last):
  File "PosterLlama/convertHTML/build_code_jj2.py", line 756, in <module>
    train_dataset = get_dataset(
  File "PosterLlama/convertHTML/__init__.py", line 20, in get_dataset
    return CGLDataset(datapath,split,max_seq_length=25,transform=transform)
  File "PosterLlama/convertHTML/cgl.py", line 18, in __init__
    super().__init__(dir,split,transform)
  File "PosterLlama/convertHTML/base.py", line 18, in __init__
    super().__init__(self.path, transform)
  File "python3.10/site-packages/torch_geometric/data/in_memory_dataset.py", line 81, in __init__
    super().__init__(root, transform, pre_transform, pre_filter, log,
  File "python3.10/site-packages/torch_geometric/data/dataset.py", line 115, in __init__
    self._process()
  File "python3.10/site-packages/torch_geometric/data/dataset.py", line 262, in _process
    self.process()
  File "PosterLlama/convertHTML/cgl.py", line 50, in process
    te = element['text']
KeyError: 'text'
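
The traceback shows that cgl.py indexes element['text'] directly, so any annotation element without that key raises this error. A minimal sketch for checking an annotation file before running the script is given below. It assumes that layout elements are JSON objects carrying a `bbox` key, which is a guess about the file's schema, and `count_missing` is a hypothetical helper, not part of PosterLlama.

```python
import json

def count_missing(node, required="text", marker="bbox"):
    """Return (total, missing) over all dicts in `node` that contain `marker`.

    Assumption: layout elements are dicts with a `bbox` key; change `marker`
    to whatever identifies an element in your annotation file.
    """
    total = missing = 0
    if isinstance(node, dict):
        if marker in node:
            total += 1
            if required not in node:
                missing += 1
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        children = ()
    # Recurse so the check works regardless of how deeply elements are nested.
    for child in children:
        t, m = count_missing(child, required, marker)
        total += t
        missing += m
    return total, missing

# Tiny in-memory example; for a real file, load it with json.load instead.
sample = json.loads(
    '{"annotations": [[{"bbox": [0, 0, 1, 1], "text": "Sale"},'
    ' {"bbox": [1, 1, 2, 2]}]]}'
)
print(count_missing(sample))
```

If the second number is greater than zero, the file still contains elements that would trigger the KeyError.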

After carefully reading the paper, I found the following related passages:

Additionally, because both CGL and PKU datasets do not provide text annotations, we opt for the CGL-v2 dataset to facilitate layout generation that incorporates both textual and visual content.

Firstly, to accommodate the code language format suitable for CodeLlama, we translate the Chinese words provided in CGL-v2 into English for application in CodeLlama.

If my understanding is accurate, the 'text' attribute comes from the translated content of the CGL-v2 dataset. I apologize for any inconvenience, but I would be very grateful if you could share the train.json file with the 'text' attribute included. Thank you very much for your assistance.

jaepoong commented 3 days ago

Thank you for your interest in our research.

I realize that it might have been a bit inconvenient as we have not yet released the code for data preprocessing, which I am still organizing.

As mentioned in our paper, we conducted our research by translating Chinese into English. Additionally, when examining the CGL-v2 dataset, out of 60,548 images, only 39,931 images have text annotations. In our research, we aimed to utilize annotations for all text categories.

For elements without provided text annotations, we used an OCR module to extract the text, translated it, and then used it in the study. I am attaching pseudo-code for this process.

We treated the resulting JSON file as train.json and processed it using the convertHTML/build_code_jj2.py code for training. Since many intermediate steps have been omitted, I am sharing the final processed file, html_format_img_instruct_mask_all_condition_text.zip, for you to use in training until I upload the finalized code.
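
The fill-in step described above could be sketched roughly as follows. This is not the authors' attached pseudo-code: the element schema (`bbox`, `text`), the helper name `fill_missing_text`, and the `ocr` and `translate` callables are all hypothetical placeholders for whatever OCR module and translation service is actually used.

```python
from typing import Callable, Dict, List

def fill_missing_text(
    elements: List[Dict],
    image,
    ocr: Callable[[object, list], str],
    translate: Callable[[str], str],
) -> List[Dict]:
    """Return copies of `elements` in which every element has a translated `text`.

    Assumption: each element is a dict with a `bbox` and an optional `text`.
    Elements that lack `text` get it from `ocr(image, bbox)` before translation.
    """
    filled = []
    for element in elements:
        element = dict(element)  # shallow copy; the input list is left untouched
        source_text = element.get("text")
        if not source_text:
            # No annotation provided: read the text from the image region.
            source_text = ocr(image, element["bbox"])
        element["text"] = translate(source_text)
        filled.append(element)
    return filled

# Stub OCR and translation so the sketch runs without external services.
def fake_ocr(image, bbox):
    return "促销"

def fake_translate(text):
    return {"促销": "sale", "新品": "new arrival"}.get(text, text)

elements = [{"bbox": [0, 0, 10, 10], "text": "新品"}, {"bbox": [5, 5, 20, 20]}]
result = fill_missing_text(elements, image=None, ocr=fake_ocr, translate=fake_translate)
print([e["text"] for e in result])
```

In a real pipeline this would probably be applied only to text-category elements, and the output written back to the JSON file that is then treated as train.json.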

jaepoong commented 3 days ago

If you just change config.train_json and val_json in training_config, training should run without problems. I will release the entire code as soon as possible.

jaepoong commented 3 days ago

If you encounter any issues, please feel free to reach out. I'd be happy to assist!

yinglang commented 3 days ago

I am very pleased to receive your reply. This code and data are very helpful to me!!!

yinglang commented 3 days ago

If it's not too much trouble, I would like to kindly inquire whether it would be possible to release the weights obtained from the training of the stage_1 model ("log_dir/train_stage1_dino_code_llama/checkpoints/checkpoint-18/pytorch_model.bin")? I would greatly appreciate your assistance in this matter.

jaepoong commented 3 days ago

I have uploaded the first-stage trained models at Link

jaepoong commented 3 days ago

Thank you for your interest!

yinglang commented 3 days ago

Many thanks!!! Could you also tell me how much GPU memory stage 2 training requires?

jaepoong / PosterLlama

How can I get the annotation file with the "text" attribute? #6