CyberAgentAILab / RALF

[CVPR24 Oral] Official repository for RALF: Retrieval-Augmented Layout Transformer for Content-Aware Layout Generation
https://arxiv.org/abs/2311.13602
Apache License 2.0

About retrieval models #12

Closed: wd1511 closed this issue 6 months ago

wd1511 commented 6 months ago

Can we run this code on a custom dataset? The code loads precomputed models from ./cache/PRECOMPUTED_WEIGHT_DIR, which appear to be retrieval models. Can we train our own models, or can the corresponding models for CGL be used directly? (Screenshot attached.)

UdonDa commented 6 months ago

If you want to use your own dataset, first preprocess it as described in the README. Then train a layout encoder like the one in Kikuchi et al. (ACM MM 2021). Finally, train RALF with your dataset and that layout encoder, which embeds the retrieved layouts into features.

The layout encoder should be trained separately for each dataset, because datasets generally have different numbers of classes, as we note in the limitations.
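To illustrate why the encoder is tied to a dataset, here is a minimal sketch. It is not code from the RALF repository. It assumes the encoder embeds each element's class label through a lookup table with one row per class; `LabelEmbedding` and `check_compatible` are hypothetical names, and the class lists are examples only.

```python
from __future__ import annotations

import random


class LabelEmbedding:
    """Toy lookup table with one vector per class label (hypothetical stand-in)."""

    def __init__(self, num_classes: int, dim: int = 4, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.num_classes = num_classes
        # One row per class: the table size is fixed when the encoder is built.
        self.table = [
            [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for _ in range(num_classes)
        ]

    def __call__(self, label_id: int) -> list[float]:
        if not 0 <= label_id < self.num_classes:
            raise IndexError(
                f"label id {label_id} is outside the {self.num_classes} "
                "classes this encoder was built for"
            )
        return self.table[label_id]


def check_compatible(encoder: LabelEmbedding, dataset_classes: list[str]) -> bool:
    """Return True only if the dataset has as many classes as the encoder."""
    return encoder.num_classes == len(dataset_classes)


# Example class lists, chosen for illustration only.
classes_a = ["logo", "text", "underlay", "embellishment"]
classes_b = ["text", "logo", "underlay"]

encoder_a = LabelEmbedding(num_classes=len(classes_a))
print(check_compatible(encoder_a, classes_a))  # True
print(check_compatible(encoder_a, classes_b))  # False
```

A matching class count is necessary but not sufficient: two datasets with the same number of classes can still assign different meanings to each label id, so retraining on the new dataset remains the safe choice.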

This script trains a layout encoder: https://github.com/CyberAgentAILab/RALF/blob/main/image2layout/train/fid/train.py

RALF loads the pretrained layout encoder here: https://github.com/CyberAgentAILab/RALF/blob/main/image2layout/train/models/retrieval_augmented_autoreg.py#L144
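Before starting a training run on a custom dataset, it can help to confirm that the encoder checkpoint is where the loader will look for it. A minimal sketch, assuming the checkpoint is a single file: `find_encoder_checkpoint` and the file name `my_dataset_encoder.pt` are hypothetical, and the actual layout of the cache directory should be checked against the repository.

```python
from __future__ import annotations

from pathlib import Path


def find_encoder_checkpoint(weight_dir: str | Path, filename: str) -> Path:
    """Return the checkpoint path, or raise a clear error if it is missing."""
    path = Path(weight_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(
            f"No layout encoder checkpoint at {path}. "
            "Train one on your dataset first, then place it here."
        )
    return path


if __name__ == "__main__":
    # Directory taken from the question in this thread; the file name is a placeholder.
    try:
        ckpt = find_encoder_checkpoint(
            "./cache/PRECOMPUTED_WEIGHT_DIR", "my_dataset_encoder.pt"
        )
        print(f"Found checkpoint: {ckpt}")
    except FileNotFoundError as err:
        print(err)
```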

I hope it works well for you!

wd1511 commented 6 months ago

I will try the code. Thanks a lot for your reply!