clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.53k stars, 444 forks

How to generate my own dataset in Parquet format, like the data format given in the README #209

Open chizhanyuefeng opened 1 year ago

chizhanyuefeng commented 1 year ago

The data can be loaded with `load_dataset` from `datasets`. Is the data also generated with `datasets`? Could you give a code sample?
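One possible approach is sketched below. It is not confirmed by the maintainers. It assumes the layout the README appears to describe: one folder per split (`train`, `validation`, `test`) containing the images plus a `metadata.jsonl` file, where each line has a `file_name` and a `ground_truth` JSON string wrapping the annotation under `gt_parse`. The helper names `write_metadata` and `export_parquet` are hypothetical, and the Parquet step assumes the third-party `datasets` and `Pillow` packages are installed.

```python
import json
from pathlib import Path


def write_metadata(split_dir, samples):
    """Write metadata.jsonl for one split folder.

    `samples` is an iterable of (file_name, gt_parse) pairs, where
    `file_name` is an image already placed in `split_dir` and `gt_parse`
    is a JSON-serializable dict with the target annotation.
    (Assumed layout; adjust keys if your task differs.)
    """
    split_dir = Path(split_dir)
    split_dir.mkdir(parents=True, exist_ok=True)
    path = split_dir / "metadata.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for file_name, gt_parse in samples:
            # ground_truth is stored as a JSON *string*, not a nested object.
            ground_truth = json.dumps({"gt_parse": gt_parse}, ensure_ascii=False)
            row = {"file_name": file_name, "ground_truth": ground_truth}
            f.write(json.dumps(row, ensure_ascii=False) + "\n")
    return path


def export_parquet(data_dir, out_dir):
    """Load the image folders with `datasets` and write one Parquet file per split.

    Requires `pip install datasets pillow`. The "imagefolder" builder reads
    metadata.jsonl and infers splits from the train/validation/test subfolders.
    """
    from datasets import load_dataset  # third-party, imported lazily

    dataset = load_dataset("imagefolder", data_dir=str(data_dir))
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for split, ds in dataset.items():
        ds.to_parquet(str(out_dir / f"{split}.parquet"))


if __name__ == "__main__":
    # Hypothetical example data: replace with your own images and labels.
    write_metadata("my_dataset/train", [("receipt_0.png", {"total": "12.50"})])
    export_parquet("my_dataset", "my_dataset_parquet")
```

The resulting files can then be read back with `load_dataset("parquet", data_files={"train": "my_dataset_parquet/train.parquet"})`. Whether the image bytes end up embedded in the Parquet file may depend on the installed `datasets` version, so it is worth checking one loaded row before training.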