Fine-tuning Donut for UI tasks such as RefExp

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

https://arxiv.org/abs/2111.15664

MIT License

5.75k stars, 466 forks

Fine-tuning Donut for UI tasks such as RefExp #124

Open ivelin opened 1 year ago

ivelin commented 1 year ago

Thank you for sharing your great work on Donut!

I've been experimenting with it and am seeing promising results from fine-tuning it on UI tasks such as RefExp, which are usually targeted by specialized models such as pix2struct, UIBert, and seq2act.

https://huggingface.co/spaces/ivelin/ui-refexp

If anyone is interested in collaborating further on UI tasks, please let me know.

Regards!