clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License

VisionEncoderDecoderModel convert #284

Open sjtu-cz opened 5 months ago

sjtu-cz commented 5 months ago

How can I convert a trained Donut model into the VisionEncoderDecoderModel structure?

felixvor commented 4 months ago

This smells like an XY problem. What exactly are you trying to do? Importing a Donut model with the Hugging Face VisionEncoderDecoder implementation should be straightforward. Just make sure you use the right DonutTokenizer with it. The docs should cover what you are looking for: https://huggingface.co/docs/transformers/model_doc/donut
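For illustration, a minimal sketch of the import described above. It assumes the checkpoint is already stored in Hugging Face format, either on the Hub or in a local directory. A checkpoint saved by the original clovaai/donut training code may need converting first, which this sketch does not cover. In `transformers` the tokenizer is bundled with the image processor in `DonutProcessor`, so that class is used here. The helper name `load_donut` and the default checkpoint are illustrative choices, not something taken from this thread.

```python
# Sketch only: load a Donut checkpoint through the Hugging Face classes.
# Assumption: the checkpoint is already in Hugging Face format.
DEFAULT_CHECKPOINT = "naver-clova-ix/donut-base"  # public checkpoint on the Hub


def load_donut(checkpoint: str = DEFAULT_CHECKPOINT):
    """Return (processor, model) for a Donut checkpoint name or local path."""
    # Imported lazily so that defining this helper does not require
    # transformers to be installed; calling it does.
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    # The processor wraps the image processor and the tokenizer, so the
    # tokenizer always matches the checkpoint it was saved with.
    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model
```

Calling `processor, model = load_donut()` downloads the default checkpoint on first use. Passing the path of a directory written by `save_pretrained` loads a local model instead.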

sjtu-cz commented 4 months ago

Hello, I have received your email.