VisionEncoderDecoderModel convert

clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022

https://arxiv.org/abs/2111.15664

MIT License

VisionEncoderDecoderModel convert #284

Open sjtu-cz opened 5 months ago

sjtu-cz commented 5 months ago

How can I convert a trained Donut model into the VisionEncoderDecoderModel structure?

felixvor commented 4 months ago

Smells like an XY problem. What exactly are you trying to do? Importing a Donut model with the Hugging Face VisionEncoderDecoder implementation should be straightforward. Just make sure you use the right DonutTokenizer with it. The docs should cover what you are looking for: https://huggingface.co/docs/transformers/model_doc/donut
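For reference, a minimal sketch of what that import could look like, assuming the checkpoint is already stored in Hugging Face format (the public `naver-clova-ix/donut-base` checkpoint is used here only as a stand-in for your own model). In transformers the tokenizer is bundled with the image processor inside `DonutProcessor`, so the sketch loads that instead of a standalone tokenizer class. The helper names `load_donut` and `tokenizer_fits_decoder` are made up for this example and are not part of either library.

```python
def load_donut(checkpoint: str = "naver-clova-ix/donut-base"):
    """Load a Donut checkpoint that is already in Hugging Face format.

    `checkpoint` may be a Hub model id or a local directory; the default
    is only a placeholder for your own trained model.
    """
    # Imported lazily so this file can be imported without transformers installed.
    from transformers import DonutProcessor, VisionEncoderDecoderModel

    # The processor wraps both the image processor and the tokenizer.
    processor = DonutProcessor.from_pretrained(checkpoint)
    model = VisionEncoderDecoderModel.from_pretrained(checkpoint)
    return processor, model


def tokenizer_fits_decoder(tokenizer_len: int, num_embeddings: int) -> bool:
    """Hypothetical sanity check: every token id must have an embedding row.

    The embedding matrix may be larger than the tokenizer (padding rows),
    but a tokenizer larger than the matrix would produce out-of-range ids.
    """
    return tokenizer_len <= num_embeddings
```

After loading, one way to confirm that the tokenizer and decoder belong together is to compare `len(processor.tokenizer)` with `model.decoder.get_input_embeddings().num_embeddings` using the check above. If special tokens were added during fine-tuning, the decoder embeddings should have been resized to cover them.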

sjtu-cz commented 4 months ago

Hello, I have received your email~