clovaai / donut

Official Implementation of OCR-free Document Understanding Transformer (Donut) and Synthetic Document Generator (SynthDoG), ECCV 2022
https://arxiv.org/abs/2111.15664
MIT License
5.75k stars, 466 forks

How to train the model to support the Arabic language #129

Open Abdullamhd opened 1 year ago

Abdullamhd commented 1 year ago

Hello, how can I train the model to support the Arabic language? From what I understand from this issue, https://github.com/clovaai/donut/issues/77, Arabic may not be supported. We have a team that can create a custom Arabic tokenizer, but we need more information on how to build it and how to integrate it into Donut.

Thanks in advance.

Wyzix33 commented 1 year ago

https://github.com/clovaai/donut/issues/135#issuecomment-1426910455
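
For anyone following this thread, here is a rough sketch of a first check that may help before building a custom tokenizer: finding which characters in an Arabic corpus an existing vocabulary cannot represent at all. It is plain Python and does not touch Donut or any tokenizer library. The helper name `find_uncovered_chars`, the toy vocabulary, and the toy corpus are all hypothetical, made up for illustration. A real check would load the vocabulary from the tokenizer actually in use.

```python
from collections import Counter


def find_uncovered_chars(corpus, vocab):
    """Count non-space characters in `corpus` that appear in no entry of `vocab`.

    Hypothetical helper for illustration only; not part of Donut.
    """
    # Every character that occurs inside at least one vocabulary token.
    covered = set()
    for token in vocab:
        covered.update(token)

    # Tally characters the vocabulary has never seen.
    missing = Counter()
    for line in corpus:
        for ch in line:
            if not ch.isspace() and ch not in covered:
                missing[ch] += 1
    return missing


# Toy data (assumption): a Latin-only vocabulary and a two-line corpus.
toy_vocab = ["▁hello", "▁world", "in", "voice", "▁", "1", "2", "3"]
toy_corpus = ["invoice 123", "فاتورة 123"]

missing = find_uncovered_chars(toy_corpus, toy_vocab)
for ch, count in missing.most_common():
    print(f"U+{ord(ch):04X} {ch!r} seen {count} time(s), not in vocabulary")
```

If a check like this reports most of the Arabic alphabet as missing, that suggests the vocabulary needs Arabic pieces added or a new tokenizer trained, which is the situation described in the question above.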