wjbmattingly / catmus-train

The code for training TrOCR models from the CATMuS dataset.

[Not an issue] Train on RTL scripts #2

Open johnlockejrr opened 2 weeks ago

johnlockejrr commented 2 weeks ago

Sorry to bother you; this is not really an issue. Do you have any idea whether I need to take any steps beyond those in this repo to train on an RTL language (Arabic, Hebrew, Syriac)? Thank you!

wjbmattingly commented 2 weeks ago

Thanks for reaching out! Great question. It depends on a lot of factors. How much data do you have?

johnlockejrr commented 2 weeks ago

Thanks for replying! It depends on the script. Data is not a problem: I annotate my own and normally don't work with public datasets, except for testing.

johnlockejrr commented 1 week ago

So, worth a try?

I would also like to try it on some rabbinical (medieval Hebrew) manuscripts.