facebookresearch / nougat

Implementation of Nougat Neural Optical Understanding for Academic Documents
https://facebookresearch.github.io/nougat/
MIT License
8.81k stars 560 forks

Will Nougat support Turkish and German utf-8 characters? #191

Open ceofast opened 8 months ago

ceofast commented 8 months ago

I'm currently working with Facebook's Nougat library and have a question about language support. In particular, I noticed that the library does not handle the UTF-8 characters used in Turkish and German. These alphabets contain letters that are not part of standard ASCII.
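To make the characters in question concrete, here is a minimal sketch in plain Python. It is not part of the Nougat codebase; the names `TURKISH`, `GERMAN`, `non_ascii_report`, and `missing_from` are hypothetical and used only for illustration. It lists the Turkish and German letters that fall outside ASCII and shows how one could check them against any character set, for example the characters recovered from a model's output or vocabulary.

```python
# Letters specific to Turkish and German that fall outside 7-bit ASCII.
# These lists are an assumption about which characters the question refers to.
TURKISH = "çğıöşüÇĞİÖŞÜ"
GERMAN = "äöüßÄÖÜẞ"


def non_ascii_report(chars):
    """Return (char, code point, UTF-8 byte length) for each non-ASCII character."""
    return [
        (c, f"U+{ord(c):04X}", len(c.encode("utf-8")))
        for c in chars
        if ord(c) > 127
    ]


def missing_from(vocab, chars):
    """Return the characters in `chars` that the character set `vocab` lacks."""
    return sorted(set(chars) - set(vocab))


if __name__ == "__main__":
    for name, letters in (("Turkish", TURKISH), ("German", GERMAN)):
        print(name)
        for char, code_point, n_bytes in non_ascii_report(letters):
            print(f"  {char}  {code_point}  {n_bytes} byte(s) in UTF-8")
```

Passing the set of characters a tokenizer or model can actually produce as `vocab` would show which of these letters are unsupported; how to obtain that set depends on the tokenizer and is not covered here.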

Given Facebook's diverse user base, full UTF-8 support for these languages would be very beneficial. Does the Nougat library currently support UTF-8 characters for Turkish and German? If not, are there plans to add this support in a future update?

Kind regards

nise commented 5 months ago

Any updates here?

ceofast commented 4 months ago

Any updates here?

Nothing has changed. I tried using mT5 instead of mBART as the decoder, but I'm getting a size error. Maybe the problem can be solved with mT5.
