As the title describes, I want to train the model on my VQA dataset in Vietnamese. Do the OCR part and the embedding part support Vietnamese, or do I have to customize them myself?
It is possible to do this. You will have to change three things:
Since we don't have a public version of the Rosetta OCR model, I suggest extracting the OCR tokens using the Google Cloud Vision API. Then create an imdb for your questions in the same format as the one for TextVQA, using the OCR tokens you just extracted. Check out https://cloud.google.com/vision/docs/ocr (specifically the 'Specify a language' part) and https://cloud.google.com/vision/docs/languages
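For the OCR step, something along these lines might work as a starting point. It is only a rough sketch, assuming the `google-cloud-vision` Python client is installed and credentials are configured. The helper names `annotations_to_tokens` and `detect_ocr_tokens` are made up for illustration, and the `"vi"` language hint is an assumption you should check against the supported-languages page linked above.

```python
from typing import List, Sequence


def annotations_to_tokens(descriptions: Sequence[str]) -> List[str]:
    """Turn text-annotation descriptions into a flat list of OCR tokens.

    For text detection, the first annotation holds the full detected text
    and the remaining ones are individual words, so the first is dropped.
    """
    return [d for d in descriptions[1:] if d.strip()]


def detect_ocr_tokens(image_path: str,
                      language_hints: Sequence[str] = ("vi",)) -> List[str]:
    """Hypothetical helper: run Cloud Vision OCR on one image file."""
    # Imported lazily so annotations_to_tokens works without the client library.
    from google.cloud import vision  # pip install google-cloud-vision

    client = vision.ImageAnnotatorClient()
    with open(image_path, "rb") as f:
        image = vision.Image(content=f.read())

    # The language hint is optional; "vi" here is an assumption for Vietnamese.
    context = vision.ImageContext(language_hints=list(language_hints))
    response = client.text_detection(image=image, image_context=context)

    if response.error.message:
        raise RuntimeError(response.error.message)

    return annotations_to_tokens(
        [a.description for a in response.text_annotations]
    )
```

You would call `detect_ocr_tokens("some_image.jpg")` once per image and store the returned list as the OCR tokens for that image in your imdb. The exact imdb field names are not shown here; copy them from the TextVQA imdb you are using as the template.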