Open dangcaptkd2 opened 3 years ago
Hi dangcaptkd2,
I am trying to do image captioning in Arabic and have the same problem! Could you please share how you used PhoBERT to train Oscar?
Hi jontooy, we simply set --model_name_or_path to bert-base-multilingual-uncased and added --tokenizer_name vinai/phobert-base. You also need to change the special tokens to match your tokenizer (e.g. the pad token ID is 0 in the bert-base-multilingual tokenizer but 1 in PhoBERT). That's all we changed, and it worked! Hope this helps, good luck.
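To illustrate the pad token point, here is a minimal sketch in plain Python. It is not the Oscar code; `pad_batch` and the two constants are hypothetical names, and the IDs are just the ones mentioned in this thread. The idea is to take the pad ID from whichever tokenizer you use rather than hardcoding 0. In practice you would read it from the tokenizer object (for example `tokenizer.pad_token_id` in recent Hugging Face transformers releases, which may not exist in older versions).

```python
from typing import List, Tuple

# Pad token IDs as reported in this thread (assumed, check your own tokenizer).
PAD_ID_MBERT = 0    # bert-base-multilingual tokenizer
PAD_ID_PHOBERT = 1  # vinai/phobert-base tokenizer


def pad_batch(
    token_ids: List[List[int]], pad_token_id: int, max_len: int
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad (or truncate) each sequence to max_len using the given pad ID.

    Returns the padded sequences and matching attention masks
    (1 for real tokens, 0 for padding).
    """
    padded, masks = [], []
    for ids in token_ids:
        ids = ids[:max_len]              # truncate sequences that are too long
        n_pad = max_len - len(ids)       # number of pad positions to append
        padded.append(ids + [pad_token_id] * n_pad)
        masks.append([1] * len(ids) + [0] * n_pad)
    return padded, masks


if __name__ == "__main__":
    batch = [[0, 1500, 87, 2], [0, 42, 2]]  # made-up token IDs for illustration
    ids, mask = pad_batch(batch, PAD_ID_PHOBERT, max_len=5)
    print(ids)
    print(mask)
```

If the padding value stays at 0 while the tokenizer is PhoBERT, the padded positions would hold ID 0, which is PhoBERT's start token rather than its pad token, so the model would see extra tokens instead of padding.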
Hello, I'm trying to transfer your model to MSCOCO translated into Vietnamese. The predicted captions are unrelated to the input image, even though training reached a Bleu_4 score of 0.35. I used PhoBERT's Vietnamese tokenizer instead, and I also changed pytorch_transformers to version 1.0.0 because PhoBERT requires it.
Please help me solve this issue, thanks.