abjadai / catt

The official implementation of CATT Arabic diacritization models.

colloquial dialects #2

Open maherr13 opened 3 months ago

maherr13 commented 3 months ago

Hello, and thanks for your work.

You mentioned in the paper that the pretraining phase would help the model perform well on colloquial dialects.

I wanted to ask how the model would respond to colloquial dialects, for example when fine-tuning the pretrained models on Egyptian dialects.

If it works well, how much data would I need to fine-tune the model and get good results in such a case?

ahmedbr commented 2 months ago

Hi, I actually have a similar issue.

I've just used the pretrained encoder-decoder model to train a diacritizer for a Gulf dialect. I didn't make any major changes to the provided script; I only supplied the paths to the datasets and the pretrained model. But as training proceeded, both val_loss and val_der kept getting worse. Please have a look at the screenshot below:

[image: screenshot of val_loss and val_der during training]

Did I do something wrong? Any explanation?