HiTrong / Transformer-EntoVn-Translator

This project uses a neural network model with the Transformer architecture to build an English-to-Vietnamese translator.

MIT License


Issue 02: Imbalance, always generate END TOKEN or Nothing,... #2

Closed HiTrong closed 3 months ago

HiTrong commented 3 months ago

The model exhibits a strong bias towards generating the END token prematurely or failing to generate meaningful sequences. This issue likely arises due to imbalanced training data, where END tokens are overrepresented, leading the model to favor them.
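
One rough way to check this hypothesis is to measure what fraction of all target-side tokens are END tokens. Each target sentence contributes exactly one END token, so the shorter the sentences are, the larger that fraction becomes. The sketch below assumes whitespace tokenization and a hypothetical `<END>` marker; the project's real tokenizer and token names may differ.

```python
def end_token_share(target_sentences, end_token="<END>"):
    """Return the fraction of target-side tokens that are the END token.

    Assumes whitespace tokenization and one END token appended per sentence.
    Both are illustrative assumptions, not the project's actual pipeline.
    """
    total_tokens = 0
    end_tokens = 0
    for sentence in target_sentences:
        tokens = sentence.split() + [end_token]
        total_tokens += len(tokens)
        end_tokens += 1  # exactly one END token per sentence
    return end_tokens / total_tokens if total_tokens else 0.0


# Toy corpora (made up for illustration only).
short_corpus = ["xin chào", "cảm ơn", "tạm biệt"]
long_corpus = ["tôi đang học cách xây dựng một mô hình dịch máy"]

print(end_token_share(short_corpus))  # 3 END tokens out of 9 tokens
print(end_token_share(long_corpus))   # 1 END token out of 12 tokens
```

A corpus dominated by very short sentences gives the END token a much larger share of the training targets, which is consistent with the bias described above, although it does not prove that this is the only cause.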

HiTrong commented 3 months ago

When I checked the dataset from Hugging Face (link) again, I realized that it contained a lot of errors, for example Chinese text, special symbols, and wrong translations. In particular, the sentence lengths are not evenly distributed: they are skewed towards short sentences. Next, I checked the architecture again. My positional encoding function had never been set to 'requires_grad_(False)' since I first built the model. Finally, I decided to rebuild the model and chose another dataset that was collected correctly.
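
For the positional encoding point, a minimal PyTorch sketch of a fixed (non-trainable) sinusoidal encoding is shown here. It is not the code from this repository: the class name, `d_model`, and `max_len` are illustrative, and it assumes an even `d_model` and batch-first input. Instead of calling `requires_grad_(False)` on the tensor, it stores the table with `register_buffer`, which has the same practical effect of keeping it out of the optimizer.

```python
import math

import torch
import torch.nn as nn


class PositionalEncoding(nn.Module):
    """Fixed sinusoidal positional encoding (never updated by training)."""

    def __init__(self, d_model: int, max_len: int = 512):
        super().__init__()
        # Assumes d_model is even so sine and cosine columns pair up.
        position = torch.arange(max_len, dtype=torch.float32).unsqueeze(1)
        div_term = torch.exp(
            torch.arange(0, d_model, 2, dtype=torch.float32)
            * (-math.log(10000.0) / d_model)
        )
        pe = torch.zeros(max_len, d_model)
        pe[:, 0::2] = torch.sin(position * div_term)  # even dimensions
        pe[:, 1::2] = torch.cos(position * div_term)  # odd dimensions
        # A buffer follows the module (.to(), state_dict) but is not a
        # parameter, so the optimizer never sees it.
        self.register_buffer("pe", pe.unsqueeze(0))  # (1, max_len, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x has shape (batch, seq_len, d_model)
        return x + self.pe[:, : x.size(1)]


layer = PositionalEncoding(d_model=16, max_len=32)
out = layer(torch.zeros(2, 10, 16))
print(out.shape, len(list(layer.parameters())))
```

If the table were wrapped in `nn.Parameter` without disabling gradients, it would be trained along with the rest of the model, which is the situation described above.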

HiTrong commented 3 months ago

Solution:

New dataset: ncduy. (n.d.). mt-en-vi. Hugging Face. Retrieved June 14, 2024, from https://huggingface.co/datasets/ncduy/mt-en-vi
Careful preprocessing (using NLP frameworks that support Vietnamese and English)
New code for building the model
Try different training runs with different batch sizes, model configs, etc.
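
The preprocessing step could look roughly like the sketch below. It uses only the Python standard library rather than a dedicated Vietnamese NLP framework, and the function names and thresholds (`max_tokens`, `max_ratio`) are hypothetical values chosen for illustration. It targets the problems found in the old dataset: Chinese characters, empty or overlong sentences, and pairs whose lengths differ so much that the translation is probably wrong.

```python
import unicodedata


def contains_cjk(text):
    """True if the text has a character in the CJK Unified Ideographs block."""
    return any("\u4e00" <= ch <= "\u9fff" for ch in text)


def clean_pair(en, vi, min_tokens=1, max_tokens=100, max_ratio=3.0):
    """Return a normalized (en, vi) pair, or None if the pair looks bad.

    All thresholds are illustrative assumptions, not tuned values.
    """
    # NFC merges decomposed Vietnamese diacritics into single code points.
    en = unicodedata.normalize("NFC", en).strip()
    vi = unicodedata.normalize("NFC", vi).strip()
    if contains_cjk(en) or contains_cjk(vi):
        return None
    n_en, n_vi = len(en.split()), len(vi.split())
    if not (min_tokens <= n_en <= max_tokens and min_tokens <= n_vi <= max_tokens):
        return None
    # A very lopsided length ratio often signals a wrong translation.
    if max(n_en, n_vi) / min(n_en, n_vi) > max_ratio:
        return None
    return en, vi


# Toy examples (made up for illustration only).
pairs = [
    ("Hello", "Xin chào"),
    ("Thank you", "谢谢"),
    ("Yes", "Tôi không nghĩ rằng điều đó là đúng"),
    ("", "Trống"),
]
cleaned = [p for p in (clean_pair(en, vi) for en, vi in pairs) if p is not None]
print(cleaned)
```

Only the first pair survives: the second contains Chinese characters, the third has a length ratio above the threshold, and the fourth has an empty source sentence.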