Helsinki-NLP / OPUS-MT-train

Training open neural machine translation models
MIT License

How long does it take to train a language pair (e.g. en-ko)? #45

Open hthanhbmt opened 3 years ago

hthanhbmt commented 3 years ago

I'm training a language pair (en-ko) and it has been running for 30 days. It's now starting epoch 19 and I don't know when it will finish. Can anyone tell me how long it took you to train a language pair, and with what hardware?

Here is the configuration of the computer I'm using: CPU i5 7200U, 2 cores / 2.5 GHz. I use the default configuration of the OPUS-MT-train repository.

jorgtied commented 3 years ago

Are you running on CPU? I assumed you had a GPU available on that machine; otherwise it will take forever. I trained an English-Korean model and it took about 1 day to do 30 epochs. It's available here: https://github.com/Helsinki-NLP/Tatoeba-Challenge/tree/master/models/kor-eng

hthanhbmt commented 3 years ago

Yes, I'm running on a CPU with 2 cores. Thank you for your reply; otherwise I would have had to wait at least 30 days. Let me try to build it on a GPU.

hthanhbmt commented 3 years ago

I just checked Marian-NMT's documentation, and it seems to only support CUDA. I'm using an AMD GPU :(

mike-goitia commented 3 years ago

@hthanhbmt have you tried training on a SageMaker instance? You can select CUDA-supported instances with NVIDIA GPUs.

jorgtied commented 3 years ago

I just realized that something must be seriously wrong with the training data. The model I trained for English-Korean only produces nonsense. Do you have the same experience? I've started to investigate ...

jorgtied commented 3 years ago

The problem I encountered seems to be related to zero-width space characters. They appear frequently in the training data, but the test data does not seem to contain them. Would it be safe to ignore them and remove all of those characters?
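For reference, stripping zero-width characters from the data can be done with a small filter. This is only a sketch, not part of the OPUS-MT-train pipeline; the exact set of code points to remove is an assumption here (zero-width space plus a few related invisible characters):

```python
import re

# Zero-width space, zero-width non-joiner/joiner, word joiner, and
# the BOM/zero-width no-break space -- all invisible in most renderers
# but able to break tokenization and alignment.
ZERO_WIDTH = re.compile("[\u200b\u200c\u200d\u2060\ufeff]")

def strip_zero_width(line: str) -> str:
    """Remove zero-width characters from one line of training data."""
    return ZERO_WIDTH.sub("", line)

if __name__ == "__main__":
    sample = "\uc548\ub155\u200b\ud558\uc138\uc694"  # Korean with a hidden ZWSP
    print(strip_zero_width(sample))
```

Running this over both sides of the parallel corpus before training would make the training data consistent with zero-width-free test data.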

hthanhbmt commented 3 years ago

> @hthanhbmt have you tried training on a SageMaker instance? You can select CUDA-supported instances with NVIDIA GPUs.

Thank you. I don't have a computer with an NVIDIA GPU, but training on CPU takes too long. Maybe I'll rent a server with an NVIDIA GPU and retrain on it.

hthanhbmt commented 3 years ago

> The problem I encountered seems to be related to zero-width space characters. They appear frequently in the training data, but the test data does not seem to contain them. Would it be safe to ignore them and remove all of those characters?

Sorry, I don't have any experience with training models.

hthanhbmt commented 3 years ago

Can anyone tell me how many epochs it takes to finish training?

jorgtied commented 3 years ago

It really depends on the training data. But you don't have to wait for full convergence; you can also test a model that is not fully converged, after 10 epochs or so. I remember en-ko was somehow difficult to work with ...

hthanhbmt commented 1 year ago

> It really depends on the training data. But you don't have to wait for full convergence; you can also test a model that is not fully converged, after 10 epochs or so. I remember en-ko was somehow difficult to work with ...

Yes, I think eng-kor has some issues. Now I'm training eng-kor again, but I have some problems: