hplt-project / OpusTrainer

Curriculum training
https://pypi.org/project/opustrainer/
MIT License

Add a new stage starting in the training log #47

Open onadegibert opened 9 months ago

onadegibert commented 9 months ago

Hi there,

I am using OpusTrainer to train some multilingual models, with different stages that each focus on a specific language pair. I am also keeping the validation log files so that I can later plot the learning curves per language pair. I would expect the start of a new stage for a language pair to show up in the learning curves as a steeper curve for that pair, so I would like to link the stages to the model updates. However, when reading from stdin, Marian cannot give you epoch information. So how do you link epochs and updates? Basically, I would like to know from the training log when a new stage starts.

Thanks in advance! :)

jelmervdl commented 9 months ago

The trainer should already be telling you about when it switches to a new stage, even with the default logging: https://github.com/hplt-project/OpusTrainer/blob/06e404dd675bc4381da073f3548c7a525fb7a610/src/opustrainer/trainer.py#L703

How to link it to Marian’s progress is a bit trickier, I think. One way would be to match the timestamps of Marian’s log messages with those of the trainer’s. Or you could make Marian print its log to stderr so that it gets combined with the one from the trainer.
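The timestamp matching could be sketched roughly as below. This is only an illustration under assumptions: it supposes both logs prefix each line with a `[YYYY-MM-DD HH:MM:SS]` timestamp, that Marian's progress lines contain `Up. <number>`, and that the trainer's stage message contains the word "stage" followed by the stage name. The trainer line shape in particular is hypothetical, so adjust both regular expressions to what your logs actually contain. `first_update_per_stage` is a made-up helper name, not part of OpusTrainer.

```python
import re
from bisect import bisect_left
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
TS = r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\]"

# Assumed Marian line shape: "[2024-01-01 10:05:00] Ep. 1 : Up. 100 : ..."
MARIAN_RE = re.compile(TS + r".*?Up\. (\d+)")
# Hypothetical trainer line shape: "[2024-01-01 10:00:00] ... stage <name>"
STAGE_RE = re.compile(TS + r".*?stage (\S+)", re.IGNORECASE)


def parse(lines, pattern, convert=str):
    """Return (timestamp, value) pairs for every line matching the pattern."""
    out = []
    for line in lines:
        m = pattern.search(line)
        if m:
            out.append((datetime.strptime(m.group(1), FMT), convert(m.group(2))))
    return out


def first_update_per_stage(trainer_lines, marian_lines):
    """Map each stage name to the first Marian update logged at or after
    the moment the trainer reported starting that stage.

    Assumes both logs are in chronological order. Returns None for a
    stage when no later Marian update was logged.
    """
    stages = parse(trainer_lines, STAGE_RE)
    updates = parse(marian_lines, MARIAN_RE, int)
    times = [t for t, _ in updates]
    result = {}
    for started, name in stages:
        i = bisect_left(times, started)  # first update with time >= started
        result[name] = updates[i][1] if i < len(updates) else None
    return result
```

The update numbers this returns could then be drawn as vertical markers on the per-language-pair learning curves.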

There will be a severe lag, though. The trainer will have moved on to the next stage and filled Marian’s maxi-batch, and only once those sentences pass through the maxi-batch into the mini-batches will you start seeing a difference in the learning curve, I’d expect.

I don’t have a good idea how to improve upon this. Maybe @XapaJIaMnu or @jorgtied have some ideas?