Closed gregtatum closed 3 days ago
Yeah, I'll add a test. Originally I hadn't since it was just a refactor, but this now also adds functionality.
Looks good to me. FWIW, Ben is refactoring this mounts stuff in #546.
We'll see! I'm not sure if I will be refactoring that in advance, or following up later. In any case, this can land and I will deal with any rebasing needed in my patch(es).
Rebasing this may be hard, and I'm not planning on doing it in the short term. It's still valid, and I may re-open it in the future.
Edit: I added file overrides for this, to support vocabs and other file name mismatches. For instance, in `en-fi` the final model was not `best-chrf` but `best-perplexity`. This allows for working around problems in the config itself.

I was looking into how training continuation was working, and was confused by some of the misdirection in the code with iterators and dict comprehensions. I refactored the code a bit to understand how things were working. I added more validation with type-friendly dataclasses and enums. I also wrote a few more docs on what is going on.
I didn't end up finishing a test for this, as I didn't want to spend more time on it, but I manually checked `artifacts/full-task-graph.json` after running `task preflight-check`. The code produces mounts equivalent to the original code's. I also filed #542, as I realized that ensemble training wasn't actually working.