Closed gregtatum closed 3 days ago
Yeah, I'll add a test. Originally I hadn't since it was just a refactor, but this now also adds functionality.
Looks good to me. FWIW, Ben is refactoring this mounts stuff in #546.
We'll see! I'm not sure if I will be refactoring that in advance, or following up later. In any case, this can land and I will deal with any rebasing needed in my patch(es).
Rebasing this may be hard, and I'm not planning on doing it in the short term. It's still valid, and I may re-open it in the future.
Edit: I added file overrides for this, to support vocabs and other file name mismatches. For instance, in `en-fi` the final model was not `best-chrf` but `best-perplexity`. This allows for working around problems in the config itself.

I was looking into how training continuation was working, and was confused by some of the misdirection in the code with iterators and dict comprehensions. I refactored the code a bit to understand how things were working. I added more validation with type-friendly dataclasses and enums. I also wrote a few more docs on what is going on.
I didn't end up finishing a test for this, as I didn't want to spend more time on it, but I manually checked `artifacts/full-task-graph.json` after running `task preflight-check`. The code produces mounts equivalent to the original code's. I also filed #542, as I realized that ensemble training wasn't actually working.