hplt-project / OpusPocus

Marian machine translation training pipeline for thousands of models
2 stars 0 forks source link

Add robustness towards pipeline.config change of an inited pipeline. #46

Open varisd opened 2 months ago

varisd commented 2 months ago

Right now, adjusting pipeline.config (generated during pipeline init) does not have any effect on the existing pipeline (serves only an informational purpose).

A feasible feature would be a support of editing pipeline.config. This would require:

What should be the default response when "inconsinstencies" are detected? Do we:

bhaddow commented 2 months ago

The third option would be the behaviour I would expect of a tool like this (eg make, snakemake). If a step has changed, then it should be re-run. If other steps are dependent on this (according to the DAG) then they should be rerun. If I fix the evaluation script, I really don't want the pipeline to re-run training.

Options 1 and 2 may even make the situation slightly worse. At the moment, if I make a change to the pipeline config, I have to delete and rerun. If I make a trivial change that I know will not affect the rerunning, then I can avoid the delete. If we implement options 1 or 2 then the system may force me to delete and rerun (even though I know that delete is not needed).

Connected with this issue, why do we have init and run as separate commands? I would think that run and dry-run would make more sense. dry-run is the same as run, except that it does not execute anything.

varisd commented 2 months ago

You make a good point about the rerunning of modified steps. In practice, codewise, it should not be more complicated, only inefficient part would be reinitializing of steps that have the change step as dependency (the graph has only unidirectional pointers at the moment). The usual graphs should usually be rather small so that should not be a big issue.

Regarding changing "init" to "dry-run" and having "run" command to do both initialization and running: I support this change since right now there is really no practical reason to "execute" the pipeline in two steps (init and run). The CLI mainly copied the internal implementation, where separating init and run kinda makes sense. I propose opening another issue regarding this change - it only requires some adjustments to the opuspocus_cli/ scripts (rest of the codebase can and should stay the same).