mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

Requirements files should be a key in the kind.yml, and they should be installed with a transform #588

Open gregtatum opened 5 months ago

gregtatum commented 5 months ago

This will help with our local testing config, and simplify our kind.yml files.

Something like:

diff --git a/taskcluster/kinds/evaluate-teacher-ensemble/kind.yml b/taskcluster/kinds/evaluate-teacher-ensemble/kind.yml
index cb945c5..0bace02 100644
--- a/taskcluster/kinds/evaluate-teacher-ensemble/kind.yml
+++ b/taskcluster/kinds/evaluate-teacher-ensemble/kind.yml
@@ -22,6 +22,7 @@ kind-dependencies:
 tasks:
     "{provider}-{dataset}-{src_locale}-{trg_locale}":
         description: teacher evaluation for {dataset} {src_locale}-{trg_locale}
+        python-requirements: $VCS_PATH/pipeline/eval/requirements/eval.txt
         attributes:
             stage: evaluate-teacher-ensemble
             dataset-category: test

gregtatum commented 2 months ago

This involves manipulating the task graph that is generated. The task graph is a JSON file that contains all of the information needed to run our pipeline. This is a good first bug if you want to figure out how to work with the taskgraph, though it's probably a bit more involved than a typical first bug.

You can generate the graph locally with:

task preflight-check -- --only task_group

Then open:

artifacts/full-task-graph.json
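
The generated file is large, so it can help to summarize it before reading it by hand. Here is a minimal sketch that groups task labels by kind. It assumes the JSON is a top-level object keyed by task label and that each entry has a "kind" field; the helper name labels_by_kind is made up for illustration and is not part of the repository.

```python
import json
from collections import defaultdict
from pathlib import Path


def labels_by_kind(graph_path):
    """Group task labels by their kind.

    Assumption: the file is a JSON object mapping task label -> task
    definition, and each definition carries a "kind" key.
    """
    graph = json.loads(Path(graph_path).read_text())
    grouped = defaultdict(list)
    for label, task in graph.items():
        # Fall back to a placeholder if an entry has no "kind" field.
        grouped[task.get("kind", "<unknown>")].append(label)
    return {kind: sorted(labels) for kind, labels in grouped.items()}


if __name__ == "__main__":
    path = Path("artifacts/full-task-graph.json")
    if path.exists():
        for kind, labels in sorted(labels_by_kind(path).items()):
            print(f"{kind}: {len(labels)} task(s)")
    else:
        print(f"{path} not found; generate the graph first")
```

Running this before and after a change gives a quick sanity check that no tasks were added or dropped by accident.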

You can also diff the taskgraph to see how the results have changed by first committing your work, and then running:

BASE_REV=main task taskgraph-diff

Each kind.yml file describes an individual task, and includes a command: section that specifies what is run.

For example:

https://github.com/mozilla/firefox-translations-training/blob/d5b94fe42263597325720b2e5f7b86915d080f3b/taskcluster/kinds/evaluate-teacher-ensemble/kind.yml#L84-L115

The work here would be to write a "transform" that would add the requirements installation to the command section. I would look through all the implementations of "command:" to get an idea of how it works.
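
As a rough starting point, the core of such a transform could look like the sketch below. It is written as a plain generator so it runs without taskgraph installed. It assumes that the command lives under run.command and is either a single string or a list whose last element is the shell script (for example bash, -c, script); the real kind.yml files may be shaped differently, so check them first. The function name add_python_requirements is hypothetical.

```python
def add_python_requirements(config, tasks):
    """Move a "python-requirements" key into the task's command.

    Assumptions (not confirmed against the repository):
    - the command is stored at task["run"]["command"];
    - it is a string, or a list whose last element is the shell script.
    """
    for task in tasks:
        # Remove the key so later schema validation does not see it.
        requirements = task.pop("python-requirements", None)
        if requirements:
            install = f"pip install -r {requirements} && "
            run = task.setdefault("run", {})
            command = run.get("command")
            if isinstance(command, str):
                run["command"] = install + command
            elif (
                isinstance(command, list)
                and command
                and isinstance(command[-1], str)
            ):
                # Prepend the install step to the script passed to the shell.
                run["command"] = command[:-1] + [install + command[-1]]
            else:
                raise ValueError(
                    "python-requirements needs a string or list command"
                )
        yield task
```

In taskgraph, a function with this (config, tasks) signature is normally registered on a TransformSequence from taskgraph.transforms.base with the @transforms.add decorator, and the module is then listed under transforms: in the kind.yml, the same way cast_to is wired up.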

A good example of a "transform" is the "cast_to" transform. The implementation is here:

https://github.com/mozilla/firefox-translations-training/blob/d5b94fe42263597325720b2e5f7b86915d080f3b/taskcluster/translations_taskgraph/transforms/cast_to.py

And it can be seen in use here:

https://github.com/mozilla/firefox-translations-training/blob/d5b94fe42263597325720b2e5f7b86915d080f3b/taskcluster/kinds/evaluate/kind.yml#L165

You can also refer to the taskgraph documentation: https://taskcluster-taskgraph.readthedocs.io/en/latest/