The only workaround we have for this is adding tasks between merge-corpus and its upstreams to avoid hitting this limit. We've done this before for the "all" tasks, but this will be a little different because we need to pull artifacts from the upstream. It's tractable, though.
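For illustration, the intermediate-task approach amounts to splitting the upstream list into groups under the limit and fronting each group with a pass-through task that re-publishes its group's artifacts. A minimal sketch, assuming taskgraph-style task dicts; the label scheme and dict shape here are hypothetical, not our actual transform code:

```python
MAX_DEPS = 99  # Taskcluster's historical per-task dependency limit

def insert_passthrough_tasks(upstream_labels, max_deps=MAX_DEPS):
    """Split a too-long upstream list into groups, each fronted by a
    dummy task that pulls and re-publishes its group's artifacts so
    merge-corpus only needs to depend on the dummy tasks."""
    groups = [
        upstream_labels[i : i + max_deps]
        for i in range(0, len(upstream_labels), max_deps)
    ]
    chunk_tasks = []
    for n, group in enumerate(groups):
        chunk_tasks.append({
            # hypothetical naming scheme for the intermediate tasks
            "label": f"merge-corpus-chunk-{n}",
            "dependencies": {label: label for label in group},
            # the task body would fetch each upstream artifact and
            # publish it again under this task's own artifacts
        })
    return chunk_tasks
```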
To this point, I'm just going to republish artifacts in the dummy tasks. I looked at the bicleaner tasks on one of the large recent training runs, and the artifacts totaled ~25GB at rest. That costs ~$0.40/month to store in GCP, so even if we had 100 runs of that size in a year, we're looking at ~$500/year to store them. We can revisit this decision at some point, but I don't think it's worth fussing with an alternate solution for finding artifacts of an indirect upstream at this time.
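The back-of-the-envelope arithmetic behind that estimate, assuming roughly $0.016/GB-month for GCP standard storage (the exact rate is an approximation):

```python
# Storage cost estimate for republished artifacts.
gb_per_run = 25
rate_per_gb_month = 0.016          # assumed GCP standard storage rate
per_run_monthly = gb_per_run * rate_per_gb_month   # ~$0.40/month
runs_per_year = 100
annual = per_run_monthly * runs_per_year * 12      # ~$480/year, upper bound
print(f"${per_run_monthly:.2f}/month per run, ~${annual:.0f}/year for {runs_per_year} runs")
```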
Another solution for this could be to chunk dataset tasks together. If we managed to chunk them by size, we could avoid increasing the end-to-end runtime as well (see the sketch below). This might not be great for caching purposes, but I wanted to mention it here for completeness.
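If we went this route, one simple way to keep chunks balanced is longest-processing-time-first scheduling: sort datasets by size descending and always assign the next one to the currently smallest chunk. A minimal sketch, where the dataset names and size metadata are placeholders:

```python
import heapq

def balance_datasets(datasets, num_chunks):
    """Group (name, size_bytes) pairs into num_chunks chunks of roughly
    equal total size, so the slowest chunk doesn't dominate runtime."""
    # heap entries: (total size so far, chunk index, member names)
    heap = [(0, i, []) for i in range(num_chunks)]
    heapq.heapify(heap)
    for name, size in sorted(datasets, key=lambda d: d[1], reverse=True):
        total, i, members = heapq.heappop(heap)
        members.append(name)
        heapq.heappush(heap, (total + size, i, members))
    return [members for _, _, members in sorted(heap, key=lambda c: c[1])]

# Example with made-up dataset sizes:
# balance_datasets([("opus-a", 900), ("opus-b", 500), ("nllb", 400)], 2)
# -> [["opus-a"], ["opus-b", "nllb"]]
```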
It turns out that the current limit is likely not a hard limit these days. We're working on removing or greatly increasing this limit in Taskcluster in https://github.com/taskcluster/taskcluster/issues/7151.
The Firefox CI cluster now supports up to 10,000 dependencies :partying_face:. We'll still need a taskgraph change to allow for it.
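For context, the error in the log excerpt below comes from a guard along these lines during graph generation; the constant name and function here are illustrative, not taskgraph's exact code, and the change would amount to raising (or removing) the ceiling:

```python
MAX_DEPENDENCIES = 99  # historical ceiling; could now be raised toward 10,000

def verify_dependency_count(task: dict) -> None:
    # Refuse to generate a task whose dependency list exceeds the
    # ceiling, producing the kind of error seen in the log below.
    deps = task.get("dependencies", [])
    if len(deps) > MAX_DEPENDENCIES:
        raise Exception(
            f"task {task['label']} has too many dependencies "
            f"({len(deps)} > {MAX_DEPENDENCIES})"
        )
```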
https://firefox-ci-tc.services.mozilla.com/tasks/ZqlokLMTQG-pZPJtK9UnOw/runs/0/logs/public/logs/live.log
Exception: task merge-corpus/merge-corpus-da-en has too many dependencies (105 > 99)