mozilla / translations

The code, training pipeline, and models that power Firefox Translations
https://mozilla.github.io/translations/
Mozilla Public License 2.0

clean corpus unable to fork sometimes on generic-worker/d2g #640

bhearsum closed this issue 4 months ago

bhearsum commented 6 months ago

For example:

[task 2024-05-24T00:19:51.583Z] ### Cleaning /builds/worker/fetches/NLLB_v1 with filter /builds/worker/artifacts/NLLB_v1.en-uk.filters.json
[task 2024-05-24T00:19:51.584Z] + opuscleaner-clean --parallel 32 --batch-size=50000 --input=- /builds/worker/artifacts/NLLB_v1.en-uk.filters.json en uk
[task 2024-05-24T00:19:51.584Z] + cut -f2
[task 2024-05-24T00:19:51.584Z] ++ zstdmt -dc /builds/worker/fetches/NLLB_v1.en.zst
[task 2024-05-24T00:19:51.584Z] + paste /dev/fd/63 /dev/fd/62
[task 2024-05-24T00:19:51.584Z] + zstdmt
[task 2024-05-24T00:19:51.584Z] + tee /dev/fd/63
[task 2024-05-24T00:19:51.584Z] ++ zstdmt -dc /builds/worker/fetches/NLLB_v1.uk.zst
[task 2024-05-24T00:19:51.585Z] ++ cut -f1
[task 2024-05-24T00:19:51.585Z] ++ zstdmt
[task 2024-05-24T00:19:56.663Z] [24/4:deescape-special-chars] Error: [Errno 11] Resource temporarily unavailable
[task 2024-05-24T00:19:56.663Z] [24/1:normalize_whitespace] Error: can't start new thread
[task 2024-05-24T00:19:56.664Z] [24/2:normalize_whitespace] Error: can't start new thread
[task 2024-05-24T00:19:56.670Z] [24/0:remove_empty_lines] Traceback (most recent call last):
[task 2024-05-24T00:19:56.670Z] [24/0:remove_empty_lines]   File "/builds/worker/.local/lib/python3.10/site-packages/opuscleaner/filters/./remove_empty_lines.py", line 16, in <module>
[task 2024-05-24T00:19:56.672Z] [24/0:remove_empty_lines]     main()
[task 2024-05-24T00:19:56.672Z] [24/0:remove_empty_lines]   File "/builds/worker/.local/lib/python3.10/site-packages/opuscleaner/filters/./remove_empty_lines.py", line 13, in main
[task 2024-05-24T00:19:56.672Z] [24/0:remove_empty_lines]     sys.stdout.write(line)
[task 2024-05-24T00:19:56.672Z] [24/0:remove_empty_lines] BrokenPipeError: [Errno 32] Broken pipe
[task 2024-05-24T00:19:56.787Z] [25/3:deescape-special-chars] Error: can't start new thread
[task 2024-05-24T00:19:56.787Z] [25/6:max_length] /bin/sh: 1: Cannot fork

(From https://firefox-ci-tc.services.mozilla.com/tasks/eYPAybjqQG27x-Y3SxespA)

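This looks like some sort of resource limit issue with podman: "[Errno 11] Resource temporarily unavailable", "can't start new thread", and "Cannot fork" all indicate that the task could not create new threads or processes. The exact cause is unclear to me so far.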
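
A minimal sketch for narrowing this down, assuming the task image has Python 3 and the container uses cgroup v2 mounted at /sys/fs/cgroup (both are assumptions, not something confirmed in this task). It prints the two limits that most commonly cap thread and process creation on Linux: the per-user RLIMIT_NPROC and the cgroup pids controller. The helper names `read_cgroup_value` and `collect_limits` are hypothetical and not part of the pipeline.

```python
import os
import resource
from pathlib import Path


def read_cgroup_value(path):
    """Return the stripped contents of a cgroup file, or None if it cannot be read."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def collect_limits(cgroup_root="/sys/fs/cgroup"):
    """Gather the limits that can make fork() or thread creation fail with EAGAIN.

    cgroup_root is an assumption: cgroup v2 normally exposes pids.max and
    pids.current for the container's own cgroup at this path.
    """
    # RLIMIT_NPROC caps the number of processes/threads for the current user.
    soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        "rlimit_nproc_soft": soft,
        "rlimit_nproc_hard": hard,
        # pids.max is either a number or the literal string "max" (no limit).
        "pids_max": read_cgroup_value(os.path.join(cgroup_root, "pids.max")),
        "pids_current": read_cgroup_value(os.path.join(cgroup_root, "pids.current")),
    }


if __name__ == "__main__":
    for name, value in collect_limits().items():
        print(f"{name}: {value}")
```

Running this at the start of the task, before opuscleaner-clean, would show whether either limit is low relative to the number of worker processes that --parallel 32 starts. A value of None means the cgroup file was not readable at the assumed path.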

bhearsum commented 4 months ago

Same root cause as #630.