Open eu9ene opened 5 months ago
This task exited with 3, which is simply being re-reported at the end of the log.
I'm pretty sure the problem is this error, which happens 3 times:
[task 2024-05-24T17:04:06.806Z] 2024-05-24 17:04:01,834 - WARNING - Downloading FastText model...
[task 2024-05-24T17:04:06.806Z] Traceback (most recent call last):
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/bin/bicleaner-ai-classify", line 8, in <module>
[task 2024-05-24T17:04:06.806Z] sys.exit(main())
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/bicleaner_ai/bicleaner_ai_classifier.py", line 119, in main
[task 2024-05-24T17:04:06.806Z] perform_classification(args) # Main loop
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/bicleaner_ai/bicleaner_ai_classifier.py", line 108, in perform_classification
[task 2024-05-24T17:04:06.806Z] nline = classify(args, args.input, args.output)
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/bicleaner_ai/classify.py", line 220, in classify
[task 2024-05-24T17:04:06.806Z] hardrules = Hardrules(args)
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/hardrules/hardrules.py", line 115, in __init__
[task 2024-05-24T17:04:06.806Z] self.fastspell_src = FastSpell(args.source_lang, mode="aggr")
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/fastspell/fastspell.py", line 76, in __init__
[task 2024-05-24T17:04:06.806Z] self.download_fasttext()
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/fastspell/fastspell.py", line 93, in download_fasttext
[task 2024-05-24T17:04:06.806Z] self.model = fasttext.load_model(ft_model_path)
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/fasttext/FastText.py", line 441, in load_model
[task 2024-05-24T17:04:06.806Z] return _FastText(model_path=path)
[task 2024-05-24T17:04:06.806Z] File "/home/ubuntu/.local/lib/python3.10/site-packages/fasttext/FastText.py", line 98, in __init__
[task 2024-05-24T17:04:06.806Z] self.f.loadModel(model_path)
[task 2024-05-24T17:04:06.806Z] ValueError: /home/ubuntu/.local/lib/python3.10/site-packages/fastspell/lid.176.bin has wrong file format!
That script is run through parallel, which exits as follows:
1-100 Some of the jobs failed. The exit status gives the number of failed jobs. If Y% is used the exit status is the percentage of jobs that failed.
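To make the link between the exit code and the three tracebacks explicit, here is a minimal sketch of how a parallel exit status could be interpreted. The function name is hypothetical. The 1-100 range comes from the text quoted above; the meanings of 0 and 101 are taken from my reading of the GNU parallel man page and are not quoted in this thread, and anything else is lumped together as "other".

```python
def describe_parallel_exit(status: int) -> str:
    """Return a short description of a GNU parallel exit status.

    Hypothetical helper: only the 1-100 range is quoted in the thread;
    0 and 101 are assumptions based on the GNU parallel man page.
    """
    if status == 0:
        return "all jobs succeeded"
    if 1 <= status <= 100:
        # Without a percentage-based --halt, this is the number of failed jobs.
        return f"{status} job(s) failed"
    if status == 101:
        return "more than 100 jobs failed"
    return "other error (parallel itself failed or was interrupted)"


print(describe_parallel_exit(3))
```

Read this way, an exit status of 3 matches three jobs hitting the FastText load error.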
Sometimes fastText fails to download the model. I ran into this issue in OpusCleaner and fixed it by pre-downloading the model: https://github.com/mozilla/firefox-translations-training/blob/6b6b64999edee0e5bb5822bfb6d6e9d6a4e6c94f/pipeline/clean/opuscleaner/clean-corpus.sh#L34
If this helps, you can run the fastspell-download command during installation, and that will download the model to the Python path.
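A pre-download step could also verify the file before the classifier starts, so a truncated or corrupt lid.176.bin is caught early instead of surfacing as "has wrong file format!" inside a parallel job. The sketch below is an assumption-heavy illustration, not how fastspell or the pipeline does it: the helper names are hypothetical, the magic number and version constants are recalled from the fastText C++ source rather than taken from this thread, and it assumes that fastspell-download takes no arguments and re-fetches the model once the bad file has been removed.

```python
import struct
import subprocess
from pathlib import Path
from typing import Callable

# Assumed values, recalled from the fastText C++ source (not from this thread).
FASTTEXT_MAGIC = 793712314
FASTTEXT_MAX_VERSION = 12


def looks_like_fasttext_model(path: Path) -> bool:
    """Cheap header check: two little-endian int32 values (magic, version)."""
    try:
        with open(path, "rb") as fh:
            header = fh.read(8)
    except OSError:
        return False
    if len(header) < 8:
        return False  # missing or truncated download
    magic, version = struct.unpack("<ii", header)
    return magic == FASTTEXT_MAGIC and version <= FASTTEXT_MAX_VERSION


def run_fastspell_download() -> None:
    """Invoke the fastspell-download command mentioned above (assumed no arguments)."""
    subprocess.run(["fastspell-download"], check=True)


def ensure_model(
    path: Path,
    download: Callable[[], None] = run_fastspell_download,
    attempts: int = 3,
) -> bool:
    """Retry the download until the file passes the header check."""
    for _ in range(attempts):
        if looks_like_fasttext_model(path):
            return True
        # Remove a partial or corrupt file so the next download starts clean.
        path.unlink(missing_ok=True)
        try:
            download()
        except (OSError, subprocess.CalledProcessError):
            continue
    return looks_like_fasttext_model(path)
```

In the failing task the model path would be the one shown in the traceback, for example `ensure_model(Path("/home/ubuntu/.local/lib/python3.10/site-packages/fastspell/lid.176.bin"))`. The header check only catches obviously bad files; it does not prove the whole model downloaded correctly.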
https://firefox-ci-tc.services.mozilla.com/tasks/IVGFh-gRSaOmOGjkaJGqsw/runs/0/logs/public/logs/live.log