Open ivyleavedtoadflax opened 3 years ago
Thanks for the report, I can reproduce the behavior where it hangs.
As a workaround, I think it works if you wrap tqdm around the texts rather than around zip:
    for i, doc in zip(ids, nlp.pipe(tqdm(texts), n_process=2)):
        print(doc)
ah nice, thanks @adrianeboyd
As a note, I've marked this as a bug because it shouldn't hang like this, but since there's an easy workaround it's going to be pretty low priority for us to fix.
Maybe some of the changes related to error handling have caused this? I'm not sure. In any case, it's better to use tqdm on something with a length rather than a generator.
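To illustrate the point about lengths: a progress bar can only show a total if it can call len() on the object it wraps. The sketch below uses only the standard library. known_total is a hypothetical helper written for this example (it is not part of tqdm or spaCy), and the texts and ids values are made up.

```python
def known_total(iterable):
    """Hypothetical helper: return len(iterable) if it has one, else None."""
    try:
        return len(iterable)
    except TypeError:
        # zip objects and generators do not support len()
        return None


# Made-up sample data for illustration only
texts = ["This is a sentence.", "This is another one."]
ids = [1, 2]

print(known_total(texts))                 # 2
print(known_total(zip(ids, texts)))       # None
print(known_total(t for t in texts))      # None
```

Only the plain list reports a length, which is one reason wrapping texts directly is the friendlier option for a progress bar.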
As a note, we've seen that tqdm can run into deadlocks when errors are raised during the loop. With python 3.12 you can also see the new deprecation warning related to fork and threading: https://discuss.python.org/t/concerns-regarding-deprecation-of-fork-with-alive-threads/33555
The spacy test suite would hang on all OSes with python 3.12 prior to 467c82439. (This commit is just a workaround for the test suite / common use cases. It doesn't fix the underlying issue with deadlocks and tqdm.)
This is a pretty specific set of circumstances that I discovered today, but on the off chance that it is useful to someone, here it is.
If you include the output of nlp.pipe(..., n_process>1) in a zip() within tqdm(), it will hang interminably. See below.
How to reproduce the behaviour
Output:
Your Environment
Info about spaCy