explosion / spacy-stanza

💥 Use the latest Stanza (StanfordNLP) research models directly in spaCy
MIT License

User Warnings make parsing Late #52

Closed chaouiy closed 9 months ago

chaouiy commented 3 years ago

I am parsing a large corpus that takes days to index. It is an Arabic corpus, so I need spacy-stanza. I have noticed that it prints the following for each sentence I parse:

UserWarning: Can't set named entities because of multi-word token expansion or because the character offsets don't map to valid tokens produced by the Stanza tokenizer

This makes parsing a lot slower. I suggest removing these warnings.

adrianeboyd commented 3 years ago

Hi, you can use Python's warnings filters to control how these warnings are handled: https://docs.python.org/3/library/warnings.html#the-warnings-filter
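As a minimal sketch of what such a filter could look like: the snippet below ignores only the warning quoted above and leaves every other UserWarning visible. The helper name silence_ner_warning and the NER_WARNING constant are made up for illustration; the message prefix is copied from the warning text in this thread, and it assumes the warning is raised with the UserWarning category, as the printed output suggests.

```python
import warnings

# Prefix of the warning text quoted in this thread. filterwarnings() treats
# `message` as a regular expression matched against the start of the message.
NER_WARNING = r"Can't set named entities"


def silence_ner_warning():
    """Install a filter that drops only the NER alignment warning (hypothetical helper)."""
    warnings.filterwarnings("ignore", message=NER_WARNING, category=UserWarning)


# Call once, before building and running the spacy-stanza pipeline.
silence_ner_warning()
```

The filter has to be installed before the pipeline is run; other warnings are still reported as usual.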

isaac47 commented 3 years ago

Hi, I still have the same issue. How can I simply silence all UserWarnings?

polm commented 3 years ago

You can run your script with python -W ignore script.py to turn off all warnings (docs). This is standard Python behavior and not specific to spaCy.
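If changing the command line is not convenient, roughly the same effect can be had from inside the script with the standard warnings module. This is only a sketch: run_quietly and noisy_parse are hypothetical names, and noisy_parse is a stand-in for whatever call actually emits the warning.

```python
import warnings


def run_quietly(func, *args, **kwargs):
    """Call func with all warnings suppressed, then restore the previous filters."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return func(*args, **kwargs)


def noisy_parse(text):
    # Stand-in for a parser call that emits a UserWarning on every sentence.
    warnings.warn("Can't set named entities ...", UserWarning)
    return text.split()


tokens = run_quietly(noisy_parse, "a b c")
```

Scoping the filter with catch_warnings() keeps warnings from the rest of the program visible. Setting the PYTHONWARNINGS=ignore environment variable is another way to get the same behavior as -W ignore.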

adrianeboyd commented 9 months ago

Just going through some older issues...

It sounds like this was resolved, but please feel free to reopen if you're still running into issues!