Closed cool-RR closed 4 years ago
Thanks! The _text.py is a vendorized module from pattern.en. Usually I'd suggest making the change upstream as well, but it seems that the pattern library isn't actively maintained. So I think it's OK for things to diverge here.
No other action is necessary other than adding yourself to AUTHORS.rst. Can you do that please?
I didn't notice that, thanks for checking.
I ran the tests and got an error on test_tokenize_with_multiple_punctuation, but I see the same error on the dev branch, so I'm guessing it's unrelated.
As far as I know we can move forward with this PR.
This is a faster and more idiomatic way of using itertools.chain. Instead of computing all the items in the iterable up front and storing them in memory, it computes them one by one and never stores them as a huge list. This can save both runtime and memory.
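For anyone reading along, here is a minimal sketch of the difference being described. It assumes the change swaps argument unpacking, chain(*iterable), for chain.from_iterable(iterable); the actual diff is not shown in this thread. The token_lists generator and the log lists are hypothetical names made up for illustration and are not part of _text.py.

```python
from itertools import chain


def token_lists(log):
    """Hypothetical generator: yields small lists and records each step in `log`."""
    for i in range(3):
        log.append(i)  # record that the i-th inner list was produced
        yield [i, i * 10]


# Eager: the * operator exhausts the generator to build the argument tuple
# before chain() hands back a single item.
eager_log = []
eager = chain(*token_lists(eager_log))
eager_steps_before_first_item = list(eager_log)

# Lazy: chain.from_iterable pulls one inner list at a time, on demand.
lazy_log = []
lazy = chain.from_iterable(token_lists(lazy_log))
first = next(lazy)
lazy_steps_after_first_item = list(lazy_log)

# Both approaches yield the same flattened sequence in the end.
eager_result = list(eager)
lazy_result = [first] + list(lazy)
```

In this sketch the unpacking version has already produced all three inner lists before the first item is requested, while the from_iterable version has produced only one.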