Open guidev opened 3 months ago
It seems to be enough to modify `sparse\bm25_tokenizer.py`, replacing `punkt` with `punkt_tab`.
```python
@staticmethod
def nltk_setup() -> None:
    try:
        nltk.data.find("tokenizers/punkt_tab")
    except LookupError:
        nltk.download("punkt_tab")
    try:
        nltk.data.find("corpora/stopwords")
    except LookupError:
        nltk.download("stopwords")
```
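The same check-then-download pattern could also be written table-driven, so that adding or renaming a resource is a one-line change. This is only a sketch, not the library's actual code: `ensure_resources` and `NLTK_RESOURCES` are hypothetical names introduced here, and the lookup and download functions are passed in so the helper can be exercised without NLTK or network access. The only NLTK calls used are `nltk.data.find` and `nltk.download`, the same ones as in the snippet above.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical table of (lookup path, download package) pairs,
# taken from the snippet above.
NLTK_RESOURCES: List[Tuple[str, str]] = [
    ("tokenizers/punkt_tab", "punkt_tab"),
    ("corpora/stopwords", "stopwords"),
]


def ensure_resources(
    find: Callable[[str], object],
    download: Callable[[str], object],
    resources: Iterable[Tuple[str, str]],
) -> List[str]:
    """Download every package whose lookup path raises LookupError.

    Returns the names of the packages that were downloaded.
    """
    downloaded: List[str] = []
    for path, package in resources:
        try:
            find(path)
        except LookupError:
            download(package)
            downloaded.append(package)
    return downloaded


def nltk_setup() -> None:
    # Imported lazily so ensure_resources stays usable without NLTK installed.
    import nltk

    ensure_resources(nltk.data.find, nltk.download, NLTK_RESOURCES)
```

Calling `nltk_setup()` once at start-up should behave like the method above: resources already present are left alone, missing ones are fetched.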
Is this a new bug?
Current Behavior
`nltk.download("punkt")` fails for nltk v3.9.x. Here's a full explanation: https://github.com/nltk/nltk/issues/3293
Expected Behavior
pinecone-text should work with the latest nltk version
Steps To Reproduce
https://github.com/nltk/nltk/issues/3293
Relevant log output
No response
Environment
Additional Context
No response