-
NLTK version 3.8.2 changed the tokenizer data format from pickle files to text files to patch a vulnerability (CVE-2024-39705).
Here's the PR in the nltk repo:
https://github.com/nltk/n…
-
## Problem
Tests of the LangChain integration fail intermittently on scheduled runs: the failures started three weeks ago, the build went green for three runs in between, and then went red again.
See:…
-
### Is there an existing issue / discussion for this?
- [X] I have searched the existing issues / discussions
### Is there an existing ans…
-
Currently we use the following in our venv for testing:
```
python -m pip install textblob
python -m textblob.download_corpora
```
We should replace the venv hack and instead use the follow…
-
punkt is loaded as a pickle file, which is insecure (CVE-2024-39705), so you now have to use punkt_tab instead.
This breaks `_get_sentence_tokenizer`.
In order to use the Tokeniser class I had to overrid…
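
A minimal sketch of choosing between `punkt` and `punkt_tab`, assuming the cutoff is NLTK 3.8.2 as reported above. `punkt_resource_for` and `ensure_punkt` are hypothetical helper names, not part of NLTK; `nltk.data.find` and `nltk.download` are the existing NLTK calls.

```python
import re


def punkt_resource_for(nltk_version: str) -> str:
    """Return the Punkt data package name for a given NLTK version string.

    Assumption: releases from 3.8.2 onward expect "punkt_tab".
    """
    # Keep only the leading numeric components, e.g. "3.9.1" -> (3, 9, 1).
    parts = tuple(int(p) for p in re.findall(r"\d+", nltk_version)[:3])
    return "punkt_tab" if parts >= (3, 8, 2) else "punkt"


def ensure_punkt() -> str:
    """Download the matching Punkt data if it is not installed yet."""
    import nltk  # imported lazily so the helper above works without NLTK

    resource = punkt_resource_for(nltk.__version__)
    try:
        nltk.data.find(f"tokenizers/{resource}")
    except LookupError:
        # The resource is missing locally, so fetch it.
        nltk.download(resource)
    return resource
```

Calling `ensure_punkt()` once at startup, before any tokenizer is built, keeps the version check in one place.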
-
Hello there, I was looking at your NOTICE file and I see that nltk is vendored. If I'm not mistaken, that isn't true anymore since 0.9.0? If so, should the nltk license be removed from the NOTICE file…
-
[nltk_data] Error loading averaged_perceptron_tagger:
[nltk_data] Error loading cmudict:
Traceback (most recent call last):
File "/home/vipuser/anaconda3/envs/GPTSoVits/lib/python3.9/site-packa…
-
Hi,
I have been using NLTK version 3.8.1 for some time without any issues. However, after recently updating to version 3.9.1, I encountered an error when using the word_tokenize function. I would …
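
One possible workaround, sketched under the assumption that the error is the `LookupError` NLTK raises when tokenizer data is missing (the report above is cut off, so this is a guess). `tokenize_with_retry` is a hypothetical helper, and the tokenizer and download step are passed in as callables so the retry logic does not depend on NLTK itself.

```python
from typing import Callable, List


def tokenize_with_retry(
    text: str,
    tokenize: Callable[[str], List[str]],
    fetch_data: Callable[[], object],
) -> List[str]:
    """Call `tokenize`; on a LookupError, fetch the data and retry once."""
    try:
        return tokenize(text)
    except LookupError:
        # Assumed cause: the tokenizer data is not installed yet.
        fetch_data()
        return tokenize(text)
```

With NLTK installed, this could be called as `tokenize_with_retry(text, nltk.word_tokenize, lambda: nltk.download("punkt_tab"))`. A second failure is raised to the caller rather than retried again.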
-
### Description
On Linux Ubuntu
```
docker run -e GRADIO_SERVER_NAME=0.0.0.0 -e GRADIO_SERVER_PORT=7860 -p 7860:7860 -v $(pwd):/app -it --rm --name rag_app taprosoft/kotaemon:v1.0
Warning: Can…