vectara / vectara-ingest

An open source framework to crawl data sources and ingest into Vectara
https://vectara.com
Apache License 2.0

Resource punkt not found #7

Closed sunddytwo closed 9 months ago

sunddytwo commented 1 year ago

Can somebody help? Below is the error message:

[nltk_data] Error loading punkt: <urlopen error [Errno 99] Cannot
[nltk_data]     assign requested address>
[nltk_data] Error loading averaged_perceptron_tagger: <urlopen error
[nltk_data]     [Errno 99] Cannot assign requested address>
2023-06-16 12:29:51,871 INFO Starting crawl of type rss...
2023-06-16 12:29:54,573 INFO Found 217 URLs to index from the last 365 days (pg)
2023-06-16 12:30:20,804 INFO Reading document from string ...
2023-06-16 12:30:20,819 INFO Reading document ...
2023-06-16 12:30:20,835 ERROR Error while indexing http://www.paulgraham.com/getideas.html:


Resource punkt not found. Please use the NLTK Downloader to obtain the resource:

import nltk
nltk.download('punkt')

For more information see: https://www.nltk.org/data.html

Attempted to load tokenizers/punkt/PY3/english.pickle

Searched in:

ofermend commented 1 year ago

This should be working okay now - are you seeing this issue still with the latest version of vectara-ingest?

ofermend commented 9 months ago

Closing due to inactivity; this should be fixed by now.
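For anyone who lands here with the same error: the log above shows the NLTK download itself failing ([Errno 99] Cannot assign requested address), so punkt was never written to disk. One possible workaround, not necessarily the fix that went into vectara-ingest, is to check for the resources and fetch them once from an environment that has network access (for example, while building the image). The helper name `ensure_nltk_resource` and its `find`/`download` parameters are hypothetical and only there for illustration; `nltk.data.find` and `nltk.download` are the real NLTK calls it wraps.

```python
from typing import Callable, Optional


def ensure_nltk_resource(
    resource_path: str,
    package: str,
    find: Optional[Callable[[str], object]] = None,
    download: Optional[Callable[[str], bool]] = None,
) -> bool:
    """Return True if the resource is on disk, trying a download if it is not.

    `find` and `download` are hypothetical hooks (assumptions for this sketch)
    so the logic can be exercised without NLTK or a network connection.
    """
    if find is None or download is None:
        import nltk  # imported lazily; only needed when the hooks are not supplied

        find = find or nltk.data.find
        download = download or (lambda pkg: nltk.download(pkg, quiet=True))

    try:
        # nltk.data.find raises LookupError when the resource is missing,
        # which is the "Resource punkt not found" error from the report.
        find(resource_path)
        return True
    except LookupError:
        # Not on disk: attempt the download. This needs working network
        # access; nltk.download returns False when the fetch fails.
        return bool(download(package))


# Example usage (requires NLTK to be installed):
#   ensure_nltk_resource("tokenizers/punkt", "punkt")
#   ensure_nltk_resource("taggers/averaged_perceptron_tagger",
#                        "averaged_perceptron_tagger")
```

If the helper returns False, the download failed again, and the underlying network problem has to be resolved (or the data copied in manually) before the crawl can tokenize documents.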