gunthercox / ChatterBot

ChatterBot is a machine learning, conversational dialog engine for creating chat bots
https://chatterbot.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Chatterbot Ubuntu corpus training error #2031

Closed: Benji377 closed this issue 3 years ago

Benji377 commented 4 years ago

I copied and ran the example here on GitHub and waited for the download to finish, but it runs into this strange error:

[nltk_data] Downloading package stopwords to
[nltk_data] C:\Users\benbe\AppData\Roaming\nltk_data...
[nltk_data] Package stopwords is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data] C:\Users\benbe\AppData\Roaming\nltk_data...
[nltk_data] Package averaged_perceptron_tagger is already up-to-
[nltk_data] date!
Downloading http://cs.mcgill.ca/~jpineau/datasets/ubuntu-corpus-1.0/ubuntu_dialogs.tgz
[============== ]
Download location: C:\Users\benbe\ubuntu_data\ubuntu_dialogs.tgz
Extracting C:\Users\benbe\ubuntu_data\ubuntu_dialogs.tgz
Traceback (most recent call last):
  File "D:/Coding/PyProjects/Sociality/Sociality3/Test/test2/tester.py", line 17, in <module>
    trainer.train()
  File "C:\Users\benbe\.virtualenvs\benbe-ZCclz55H\lib\site-packages\chatterbot\trainers.py", line 335, in train
    self.extract(corpus_download_path)
  File "C:\Users\benbe\.virtualenvs\benbe-ZCclz55H\lib\site-packages\chatterbot\trainers.py", line 319, in extract
    tar.extractall(path=self.extracted_data_directory, members=track_progress(tar))
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\tarfile.py", line 2007, in extractall
    numeric_owner=numeric_owner)
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\tarfile.py", line 2049, in extract
    numeric_owner=numeric_owner)
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\tarfile.py", line 2119, in _extract_member
    self.makefile(tarinfo, targetpath)
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\tarfile.py", line 2168, in makefile
    copyfileobj(source, target, tarinfo.size, ReadError, bufsize)
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\tarfile.py", line 254, in copyfileobj
    buf = src.read(remainder)
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\gzip.py", line 276, in read
    return self._buffer.read(size)
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\_compression.py", line 68, in readinto
    data = self.read(len(byte_view))
  File "C:\Users\benbe\AppData\Local\Programs\Python\Python36\lib\gzip.py", line 482, in read
    raise EOFError("Compressed file ended before the "
EOFError: Compressed file ended before the end-of-stream marker was reached

Process finished with exit code 1

ArctoranDev commented 4 years ago

This error usually means the archive ends before it is supposed to, which most likely means the download is corrupted or incomplete, so you probably have to redownload it.
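If you want to confirm that the archive is truncated before downloading it again, a rough check along these lines might help. This is only a sketch using the standard library: it reads the gzip stream to the end and reports whether that succeeds. The helper names `gzip_is_complete` and `remove_if_incomplete` are made up for illustration and are not part of ChatterBot.

```python
import gzip
import os
import zlib


def gzip_is_complete(path, chunk_size=1024 * 1024):
    """Return True if the gzip file at `path` can be decompressed to the end.

    A truncated download raises EOFError while reading, which is the same
    error shown in the traceback above.
    """
    if os.path.getsize(path) == 0:
        # An empty file is not a usable archive either.
        return False
    try:
        with gzip.open(path, "rb") as stream:
            # Decompress in chunks and discard the data; we only care
            # whether the end-of-stream marker is reached without errors.
            while stream.read(chunk_size):
                pass
        return True
    except (EOFError, OSError, zlib.error):
        return False


def remove_if_incomplete(path):
    """Delete `path` if it exists and is not a complete gzip file.

    Returns True if the file was deleted, False otherwise.
    """
    if os.path.isfile(path) and not gzip_is_complete(path):
        os.remove(path)
        return True
    return False
```

In the log above the archive was saved to C:\Users\benbe\ubuntu_data\ubuntu_dialogs.tgz, so calling `remove_if_incomplete` on that path and then running the training script again should trigger a fresh download, assuming the trainer only downloads when the file is missing.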

SannanOfficial commented 3 years ago

@ArctoranDev Where should I download it from, and where should I put the downloaded file(s)?

Benji377 commented 3 years ago

@SannanOfficial You can download it using the instructions here: https://pypi.org/project/ChatterBot/

Benji377 commented 3 years ago

Oh, and I completely forgot to say that redownloading it solved my issue, so this can be closed.