EleutherAI / the-pile

MIT License

BookCorpus download not working #50

Closed: aveni closed this issue 3 years ago

aveni commented 3 years ago

When I try to download the BookCorpus dataset, the connection keeps getting closed, and wget eventually gives up:

Connecting to battle.shawwn.com (battle.shawwn.com)|2606:4700:3033::681b:80c6|:443... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 2404269430 (2.2G), 2402333974 (2.2G) remaining [application/gzip]
Saving to: ‘books1.tar.gz’

books1.tar.gz                0%[                                      ]   1.95M   221KB/s    in 0.5s

2020-09-25 11:34:47 (221 KB/s) - Connection closed at byte 2042976. Retrying.

--2020-09-25 11:34:57--  (try:20)  https://battle.shawwn.com/sdb/books1/books1.tar.gz
Connecting to battle.shawwn.com (battle.shawwn.com)|2606:4700:3033::681b:80c6|:443... connected.                                                                                                                   
HTTP request sent, awaiting response... 206 Partial Content
Length: 2404269430 (2.2G), 2402226454 (2.2G) remaining [application/gzip]
Saving to: ‘books1.tar.gz’

books1.tar.gz                0%[                                      ]   2.05M   222KB/s    in 0.5s

2020-09-25 11:34:58 (222 KB/s) - Connection closed at byte 2150496. Giving up.

Is anyone else having this problem?
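
For anyone reading the log above: the byte counts suggest the resume itself is working, and that each attempt only transfers a small, fixed amount before the connection drops. Below is a rough sketch of that arithmetic, using only numbers copied from the log. The helper names `resume_offset` and `range_header` are hypothetical, made up for illustration; they are not part of wget or the-pile.

```python
# Numbers copied from the wget output in this issue.
TOTAL = 2404269430  # "Length:" field: full size of books1.tar.gz in bytes


def resume_offset(total: int, remaining: int) -> int:
    """Byte offset at which a resumed (206 Partial Content) transfer starts."""
    return total - remaining


def range_header(offset: int) -> dict:
    """HTTP request header asking for everything from `offset` to the end."""
    return {"Range": f"bytes={offset}-"}


# "remaining" values from the two attempts shown in the log.
earlier_start = resume_offset(TOTAL, 2402333974)
try20_start = resume_offset(TOTAL, 2402226454)

# "Connection closed at byte ..." values from the same two attempts.
earlier_gain = 2042976 - earlier_start
try20_gain = 2150496 - try20_start

print(earlier_start, try20_start)  # offsets where each attempt resumed
print(earlier_gain, try20_gain)    # bytes transferred per attempt
print(range_header(try20_start))   # header a client would send to resume
```

Try 20 resumes exactly where the previous attempt was cut off (byte 2042976), and both attempts move the same 107520 bytes (105 KiB) before the connection closes. That pattern points at the server side rather than at a flaky client connection, though the log alone cannot confirm the cause.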

aveni commented 3 years ago

I messaged the host; it turns out the server was out of disk space. The download works fine now :)

https://twitter.com/theshawwn/status/1309667060176175105?s=20