SparkJiao opened this issue 4 years ago (status: Open)
I just checked and the link works on my side. Can you double-check it?
@intersun Hi, thanks for your reply. The link does work, and I could download `keys-full.tar`. But I have encountered other problems.
- I think the path for saving `keys-full.tar` is wrong. In the makefile, it's saved under `./reddit_extractor/`, but the make command looks for it under `./reddit_extractor/data/`.
- I moved `keys-full.tar` to the directory `./reddit_extractor/data/`, commented out the `wget` command, and re-ran demo.py, and I got the following error report. Is this because the `keys-full.tar` file was damaged during download, or is there another reason?
11/05/2019 22:20:46 - INFO - __main__ - Downloading and Extracting Data...
make: *** [data/reddit/RC_2011-02.bz2] Error 4
make: *** Waiting for unfinished jobs....
11/06/2019 01:46:10 - INFO - __main__ - Preparing Data...
prepro.py --corpus ./data/train.tsv --max_seq_len 128
11/06/2019 01:48:21 - INFO - __main__ - Done!
11/06/2019 01:48:21 - INFO - __main__ - Generating training CMD!
Besides, the file `./data/train.tsv` doesn't exist.
Thank you very much for your help!
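One way to rule out a damaged download, as asked above, is to read the archive from start to end. The following is only a minimal sketch, assuming `keys-full.tar` is a plain tar archive; `tar_is_readable` is a hypothetical helper written for illustration and is not part of the repository. It catches truncated or structurally broken archives, but it cannot detect silently altered bytes inside a member.

```python
import tarfile
from pathlib import Path


def tar_is_readable(path):
    """Return True if every regular file in the tar archive can be read to the end.

    Hypothetical helper for illustration; not part of the repository.
    """
    path = Path(path)
    # A missing file or a file without a valid first tar header fails early.
    if not path.is_file() or not tarfile.is_tarfile(str(path)):
        return False
    try:
        with tarfile.open(str(path)) as archive:
            for member in archive:
                if not member.isfile():
                    continue
                handle = archive.extractfile(member)
                # Read in 1 MiB chunks so large members do not fill memory.
                while handle.read(1 << 20):
                    pass
    except (tarfile.TarError, OSError, EOFError):
        # Raised when the archive is truncated or otherwise unreadable.
        return False
    return True
```

For example, `tar_is_readable("./reddit_extractor/data/keys-full.tar")` would be the call for the location mentioned in this comment. If it returns `True`, a damaged archive is less likely to be the cause of the make error.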
I had a similar problem, but it appears to make progress after re-cloning the repository. I think the process does not like running `--data full` after `--data small`.
I have the same problem here (the RC_2011-02.bz2 error), although I am using the latest version of the repository. Did you solve this problem?
Hi, many thanks for your contribution!
I tried to use `python demo.py --data full` to download the Reddit data. Since I don't want to train the model right now, I didn't use Docker. I found that the data is linked here: https://convaisharables.blob.core.windows.net/lsp/keys-full.tar
It seems that I can't open it, even with a proxy. Do you have any other link to the Reddit data? Sorry to bother you. Thank you very much!