Closed SeekPoint closed 7 years ago
The files are linked from http://2015.recsyschallenge.com/challenge.html, so they can be downloaded and extracted with:
curl -Lo yoochoose-data.7z https://s3-eu-west-1.amazonaws.com/yc-rdata/yoochoose-data.7z
7z x yoochoose-data.7z
7-Zip [64] 9.20 Copyright (c) 1999-2010 Igor Pavlov 2010-11-18
p7zip Version 9.20 (locale=C,Utf16=off,HugeFiles=on,8 CPUs)
Processing archive: yoochoose-data.7z
Extracting yoochoose-buys.dat
Extracting yoochoose-clicks.dat
Extracting yoochoose-test.dat
Extracting dataset-README.txt
Everything is Ok
Files: 4
Size: 1914111754
Compressed: 287211932
The training files are yoochoose-clicks.dat and yoochoose-buys.dat, while yoochoose-test.dat is the test file.
Now, in the scripts, we have:
PATH_TO_TRAIN = '/path/to/rsc15_train_full.txt'
PATH_TO_TEST = '/path/to/rsc15_test.txt'
I'm not sure which of the files in the available dataset these training and test paths are supposed to point to. @hidasib
@hidasib OK, thank you. I was able to pre-process the dataset:
root@d842fc00a358:~/GRU4Rec/examples/rsc15# python preprocess.py
Full train set
Events: 31637239
Sessions: 7966257
Items: 37483
Test set
Events: 71222
Sessions: 15324
Items: 6751
Train set
Events: 31579006
Sessions: 7953885
Items: 37483
Validation set
Events: 58233
Sessions: 12372
Items: 6359
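For anyone wanting to sanity-check their own split against the numbers above, here is a minimal sketch of how such per-file counts (events, distinct sessions, distinct items) could be computed. It is not the code from preprocess.py. It assumes a tab-separated file with a header row, that the session column is named SessionId (an assumption; only ItemId appears in this thread), and it uses a tiny made-up sample instead of the real data. It is written for Python 3 and uses only the standard library.

```python
import csv
import io


def summarize(handle, session_col="SessionId", item_col="ItemId"):
    """Count rows, distinct sessions and distinct items in a tab-separated file.

    The column names are assumptions; adjust them to match your file's header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    events = 0
    sessions = set()
    items = set()
    for row in reader:
        events += 1
        sessions.add(row[session_col])
        items.add(row[item_col])
    return {"Events": events, "Sessions": len(sessions), "Items": len(items)}


# Made-up sample, only to show the expected shape of the input.
sample = io.StringIO(
    "SessionId\tItemId\tTime\n"
    "1\t10\t0.0\n"
    "1\t11\t1.0\n"
    "2\t10\t2.0\n"
)
stats = summarize(sample)
print(stats)
```

To run it on a real file, pass an open file handle instead of the in-memory sample, for example `summarize(open(path, newline=""))`, where `path` is whatever file you want to inspect.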
rzai@rzai00:~/prj/GRU4Rec/examples/rsc15$ python run_rsc15.py
Using gpu device 0: GeForce GTX 1080 (CNMeM is disabled, cuDNN 5105)
Traceback (most recent call last):
File "run_rsc15.py", line 20, in <module>
data = pd.read_csv(PATH_TO_TRAIN, sep='\t', dtype={'ItemId':np.int64})
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 470, in parser_f
return _read(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 246, in _read
parser = TextFileReader(filepath_or_buffer, kwds)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 562, in __init__
self._make_engine(self.engine)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 699, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/usr/local/lib/python2.7/dist-packages/pandas/io/parsers.py", line 1066, in __init__
self._reader = _parser.TextReader(src, **kwds)
File "pandas/parser.pyx", line 350, in pandas.parser.TextReader.__cinit__ (pandas/parser.c:3163)
File "pandas/parser.pyx", line 583, in pandas.parser.TextReader._setup_parser_source (pandas/parser.c:5779)
IOError: File /path/to/rsc15_train_full.txt does not exist
rzai@rzai00:~/prj/GRU4Rec/examples/rsc15$
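The IOError above shows that the script is still being run with the placeholder value '/path/to/rsc15_train_full.txt', so PATH_TO_TRAIN (and presumably PATH_TO_TEST) need to be edited to point at real files on your machine. As a rough sketch, a check like the following could be placed before the read_csv call to get a clearer message. The helper name `missing_paths` is hypothetical and not part of GRU4Rec.

```python
import os


def missing_paths(paths):
    """Return the entries of `paths` that are not existing regular files."""
    return [p for p in paths if not os.path.isfile(p)]


# Placeholder values as they appear in the script; replace them with the
# actual locations of your pre-processed files.
PATH_TO_TRAIN = '/path/to/rsc15_train_full.txt'
PATH_TO_TEST = '/path/to/rsc15_test.txt'

missing = missing_paths([PATH_TO_TRAIN, PATH_TO_TEST])
if missing:
    print("Edit the PATH_TO_* constants, these files do not exist: "
          + ", ".join(missing))
```

With the placeholders left unchanged, both paths are reported as missing, which matches the failure in the traceback.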