turi-code / python-libffm

A Python wrapper for the libffm library.
BSD 3-Clause "New" or "Revised" License

Error 'ValueError: sample larger than population' when running the BPR() model #11

Open rheras opened 7 years ago

rheras commented 7 years ago

Hi all,

I've run libfm on Ubuntu using the dataset detailed below. The Random model runs OK; however, when I tried the rest of the models in my list (I followed the models_example.py example provided here), i.e. BPR, TFIDFModel, Popularity, TensorCoFi, ..., the error "ValueError: sample larger than population" is triggered with every model.

Does anyone know what could be the source of this problem? Any suggestions? There are many posts on the Internet about this error in Python, but I don't think the answers and potential causes described there apply to this case, so it is unclear to me. By the way, the dataset has more than 5K rows...

Thanks in advance, regards, R.

------------------ test:

    python modeltest2.py
       user  item  rating       time     title
    0  1123     0       2  838985046  NameFilm
    1  1107     0       1  838985046  NameFilm
    2  1107     0       1  838985046  NameFilm
    3  1107     0       2  838985046  NameFilm
    4  1107     1       1  838985046  NameFilm
    0:00:00.083082 Random [0.262394934911661]
    0:00:09.887563 BPR (dim=10,iter=15,reg=0.0001,eta=0.001)

    Traceback (most recent call last):
      File "modeltest2.py", line 57, in <module>
        print evaluator.evaluate_model(m, testing, all_items=items,)
      File "build/bdist.linux-x86_64/egg/testfm/evaluation/evaluator.py", line 83, in evaluate_model
      File "build/bdist.linux-x86_64/egg/testfm/evaluation/evaluator.py", line 30, in partial_measure
      File "/usr/lib/python2.7/random.py", line 321, in sample
        raise ValueError("sample larger than population")
    ValueError: sample larger than population
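The last frame of the traceback is the standard library's `random.sample`, which raises this error whenever it is asked for more distinct elements than the population holds. The number of rows in the dataset does not matter here; what matters is how many distinct candidates are passed in. The five-row preview shows only item ids 0 and 1, so one possibility (an assumption, not something the trace confirms) is that the evaluator tries to draw more items than the catalogue contains. A minimal sketch in Python 3 of the failure and of a capped draw; `sample_negatives` is a hypothetical helper, not part of testfm:

```python
import random


def sample_negatives(all_items, seen_items, k, rng=random):
    """Draw up to k items that are not in seen_items.

    Hypothetical helper (not testfm code): k is capped at the number of
    available candidates, so random.sample cannot raise ValueError.
    """
    candidates = sorted(set(all_items) - set(seen_items))
    return rng.sample(candidates, min(k, len(candidates)))


items = [0, 1]  # tiny catalogue, mirroring the two item ids in the preview

error_message = None
try:
    random.sample(items, 100)  # asks for 100 distinct items out of 2
except ValueError as exc:
    error_message = str(exc)

capped = sample_negatives(items, seen_items=[0], k=100)
print(error_message)
print(capped)
```

Counting the distinct values in the item column of the full file, for example with `len(set(...))`, would show whether the catalogue really is that small.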

rheras commented 7 years ago

BTW, running the example provided with the testfm source code (https://github.com/grafos-ml/test.fm/blob/master/src/testfm/examples/models_example.py) on the dataset data/movielenshead.dat also produces errors with some of the models, but those errors do not seem to be related to the error above...

Additionally, could anybody confirm what exactly the output values returned by the models (e.g. [0.2877098548938246]) mean? Where can I find this information? Are they related to MSE, EER, ...? There are no measures such as Precision or Recall here, right? Thanks!

----------- (python 2.7) test1:

    sysadmin@myUbuntuhost:~/testfm/test.fm-1.0/src/fm$ python modeltest1.py
    modeltest1.py:20: ParserWarning: Falling back to the 'python' engine because the 'c' engine does not support regex separators (separators > 1 char and different from '\s+' are interpreted as regex); you can avoid this warning by specifying engine='python'.
      sep="::", header=None, names=["user", "item", "rating", "date", "title"])
       user  item  rating       date                 title
    0     1   122     5.0  838985046      Boomerang (1992)
    1     1   185     5.0  838983525       Net, The (1995)
    2     1   231     5.0  838983392  Dumb & Dumber (1994)
    3     1   292     5.0  838983421       Outbreak (1995)
    4     1   316     5.0  838983392       Stargate (1994)
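The ParserWarning above states its own remedy: a multi-character separator such as `::` is interpreted as a regular expression, which only the Python parsing engine supports, so passing `engine='python'` explicitly avoids the fallback warning. A minimal sketch, assuming Python 3 and a recent pandas rather than the Python 2.7 setup in this report; the three in-memory rows stand in for data/movielenshead.dat, and the column names are copied from the call echoed in the warning:

```python
import io

import pandas as pd

# Stand-in for the "::"-separated file; rows copied from the preview above.
raw = (
    "1::122::5.0::838985046::Boomerang (1992)\n"
    "1::185::5.0::838983525::Net, The (1995)\n"
    "1::231::5.0::838983392::Dumb & Dumber (1994)\n"
)

df = pd.read_csv(
    io.StringIO(raw),
    sep="::",            # multi-character separator, treated as a regex
    header=None,
    names=["user", "item", "rating", "date", "title"],
    engine="python",     # chosen explicitly, so pandas does not have to fall back
)
print(df)
```

The warning is harmless either way; the explicit engine only makes the choice visible in the script.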

**0:00:00.142133 Random [0.2877098548938246]**

**0:00:19.330961 BPR (dim=10,iter=15,reg=0.0001,eta=0.001) [0.2886932217273792]**

    /home/sysadmin/.local/lib/python2.7/site-packages/testfm-1.0-py2.7-linux-x86_64.egg/testfm/models/content_based.py:110: FutureWarning: iget(i) is deprecated. Please use .iloc[i] or .iat[i]
    /home/sysadmin/.local/lib/python2.7/site-packages/testfm-1.0-py2.7-linux-x86_64.egg/testfm/models/content_based.py:155: RuntimeWarning: invalid value encountered in double_scalars
    /home/sysadmin/.local/lib/python2.7/site-packages/testfm-1.0-py2.7-linux-x86_64.egg/testfm/models/content_based.py:155: RuntimeWarning: invalid value encountered in divide

    0:00:01.142259 TF/IDF [0.22061728516487217]
    0:00:00.004290 Popularity [0.5390989431901517]
    Traceback (most recent call last):
      File "modeltest1.py", line 40, in <module>
        m.fit(training)
      File "src/testfm/models/cutil/interface.pyx", line 178, in testfm.models.cutil.interface.IFactorModel.fit (src/testfm/models/cutil/interface.c:3745)
      File "src/testfm/models/cutil/interface.pyx", line 103, in testfm.models.cutil.interface.IModel.fit (src/testfm/models/cutil/interface.c:3037)
      File "build/bdist.linux-x86_64/egg/testfm/models/tensorcofi.py", line 96, in train
      File "/usr/lib/python2.7/subprocess.py", line 710, in __init__
        errread, errwrite)
      File "/usr/lib/python2.7/subprocess.py", line 1327, in _execute_child
        raise child_exception
    OSError: [Errno 2] No such file or directory
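This second traceback ends inside `subprocess`, which suggests a different kind of failure from the sampling error: `[Errno 2] No such file or directory` raised while starting a child process usually means the executable being launched could not be found. The trace does not say which program tensorcofi.py tries to run, so the sketch below uses a deliberately nonexistent placeholder name. It is written for Python 3 under those assumptions; `run_if_available` is a hypothetical helper, not part of testfm:

```python
import errno
import shutil
import subprocess


def run_if_available(command):
    """Run command only if its executable can be found on PATH.

    Hypothetical helper: returns None instead of raising when the
    program named in command[0] is not installed.
    """
    if shutil.which(command[0]) is None:
        return None
    return subprocess.run(command, capture_output=True, text=True, check=False)


missing = "surely-not-installed-binary-xyz"  # placeholder, not a real tool

launch_errno = None
try:
    subprocess.Popen([missing])  # same kind of failure as in the traceback
except OSError as exc:           # FileNotFoundError on Python 3
    launch_errno = exc.errno

result = run_if_available([missing, "--version"])
print(launch_errno == errno.ENOENT, result)
```

Running `shutil.which` on the name of whatever external program the model depends on is a quick way to tell a missing dependency apart from a bug in the model itself.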