coreylynch / pyFM

Factorization machines in python

921 stars, 311 forks

Incorrect format #24

Open asmitapoddar opened 7 years ago

asmitapoddar commented 7 years ago

I am trying to use libFM on the Frappe dataset. However, I get the following error when running the code:

Original exception was:
Traceback (most recent call last):
  File "fm.py", line 19, in <module>
    (train_data, y_train, train_users, train_items)=loadData("traindata.mat")
  File "fm.py", line 11, in loadData
    for line in f:
  File "/usr/lib/python3.5/codecs.py", line 321, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xda in position 133: invalid continuation byte

Is there a problem with the input format of the training and/or test datasets? Both are in .mat format.

bhavika commented 7 years ago

Hey,

It might be an issue with the input format. You might want to look at some of the solutions here: http://stackoverflow.com/questions/19699367/unicodedecodeerror-utf-8-codec-cant-decode-byte
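
To illustrate why the traceback points at the input format: a .mat file is binary, so iterating over it line by line as UTF-8 text can hit a byte sequence that is not valid UTF-8. The sketch below reproduces the same class of error with a few made-up bytes (the helper name `decode_error` and the sample bytes are assumptions for illustration, not taken from the Frappe file).

```python
def decode_error(data: bytes):
    """Return the UnicodeDecodeError raised by UTF-8 decoding, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc
    return None


# Hypothetical bytes standing in for binary file content: 0xda opens a
# two-byte UTF-8 sequence, but 0x41 ("A") is not a valid continuation byte.
binary_like = b"abc\xda\x41"
err = decode_error(binary_like)

# Plain tab-separated text decodes without any error.
text_err = decode_error("user_id\tapp_id\n".encode("utf-8"))

print(err)       # the decode error for the binary-like bytes
print(text_err)  # None
```

MATLAB .mat files are normally read with `scipy.io.loadmat`, which returns a dict of the variables stored in the file, rather than with a text loop. Which variable names cars2_frappe_datasplit.mat contains is not stated in this thread, so they would need to be inspected first.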

asmitapoddar commented 7 years ago

I switched to .txt files as input. I ran pyFM on the Frappe dataset, where the training and test sets contain the columns user_id, app_id, rating and context_id (the rating is 1 for every row), and I get an FM MSE of 0.0000. Does pyFM work on files with binary ratings?

bhavika commented 7 years ago

Does pyFM work on files with binary ratings?

It should. Do you have a link to the Frappe dataset?
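
One possible reading of the 0.0000 figure, offered as an assumption rather than a diagnosis: if every rating in both the training and test files is 1, a model only has to output the constant 1 to reach an MSE of exactly zero, so the score says nothing about ranking quality. A minimal sketch with made-up labels (the `mse` helper and all values are hypothetical):

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical test set in which every rating is 1, as described above.
y_test = [1.0] * 5
constant_preds = [1.0] * 5
mse_constant = mse(y_test, constant_preds)

# Hypothetical labels that also include zeros: the same constant
# prediction is no longer error-free.
mixed_y = [1.0, 0.0, 1.0, 0.0]
mse_mixed = mse(mixed_y, [1.0] * 4)

print(mse_constant, mse_mixed)
```

With only positive examples the target has no variance, so the error only becomes informative once the data also contains rows with a different label.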

asmitapoddar commented 7 years ago

Yes, this is the link: http://baltrunas.info/data/CARS2_code.zip. The file cars2_frappe_datasplit.mat contains the training, test and validation sets. Please let me know if you can identify the problem.

asmitapoddar commented 7 years ago

How should the variable preds look? Should it be a vector containing the recommended item_ids?
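
For reference, the "FM MSE" output mentioned earlier comes from comparing preds against the test ratings, which suggests preds holds one predicted score per test row rather than item ids. Under that assumption, a recommendation list could be derived by grouping the scores by user and ranking them. A minimal sketch (the helper `top_k_items` and the sample data are hypothetical, not part of pyFM):

```python
from collections import defaultdict


def top_k_items(users, items, preds, k=2):
    """Group per-row scores by user and return the k highest-scoring items."""
    scored = defaultdict(list)
    for user, item, score in zip(users, items, preds):
        scored[user].append((score, item))
    return {
        user: [item for _, item in sorted(pairs, key=lambda p: p[0], reverse=True)[:k]]
        for user, pairs in scored.items()
    }


# Hypothetical rows: one (user, item) pair per prediction.
users = ["u1", "u1", "u1", "u2", "u2"]
items = ["a", "b", "c", "a", "c"]
preds = [0.2, 0.9, 0.5, 0.7, 0.1]

recommendations = top_k_items(users, items, preds, k=2)
print(recommendations)
```

Here the rows passed in would be the candidate (user, item) pairs that were scored, in the same order as the entries of preds.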