Sorry for the delay, I've been unable to check GitHub these last couple of days.
Regarding model.prediction and the output: all the output that you see from pywFM is the one that you would see from using libFM. Regarding the variables that pywFM outputs, here is a rundown (a short sketch follows at the end of this reply):
- predictions: taken from the -out file option from libFM. I do some processing just to convert this file into an array.
- global_bias, weights, pairwise_interactions: these 3 are taken from the model file that libFM produces if you pass the -save_model flag (more info here: https://github.com/srendle/libfm/commit/19db0d1e36490290dadb530a56a5ae314b68da5d). I do some processing to split the 3 outputs (given by the same file) into 3 variables.
- rlog: taken from the csv produced by libFM, and loaded as a pandas DataFrame.
Does this answer your question?
Have you tried running libFM (without the wrapper) to see if the results differ from the ones with pywFM? Which data did you use to get the 8.13 value?
Thank you for the kind words on the Kaggle thread.
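To make the rundown concrete, a minimal sketch (toy data only; the exact values are meaningless, and the model-file fields may come back empty depending on the learning method, since libFM only writes -save_model output for some solvers):

```python
import numpy as np
import pywFM

# tiny toy feature matrix and binary target, just to show where each output lands
features = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1],
                     [1, 1, 0], [0, 1, 1], [1, 0, 1]])
target = np.array([1, 0, 0, 1, 0, 1])

fm = pywFM.FM(task='classification', num_iter=5)
model = fm.run(x_train=features[:4], y_train=target[:4],
               x_test=features[4:], y_test=target[4:])

print(model.predictions)            # parsed from libFM's -out file
print(model.global_bias)            # w0, from the -save_model file (if written)
print(model.weights)                # unary weights w, same file
print(model.pairwise_interactions)  # pairwise factors V, same file
print(model.rlog)                   # libFM's rlog csv, loaded as a pandas DataFrame
```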
Thanks for the reply
After writing, I did test using libFM on the CLI alone and had the same problem.
Basically the -out isn't producing predictions that relate to the test(LL) as I would expect.
I'm 100% sure this is a user error on i/o usage as I don't think libFM is broken :)
I'll continue to investigate, FYI. I also opened a thread on this in the libFM Google group.
I saw that thread on the libFM user group, that's why I asked if you had compared with libFM alone (not with pywFM).
Which data are you using to produce the 8.13 value? Could you post the output from that run? You are saying that you used "test predictions against the test label". Shouldn't you be using train data against predictions?
Remember that each time you run libFM you are running a new model. There is a way to use the same model on a new prediction set, but I haven't done that. Is that what you are looking for?
Like a fool, I didn't set the seed in the script, so reproducing the results isn't easy (I need it in the train_test_split of the data). However, simply downloading the data and running the script will highlight the problem (even if the results are not identical).
In this case I am trying to produce the predictions that give rise to the test(LL) while the model is training. In theory (unless I am very much mistaken) model.prediction is, as you've stated, the same as the -out flag from the CLI. Therefore, as we passed both train and test to libFM, the output predictions should be on the test set that was supplied.
So in theory, if I run a logloss on the predictions and the labels from the test set, I should get the same logloss value as produced in the printed output. This is the crux of the problem: I don't get anything like a close match. (Running in standalone mode gives rise to the same problem, i.e. it's not actually a pywFM issue.)
But the test(LL) is specific to the train/test data you are working with at that moment. To my knowledge, logloss is just an error measure between the predictions and the real values. Even if you train a model down to a 0.5 logloss error, scoring it against a test set whose label distribution is skewed relative to the training data (say the train data is skewed towards false values, while the test labels are skewed towards true values) can give you a much higher logloss value.
Does this help at all?
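As a rough illustration of that point (numbers invented just for this sketch), logloss blows up when confident predictions point the other way from the labels:

```python
import numpy as np
from sklearn.metrics import log_loss

# test labels skewed towards the positive class
y_test = np.array([1, 1, 1, 1, 0])

# predictions leaning towards the negative class (mismatched with the labels)
p_mismatched = np.array([0.20, 0.30, 0.25, 0.10, 0.15])
# predictions leaning the same way as the labels
p_matched = np.array([0.80, 0.70, 0.75, 0.90, 0.15])

print(log_loss(y_test, p_mismatched))  # large: confidently wrong predictions are punished hard
print(log_loss(y_test, p_matched))     # small: confident and mostly right
```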
You are correct in the way logloss works; you've actually encapsulated the problem I'm trying to solve in your first sentence:
"the test(LL) is specific to the train/test data you are working with at that moment."
This is exactly what I want to reproduce: the final output test(LL) from the model that has been built and applied to the test set that was given. I.e. running the simple example: using model.prediction via pywFM, or -out via the CLI with libFM, should provide me with the predicted probabilities (between 0 and 1, which it does) that the libFM model that was just built has predicted for the test set provided.
It is the final stage of using these probabilities and the labels (y_test) that doesn't return the same result (0.515385 in this case).
Does this make sense?
But are we talking about using the same data (train and test) giving different values? Did you change the train or the test?
I've added random_state to get reproducible results for the data (downloadable from Kaggle) so you can see the issue.
```python
import pandas as pd
import pywFM  # using the python wrapper https://github.com/jfloff/pywFM
from sklearn.metrics import log_loss
from sklearn.cross_validation import train_test_split

random_seed = 1234

print('Load data...')
train = pd.read_csv("./input/train.csv")
target = train['target'].values
train = train.drop(['ID', 'target'], axis=1)
test = pd.read_csv("./input/test.csv")
id_test = test['ID'].values
test = test.drop(['ID'], axis=1)

print('Clearing...')
for (train_name, train_series), (test_name, test_series) in zip(train.iteritems(), test.iteritems()):
    if train_series.dtype == 'O':
        # for objects: factorize
        train[train_name], tmp_indexer = pd.factorize(train[train_name])
        test[test_name] = tmp_indexer.get_indexer(test[test_name])
        # but now we have -1 values (NaN)
    else:
        # for int or float: fill NaN
        tmp_len = len(train[train_series.isnull()])
        if tmp_len > 0:
            # print "mean", train_series.mean()
            train.loc[train_series.isnull(), train_name] = -9999
        # and test
        tmp_len = len(test[test_series.isnull()])
        if tmp_len > 0:
            test.loc[test_series.isnull(), test_name] = -9999

xtrain, xtest, ytrain, ytest = train_test_split(train, target,
                                                train_size=0.9,
                                                random_state=random_seed)

clf = pywFM.FM(task='classification',
               num_iter=10,
               init_stdev=0.1,
               k2=5,
               learning_method='mcmc',
               verbose=False,
               silent=False)

model = clf.run(x_train=xtrain, y_train=ytrain, x_test=xtest, y_test=ytest)

log_loss(ytest, model.predictions, eps=1e-15)
```
```
Loading train... has x = 0 has xt = 1 num_rows=102888 num_values=12582078 num_features=131 min_target=0 max_target=1
Loading test... has x = 0 has xt = 1 num_rows=11433 num_values=1397626 num_features=131 min_target=0 max_target=1
Loading meta data...
logging to /var/folders/44/q92fcr8n26gc377b_x_4g85m0000gp/T/tmp_jMdqT
Writing FM model...

Out[59]: logloss = 7.3850759798108481
```
So in this reproducible example 0.46 != 7.385
I guess the key part here is why the line `log_loss(ytest, model.predictions, eps=1e-15)` doesn't equal the final Test(ll).
I haven't used sklearn's log_loss, but from the documentation it appears that the predictions need to be in a specific format:
y_pred : array-like of float, shape = (n_samples, n_classes)
Predicted probabilities, as returned by a classifier’s predict_proba method.
(...)
>>> log_loss(["spam", "ham", "ham", "spam"], [[.1, .9], [.9, .1], [.8, .2], [.35, .65]])
0.21616...
Yes, that's correct.
If we pretend there is only one class, we can get the same result, akin to the logloss I'm using in the example above:
`In [66]: log_loss(["spam", "ham", "ham", "spam"], [[.9], [.1], [.2], [.65]])`
`Out[66]: 0.21616187468057912`
I guess that this also yields the same result: `log_loss(["spam", "ham", "ham", "spam"], [.9, .1, .2, .65])`?
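It should, as far as I can tell from sklearn's docs; a quick sketch (my own toy numbers, not from your data) showing that the 1-D and two-column call forms agree:

```python
import numpy as np
from sklearn.metrics import log_loss

y_true = ["spam", "ham", "ham", "spam"]
p_spam = np.array([0.9, 0.1, 0.2, 0.65])  # probability of the "spam" class

# two-column form: columns follow sorted class order, here ["ham", "spam"]
two_col = np.column_stack([1 - p_spam, p_spam])

print(log_loss(y_true, two_col))  # ~0.21616
print(log_loss(y_true, p_spam))   # same value: a 1-D array is read as P(positive class)
```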
I guess it's either how libFM is doing logloss, or how sklearn is? Have you tried computing logloss manually (actually implementing the function yourself, or doing it with pen & paper) to compare results? I.e. which of the two actually gives the correct result?
Yes, it's fairly trivial to implement:

```python
import scipy as sp

def logloss(act, pred):
    epsilon = 1e-15
    pred = sp.maximum(epsilon, pred)
    pred = sp.minimum(1 - epsilon, pred)
    ll = sum(act * sp.log(pred) + sp.subtract(1, act) * sp.log(sp.subtract(1, pred)))
    ll = ll * -1.0 / len(act)
    return ll
```
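A quick sanity check (my own toy numbers, mirroring the spam/ham probabilities above) that this hand-rolled version matches sklearn's natural-log log_loss:

```python
import numpy as np
from sklearn.metrics import log_loss

act = np.array([1, 0, 0, 1])
pred = np.array([0.9, 0.1, 0.2, 0.65])

# using the logloss() defined above
print(logloss(act, pred))              # ~0.21616, natural log
print(log_loss(act, pred, eps=1e-15))  # same value: sklearn also uses the natural log
```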
OK, let's close this issue, as it's clearly something with libFM that I'm not doing correctly to get the same result.
Thanks very much for your time looking at this.
Feel free to chat if you want someone to discuss that issue with :)
I'm experiencing the same problem: log_loss reported by libFM is incorrect.
But it's a libFM problem and not a pywFM one, correct? Are you getting the same output from libFM (without the python wrapper)?
Yes, I wrote about it on the libFM GitHub. They're using log10, not the natural log, to compute the loss.
Ok great! Could you link that issue here for future reference?
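If the only difference really is the base of the logarithm, the two numbers are just a constant factor apart (a sketch, assuming nothing else differs between the two computations):

```python
import numpy as np

natural_ll = 0.21616                  # a loss computed with the natural log, as sklearn does
base10_ll = natural_ll / np.log(10)   # the same loss measured in base 10 (~0.0939)

print(base10_ll * np.log(10))         # back to ~0.21616
```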
Hi,
I've been testing the pywFM package and my question involves understanding how model.prediction links to the information that is produced in the output.
My specific example: if I run libFM with a train and test dataset, I can see in the output that test(ll) drops to 0.515385. If I take the predictions and run the test predictions against the test labels, I get a logloss value of 8.134375875846, where I should get 0.515385.
For clarity, please see the thread I started on Kaggle, which also enables you to download the data and reproduce the error.
Full example code: https://www.kaggle.com/c/bnp-paribas-cardif-claims-management/forums/t/19319/help-with-libfm/110652#post110652