ventouris closed this issue 4 years ago
Hey there, is the question how to compute the rank metrics directly from the predictions, in a script outside of the fxml.py script?
Exactly. In the example above, where I run FastXML manually (not through fxml.py), how can I compute the ndcg metric?
Got it :)
The ndcg function in fastxml takes an ordered list of relevancy scores and computes the ndcg from it:
For example:
scores = [1,2,0,3,2]
ndcg(scores, 3)
This computes the ndcg@3 for those relevancy scores, where the first score (1) belongs to the highest-ranked item.
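To make the idea concrete, here is a minimal standalone sketch of a textbook NDCG@k over an ordered list of relevancy scores. It is not fastxml's own code: the names dcg_at_k and ndcg_at_k are hypothetical, and the log2 discount and the normalization by the ideal ordering of the same scores are assumptions, so the exact value may differ from what fastxml's ndcg returns.

```python
import math

def dcg_at_k(scores, k):
    # Discounted cumulative gain over the first k scores.
    # Index 0 is the highest-ranked item, so its discount is log2(2) = 1.
    return sum(s / math.log2(i + 2) for i, s in enumerate(scores[:k]))

def ndcg_at_k(scores, k):
    # Normalize by the DCG of the ideal (descending) ordering of the same scores.
    # Assumption: this is the textbook normalization, not necessarily fastxml's.
    ideal = dcg_at_k(sorted(scores, reverse=True), k)
    return dcg_at_k(scores, k) / ideal if ideal > 0 else 0.0

scores = [1, 2, 0, 3, 2]
value = ndcg_at_k(scores, 3)
print(value)
```

A list that is already sorted in descending order gives 1.0 under this definition, and a list of all zeros gives 0.0.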
To adapt your y_pred pretty easily, let's assume the following example:
Y = {117, 31, 12}
Y_pred = clf.predict(X, 'dict') # You can use sparse as well, but you need to do an argmax which is a bit more code
# Assuming you only have one example you're predicting
relevancy_scores = [1 if cls_idx in Y else 0 for cls_idx in Y_pred[0].keys()]
ndcg(relevancy_scores, 5)
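If you are predicting more than one example, the same idea can be sketched as below. This assumes each prediction is a mapping from class index to score; the helper name relevancy_scores and the sample data are made up for illustration. The predictions are sorted by score explicitly, so the result does not depend on the key order of the dict.

```python
def relevancy_scores(pred, true_labels):
    # Rank the predicted class indices by descending score,
    # then mark each one 1 if it is a true label and 0 otherwise.
    ranked = sorted(pred, key=pred.get, reverse=True)
    return [1 if cls_idx in true_labels else 0 for cls_idx in ranked]

# Hypothetical data: one set of true labels and one prediction dict per example.
Y_true = [{117, 31, 12}, {5}]
Y_pred = [{31: 0.9, 4: 0.5, 117: 0.3}, {8: 0.7, 5: 0.6}]

all_scores = [relevancy_scores(p, y) for p, y in zip(Y_pred, Y_true)]
print(all_scores)
```

Each inner list can then be passed to ndcg with the k you want, and the results averaged over the examples.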
Thank you. It's working
I am a little bit lost. I saw in bin/fxml.py that you compute ndcg along with other performance metrics. However, I don't understand the variable names used there, which makes it difficult for me to reproduce.
This is what I do, where I have X_train, y_train, X_test and y_test dataframes. The prediction is working; however, I am not sure how to proceed and use your functions to get the ndcg. Any idea?