kunaldahiya / pyxclib

Tools for multi-label classification problems.
MIT License

Precision_k does not work with X np.float32 #16

Closed by nsorros 3 years ago

nsorros commented 3 years ago

Precision at k does not work when the probabilities matrix has dtype np.float32. It throws UnboundLocalError: local variable 'indices' referenced before assignment. It works fine with np.float64. The error originates from https://github.com/kunaldahiya/pyxclib/blob/e8f21309146441c0665ea598d9c8c32c827e2e44/xclib/evaluation/xc_metrics.py#L99

and from the fact that np.float is shorthand for the Python float, which is a double (64-bit); see https://stackoverflow.com/questions/16963956/difference-between-python-float-and-numpy-float32. This is why np.issubdtype(np.float32, np.float) returns False whereas np.issubdtype(np.float64, np.float) returns True.
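The difference can be checked directly. This is a minimal sketch, not code from pyxclib. It assumes the builtin float can stand in for np.float, since np.float was only an alias for it and is no longer available in recent NumPy releases. The dictionary name subdtype_checks is made up for illustration.

```python
import numpy as np

# Assumption: the builtin float stands in for the old np.float alias.
# For each dtype, record (is subdtype of float, is subdtype of np.floating).
subdtype_checks = {
    dtype.__name__: (
        bool(np.issubdtype(dtype, float)),
        bool(np.issubdtype(dtype, np.floating)),
    )
    for dtype in (np.float16, np.float32, np.float64)
}

for name, (vs_float, vs_floating) in subdtype_checks.items():
    print(f"{name}: vs float = {vs_float}, vs np.floating = {vs_floating}")
```

Only float64 passes the check against float, while all three widths pass the check against np.floating.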

This can be solved easily by replacing np.float in the comparison with np.floating, the abstract parent of all NumPy floating-point types; see the hierarchy of types in NumPy: https://numpy.org/doc/stable/reference/arrays.scalars.html#arrays-scalars
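A rough sketch of what such a check could look like is given here. The helper top_k_indices is hypothetical and is not the actual code in xc_metrics.py. It only illustrates testing against np.floating and failing with an explicit error, so that an unsupported dtype cannot leave indices unassigned.

```python
import numpy as np


def top_k_indices(scores, k=5):
    """Hypothetical helper: column indices of the k largest scores per row."""
    # np.floating matches float16, float32 and float64 alike.
    if np.issubdtype(scores.dtype, np.floating):
        indices = np.argsort(-scores, axis=1)[:, :k]
    else:
        # Fail loudly instead of falling through with `indices` undefined.
        raise TypeError(f"expected floating-point scores, got {scores.dtype}")
    return indices


scores32 = np.array([[0.1, 0.9, 0.5]], dtype=np.float32)
print(top_k_indices(scores32, k=2))
```

With this check, float32 input takes the same branch as float64 input.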

To reproduce

from xclib.evaluation.xc_metrics import precision

import numpy as np
import scipy.sparse as sp

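# float32 scores trigger the error; float64 scores do not
Y_pred_proba = np.random.randn(10,5).astype(np.float32)
# sparse binary ground-truth labels
Y_true = sp.csr_matrix(np.random.randn(10,5) > 1).astype(np.int32)

# raises UnboundLocalError: local variable 'indices' referenced before assignment
precision(Y_pred_proba, Y_true)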