kunaldahiya / pyxclib

Tools for multi-label classification problems.
MIT License

Bug in _get_topk #9

Closed wenting-zhao closed 3 years ago

wenting-zhao commented 3 years ago

Hi, thanks for making this useful tool! Looks great.

Here is a minimal working example that reproduces the bug:

    import numpy as np
    import xclib.evaluation.xc_metrics as xc_metrics

    scores = np.array([[0.1, 0.2, 0.99, 0.2]])
    labels = np.array([[0, 0, 1, 1]])

    inv_propen = xc_metrics.compute_inv_propesity(labels, 0.55, 1.5)
    acc = xc_metrics.Metrics(true_labels=labels, inv_psp=inv_propen)
    args = acc.eval(scores, 1)
    print(xc_metrics.format(*args))

What I get is

  File "tmptest.py", line 9, in <module>
    args = acc.eval(scores, 1)
  File "/home/fs01/wz346/.local/lib/python3.7/site-packages/xclib-0.96-py3.7-linux-x86_64.egg/xclib/evaluation/xc_metrics.py", line 446, in eval
    self.inv_psp, k=K)
  File "/home/fs01/wz346/.local/lib/python3.7/site-packages/xclib-0.96-py3.7-linux-x86_64.egg/xclib/evaluation/xc_metrics.py", line 162, in _setup_metric
    num_labels, k)
  File "/home/fs01/wz346/.local/lib/python3.7/site-packages/xclib-0.96-py3.7-linux-x86_64.egg/xclib/evaluation/xc_metrics.py", line 118, in _get_topk
    return indices
UnboundLocalError: local variable 'indices' referenced before assignment

Looking at it closely, it seems that neither branch of the if condition below matches the X that gets passed in, so indices is never assigned:

    elif type(X) == np.ndarray:
        if np.issubdtype(X.dtype, np.integer):
            warnings.warn("Assuming indices are sorted.")
            indices = X[:, :k]
        elif np.issubdtype(X.dtype, np.float):
            _indices = np.argpartition(X, -k)[:, -k:]
            _scores = np.take_along_axis(
                X, _indices, axis=-1
            )
            indices = np.argsort(_scores, axis=-1)
            indices = np.take_along_axis(_indices, indices, axis=1)
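
For reference, the float branch above can be run in isolation. The following is a minimal sketch, assuming a dense 2-D score array; `topk_indices` is a hypothetical helper and not part of pyxclib. It differs from the snippet in two ways: it checks against `np.floating`, and it raises on unsupported input instead of falling through to an unassigned `indices`.

```python
import numpy as np


def topk_indices(X, k):
    """Hypothetical helper (not pyxclib API): column indices of the k
    largest scores in each row, ordered by ascending score."""
    if not isinstance(X, np.ndarray):
        raise TypeError(f"unsupported type: {type(X)!r}")
    if not np.issubdtype(X.dtype, np.floating):
        raise TypeError(f"unsupported dtype: {X.dtype}")
    # Partition so the k largest entries of each row sit in the last k slots.
    part = np.argpartition(X, -k, axis=-1)[:, -k:]
    # Sort only those k candidates by their scores (ascending).
    part_scores = np.take_along_axis(X, part, axis=-1)
    order = np.argsort(part_scores, axis=-1)
    return np.take_along_axis(part, order, axis=-1)


scores = np.array([[0.1, 0.2, 0.99, 0.2]], dtype=np.float32)
top1 = topk_indices(scores, 1)
top2 = topk_indices(scores, 2)
```

Raising explicitly means an unexpected input type shows up as a readable `TypeError` rather than an `UnboundLocalError` at the return statement.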

where X is created here in _setup_metric():

    if inv_psp is not None:
        ps_indices = _get_topk(
            true_labels.dot(
                sp.spdiags(inv_psp, diags=0,
                           m=num_labels, n=num_labels)),
            num_labels, k)
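
To see what this product is meant to compute, here is a small sketch that scales each label column by its inverse propensity. It assumes the labels are first converted to a SciPy sparse matrix so that `.dot` is a sparse matrix product; whether this conversion is the right fix for the error above is an assumption, not something confirmed in this thread. The `inv_psp` values are made up for illustration.

```python
import numpy as np
import scipy.sparse as sp

labels = np.array([[0, 0, 1, 1]])
inv_psp = np.array([1.5, 2.0, 1.2, 3.0])  # made-up values, for illustration only
num_labels = labels.shape[1]

# Diagonal matrix holding one inverse-propensity weight per label.
diag = sp.spdiags(inv_psp, diags=0, m=num_labels, n=num_labels)

# Converting the dense labels to CSR first makes .dot a sparse product,
# which multiplies column j of the labels by inv_psp[j].
weighted = sp.csr_matrix(labels).dot(diag)
dense_weighted = weighted.toarray()
```

Each relevant label now carries its weight, so taking the top k entries per row selects the k relevant labels with the largest inverse propensity.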

Any ideas how I may fix it? Thanks for any help you may be able to provide.

anshumitts commented 3 years ago

Hi @wenting-zhao. Thank you for reaching out. We have updated pyxclib. This should now work as expected.

Anshul Mittal

qixiang109 commented 3 years ago

The bug still exists, since np.issubdtype(X.dtype, np.float) returns True for float64 but False for float32.
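
The dtype behaviour described here can be checked without pyxclib. The sketch below uses the builtin `float`, which `np.float` was an alias for (the alias was removed in NumPy 1.24), and compares it with the abstract `np.floating` type. `is_float_dtype` is a hypothetical helper name; the results assume a recent NumPy release.

```python
import numpy as np


def is_float_dtype(dtype):
    """Hypothetical helper: True for any floating-point dtype
    (float16, float32, float64, ...)."""
    return np.issubdtype(dtype, np.floating)


results = {}
for dt in (np.float32, np.float64):
    # The builtin float is interpreted as float64, so float32 does not match it.
    narrow = bool(np.issubdtype(dt, float))
    broad = bool(is_float_dtype(dt))
    results[np.dtype(dt).name] = (narrow, broad)
    print(f"{np.dtype(dt).name}: float -> {narrow}, np.floating -> {broad}")
```

Checking against `np.floating` accepts every floating-point precision, so float32 score arrays would take the same branch as float64 ones.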

anshumitts commented 3 years ago

@qixiang109, could you please share a code snippet that reproduces this issue? Also, make sure you have the latest version of pyxclib.

anshumitts commented 3 years ago

Since no new updates were given and the module works at our end