AmenRa / ranx

⚡️A Blazing-Fast Python Library for Ranking Evaluation, Comparison, and Fusion 🐍
https://amenra.github.io/ranx
MIT License

[BUG] Misleading exception message on dataframe types #52

Closed efung closed 11 months ago

efung commented 11 months ago

**Describe the bug**
I'm using the library for the first time with a Pandas dataframe and ran into a misleading exception.

**To Reproduce**
Steps to reproduce the behavior:

  1. Create a dataframe where the id column is of type int64 e.g. df['id'] = df.index + 1
  2. Create the qrel like this:
    qrels = Qrels.from_df(
        df=df,
        q_id_col="id",
        doc_id_col="best_document",
        score_col="score",
    )
  3. Observe this error:
    
    /usr/local/lib/python3.10/dist-packages/ranx/data_structures/qrels.py in from_df(df, q_id_col, doc_id_col, score_col)
    293         """
    294         assert (
    --> 295             df[q_id_col].dtype == "O"
    296         ), "DataFrame scores column dtype must be `object` (string)"
    297         assert (

AssertionError: DataFrame scores column dtype must be `object` (string)



**Expected behavior**
The assertion message should name the correct column. In this case, it is the query ID column that has the wrong type, not the scores column. From inspecting the code, the message is also wrong when the document ID column has the wrong type.
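
For illustration, here is a minimal sketch of a check whose message names the offending column, together with a possible workaround for the reproduction above (casting the ID column to strings). `check_id_column` is a hypothetical helper written for this example, not part of ranx and not necessarily how the fix was implemented. It uses `pd.api.types.is_string_dtype` rather than the `dtype == "O"` comparison from the traceback, and the sample data is made up.

```python
import pandas as pd


def check_id_column(df: pd.DataFrame, col: str, role: str) -> None:
    """Hypothetical helper (not ranx API): fail with a message naming the column."""
    if not pd.api.types.is_string_dtype(df[col]):
        raise TypeError(
            f"DataFrame {role} column `{col}` must contain strings, "
            f"got dtype {df[col].dtype}"
        )


# Made-up data mirroring the reproduction steps: the ID column is int64.
df = pd.DataFrame({"best_document": ["d1", "d2"], "score": [1, 1]})
df["id"] = df.index + 1

try:
    check_id_column(df, "id", "query ID")
except TypeError as err:
    message = str(err)  # mentions the query ID column, not the scores column

# Possible workaround: cast the ID column to strings before building the Qrels.
df["id"] = df["id"].astype(str)
check_id_column(df, "id", "query ID")  # no exception after the cast
```

After a cast like this, the `Qrels.from_df` call from the reproduction steps should no longer trip the dtype assertion on the query ID column, assuming the other columns already have the types ranx expects.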

AmenRa commented 11 months ago

Hi, sorry for that! I probably copy-pasted or duplicated lines there. I will fix it in the next release.

AmenRa commented 11 months ago

Fixed in v0.3.17.

Please, give ranx a star if you haven't yet.