Closed hrshdhgd closed 1 month ago
Ok, so here was the problem:
When the dataframe whose redundant rows had to be filtered out had all `NaN` values for confidence, the line
https://github.com/mapping-commons/sssom-py/blob/550206721911f711ee678eb1a8da50591649bd04/src/sssom/util.py#L441
returned `df` as an empty dataframe and `nan_df` as the entire source dataframe.
Because of this, the following line:
https://github.com/mapping-commons/sssom-py/blob/550206721911f711ee678eb1a8da50591649bd04/src/sssom/util.py#L447
resulted in `dfmax = {}`, which is of type `pandas.Series`. Hence the confusion.
The correct way to handle this is simply to add an `if` statement:
https://github.com/mapping-commons/sssom-py/blob/ffa2109616020f994196cbb827d71bca17192014/src/sssom/util.py#L447-L469
I've added an explicit test and it passes. Fixes #546
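
For readers who don't want to follow the links, here is a rough sketch of the idea. It is not the actual code in `util.py`: the function names, the `KEY` columns, and the `transform("max")` filtering are assumptions made for illustration. It only shows the shape of the guard, which skips the per-group maximum when no row has a confidence value.

```python
import numpy as np
import pandas as pd

# Hypothetical grouping key; the real code may use a different column set.
KEY = ["subject_id", "object_id"]


def split_on_confidence(df):
    """Split rows into those with a confidence value and those without."""
    has_conf = df["confidence"].notna()
    return df[has_conf], df[~has_conf]


def filter_redundant(df):
    """Keep the highest-confidence row(s) per key; pass NaN rows through."""
    scored, nan_df = split_on_confidence(df)
    if scored.empty:
        # Guard: every confidence is NaN, so there is nothing to rank.
        # Skip the groupby instead of working with an empty result.
        return nan_df.reset_index(drop=True)
    # Per-group maximum confidence, aligned to the rows of `scored`.
    best = scored.groupby(KEY)["confidence"].transform("max")
    kept = scored[scored["confidence"] == best]
    return pd.concat([kept, nan_df], ignore_index=True)


# The case described above: every confidence value is NaN.
all_nan = pd.DataFrame(
    {
        "subject_id": ["a:1", "a:2"],
        "object_id": ["b:1", "b:2"],
        "confidence": [np.nan, np.nan],
    }
)
result = filter_redundant(all_nan)

# A mixed case: two scored rows for one key, one unscored row for another.
mixed = pd.DataFrame(
    {
        "subject_id": ["a:1", "a:1", "a:2"],
        "object_id": ["b:1", "b:1", "b:2"],
        "confidence": [0.5, 0.9, np.nan],
    }
)
mixed_result = filter_redundant(mixed)
```

In the all-`NaN` case the input rows come back unchanged. In the mixed case the lower-confidence duplicate is dropped and the unscored row is kept.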