navigating-stories / orange-story-navigator

Add-on to the Orange3 data mining toolkit with text processing widgets from the project Navigating Stories
https://research-software-directory.org/projects/navigating-stories

solves #issue82 by fixing the comparison between a Series and a list #83

Closed ThijsVroegh closed 1 month ago

ThijsVroegh commented 2 months ago
Issue

Fixes #issue82.

A previous bugfix, `combined_df = combined_df[combined_df['category'] not in ['?', 'nan']]`, was wrong: it tested whether the entire `combined_df['category']` Series is "not in" the list `['?', 'nan']` instead of testing each element. Python evaluates `not in` by comparing the Series with each list item and then taking the truth value of the resulting boolean Series, which raises `ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().`
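The failure can be reproduced outside the add-on with a small sketch. The DataFrame contents and the helper name `filter_with_not_in` are made up for illustration and are not taken from the repository; only the filtering expression mirrors the line quoted above.

```python
import pandas as pd

# Hypothetical stand-in for the add-on's combined_df.
combined_df = pd.DataFrame({"category": ["noun", "?", "nan", "verb"]})


def filter_with_not_in(df):
    # `not in` compares the whole Series against each list item and then
    # needs a single True/False, which pandas refuses to provide.
    return df[df["category"] not in ["?", "nan"]]


try:
    filter_with_not_in(combined_df)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```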

Description of changes

The DataFrame is now filtered element-wise: `.isin()` checks each value in the `category` column against the list `['?', 'nan']`, and the negation operator `~` inverts the resulting boolean mask so that only rows whose category is not in the list are kept.
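A minimal sketch of this kind of filter, assuming a small example DataFrame (the column values and the `text` column are illustrative, not the add-on's real data):

```python
import pandas as pd

# Hypothetical stand-in for the add-on's combined_df.
combined_df = pd.DataFrame(
    {
        "category": ["noun", "?", "nan", "verb"],
        "text": ["a", "b", "c", "d"],
    }
)

# isin() returns one boolean per row; ~ flips it to "not in the list".
unwanted = combined_df["category"].isin(["?", "nan"])
filtered_df = combined_df[~unwanted]

print(filtered_df["category"].tolist())  # ['noun', 'verb']
```

Note that `'nan'` here is the literal string, as in the list above; a genuinely missing value (`NaN`) would not be matched by this check.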

ThijsVroegh commented 1 month ago

@eriktks Thanks; I removed the old, commented-out lines of code.