A previous bugfix, >> combined_df = combined_df[combined_df['category'] not in ['?', 'nan']] <<, was wrong: it tested whether the entire combined_df['category'] Series is "not in" the list ['?', 'nan'] instead of testing each element. Python evaluates this membership test by comparing the Series with each list item and then taking the truth value of the resulting boolean Series, which raises ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
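The failure can be reproduced with a minimal sketch. The sample rows are hypothetical; only the combined_df name and the 'category' column come from the code under discussion.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
combined_df = pd.DataFrame({"category": ["a", "?", "nan", "b"]})

error_message = None
try:
    # `not in` asks the list whether it contains the whole Series.
    # Each comparison yields a boolean Series, whose truth value is ambiguous.
    combined_df[combined_df["category"] not in ["?", "nan"]]
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```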
Description of changes
To filter the DataFrame by checking that each element of the category column is not in the list ['?', 'nan'], the .isin() method is now used in combination with the negation operator '~'.
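A minimal sketch of the corrected filter follows. The sample rows and the 'value' column are hypothetical; the exact line in the codebase may differ.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
combined_df = pd.DataFrame(
    {"category": ["a", "?", "nan", "b"], "value": [1, 2, 3, 4]}
)

# .isin() tests each element; ~ inverts the resulting boolean mask.
mask = ~combined_df["category"].isin(["?", "nan"])
combined_df = combined_df[mask]

print(combined_df["category"].tolist())  # ['a', 'b']
```

Note that the list entry 'nan' matches only the literal string "nan". Genuinely missing values (float NaN) would not be removed by this filter and would need a separate check such as .notna().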
Issue
Fixes #issue82.
Includes