Some suggestions for (minor) improvements:

- `try_mapping_columns` tries to create the `SimpleColumnMapper` for a column by inferring the HPO term from the column name. I think the column name will match the HPO label in most cases, so running this function first saves some work: you no longer have to map column by column, only the columns that remain.
- There was a bug in `SimpleColumnMapper`: because it used `set()`, matching behaved in unwanted ways when `observed` is a string longer than one character. For instance, in my column I wanted all entries with `Severe` to be positive, but the match failed because `observed` became `{'S', 'e', 'r', 'v'}`.
- Added a `requirements.txt`.

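For reference, the `set()` behaviour can be reproduced in isolation. This is only a minimal sketch of the pitfall, not the actual `SimpleColumnMapper` code: `normalize_observed` is a hypothetical helper name I made up for illustration, and it assumes `observed` may be either a single string or a collection of strings.

```python
def normalize_observed(observed):
    """Return the observed values as a set of whole strings.

    Hypothetical helper, not part of the library: a bare string is kept
    as one element instead of being passed to set() directly.
    """
    if isinstance(observed, str):
        return {observed}
    return set(observed)


# set() iterates over its argument, so a string is split into characters.
as_characters = set("Severe")

# Treating the string as a single value keeps it intact.
as_whole_value = normalize_observed("Severe")

print("Severe" in as_characters)   # False
print("Severe" in as_whole_value)  # True
```

With the string kept whole, a cell containing `Severe` compares equal to the observed value, which is the behaviour I expected in my column.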
Hope this helps. If not, feel free to ignore these suggestions.