prep_outcomes_freq behaves differently depending on the type of samples (a str path vs. a dataframe). This could be refactored, either by splitting it into two functions or by adding a helper function, so that the isinstance check is no longer needed (confusing IMO).
# Samples can be a list of csv files or a dataframe
if isinstance(samples, str):
    data = pd.concat(
        [
            read_csv_file(sample, usecols=["smiles", "size"])
            for sample in [samples, known_smiles, invalid_smiles]
        ]
    )
else:
    data = samples
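One possible shape for the helper-function variant, as a rough sketch only: load_samples is a hypothetical name, pd.read_csv stands in for the project's read_csv_file (whose exact behavior I am assuming), and the body of prep_outcomes_freq is a placeholder for the existing logic.

```python
import pandas as pd


def load_samples(paths):
    """Hypothetical helper: stack the 'smiles' and 'size' columns of several CSV files.

    pd.read_csv is used here as a stand-in for the project's read_csv_file.
    """
    frames = [pd.read_csv(path, usecols=["smiles", "size"]) for path in paths]
    return pd.concat(frames, ignore_index=True)


def prep_outcomes_freq(data: pd.DataFrame) -> pd.DataFrame:
    """Accepts a dataframe only, so no isinstance branch is needed.

    Placeholder body: the existing frequency logic would go here unchanged.
    """
    return data


# Call site for CSV inputs (names taken from the snippet above):
#   data = load_samples([samples, known_smiles, invalid_smiles])
#   prep_outcomes_freq(data)
# Call site for an in-memory dataframe:
#   prep_outcomes_freq(samples)
```

With this split the caller decides how the data is loaded, and prep_outcomes_freq has a single input type.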