Consider refactoring function

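The function prep_outcomes_freq behaves differently depending on the type of samples: if it is a string, it is treated as a CSV path and read together with known_smiles and invalid_smiles; otherwise it is used as-is as a dataframe. This could be refactored (two functions instead of one, or an additional helper function) to avoid the isinstance check, which is confusing IMO.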

    # Samples can be a list of csv files or a dataframe
    if isinstance(samples, str):
        data = pd.concat(
            [
                read_csv_file(sample, usecols=["smiles", "size"])
                for sample in [samples, known_smiles, invalid_smiles]
            ]
        )
    else:
        data = samples
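
One possible shape for the split, as a rough sketch only: the names load_samples, prep_outcomes_freq_from_files and prep_outcomes_freq_from_frame are hypothetical, pd.read_csv stands in for the project's read_csv_file helper, and the body of the dataframe function is a placeholder for whatever prep_outcomes_freq does after loading.

```python
import pandas as pd

COLUMNS = ["smiles", "size"]


def load_samples(sample_paths, columns=COLUMNS):
    """Hypothetical helper: read each CSV path and stack the rows."""
    # pd.read_csv is a stand-in for the project's read_csv_file helper.
    frames = [pd.read_csv(path, usecols=columns) for path in sample_paths]
    return pd.concat(frames)


def prep_outcomes_freq_from_frame(data):
    """Hypothetical dataframe entry point.

    The existing processing in prep_outcomes_freq would move here;
    this placeholder just returns the input unchanged.
    """
    return data


def prep_outcomes_freq_from_files(samples, known_smiles, invalid_smiles):
    """Hypothetical file entry point: load the three CSVs, then delegate."""
    data = load_samples([samples, known_smiles, invalid_smiles])
    return prep_outcomes_freq_from_frame(data)
```

With this layout each caller picks the entry point that matches what it already has (paths or a dataframe), so no isinstance check is needed inside the function.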

skinniderlab / CLM

Consider refactoring function #174