cytomining / pycytominer

Python package for processing image-based profiling data
https://pycytominer.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Infer_cp_features fails to find some features (in select cases) #122

Open gwaybio opened 3 years ago

gwaybio commented 3 years ago

We should be able to use infer_cp_features() to match patterns more specific than a compartment. For example, we might want to use infer_cp_features() to subset by CellProfiler feature group, e.g. Cells_RadialDistribution. However, this currently fails because we check for and correct capitalization of the compartments argument. We should not be so restrictive and so protective of our users! We should accept the compartments input exactly as the user formatted it.

gwaybio commented 3 years ago

Error for infer_cp_features(df, compartments="Cytoplasm_RadialDistribution"):

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-36-1b1013d883bb> in <module>
      4 for cell_line in df.Metadata_cell_line.unique():
      5     for compartment in feature_group_compartments:
----> 6         compartment_features = infer_cp_features(df, compartments=compartment)
      7         subset_df = df.loc[:, meta_features + compartment_features]
      8 

~/miniconda3/envs/grit-benchmark/lib/python3.9/site-packages/pycytominer/cyto_utils/features.py in infer_cp_features(population_df, compartments, metadata)
     86         ].tolist()
     87 
---> 88     assert (
     89         len(features) > 0
     90     ), "No CP features found. Are you sure this dataframe is from CellProfiler?"

AssertionError: No CP features found. Are you sure this dataframe is from CellProfiler?

infer_cp_features(df, compartments="Cytoplasm_Radial") does not throw an error. The error comes from the capitalized "D" in "Distribution".
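For anyone wanting to see the failure mode in isolation, here is a minimal sketch that does not call pycytominer at all. It assumes the capitalization correction amounts to lower-casing the input and then title-casing it; that is a guess consistent with the behavior reported above, not a quote of features.py. The helper names (normalize_compartment, select_normalized, select_as_is) and the column names are hypothetical and chosen only for illustration.

```python
# Sketch of the suspected failure mode, using plain strings only.
# ASSUMPTION: the capitalization "correction" is equivalent to
# str.lower() followed by str.title(). This is not copied from pycytominer.

def normalize_compartment(prefix):
    # Hypothetical stand-in for the capitalization correction.
    return prefix.lower().title()

def select_normalized(columns, prefix):
    # Match column names against the normalized prefix.
    normalized = normalize_compartment(prefix)
    return [col for col in columns if col.startswith(normalized)]

def select_as_is(columns, prefix):
    # Match column names against the prefix exactly as the user typed it.
    return [col for col in columns if col.startswith(prefix)]

# Illustrative column names in CellProfiler style.
columns = [
    "Metadata_cell_line",
    "Cytoplasm_RadialDistribution_FracAtD_DNA_1of4",
    "Cytoplasm_AreaShape_Area",
    "Cells_RadialDistribution_MeanFrac_RNA_2of4",
]

# The interior capital "D" is lost, so no column starts with the result.
print(normalize_compartment("Cytoplasm_RadialDistribution"))
print(select_normalized(columns, "Cytoplasm_RadialDistribution"))

# The shorter prefix survives normalization unchanged and still matches.
print(select_normalized(columns, "Cytoplasm_Radial"))

# Using the input as is finds the feature group directly.
print(select_as_is(columns, "Cytoplasm_RadialDistribution"))
```

Under that assumption, matching against the user's input as is would be enough to support feature-group prefixes, at the cost of no longer forgiving inputs like "cytoplasm".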