broadinstitute / pooled-cell-painting-profiling-recipe

:woman_cook: Recipe repository for image-based profiling of Pooled Cell Painting experiments
BSD 3-Clause "New" or "Revised" License

Error in 4.image-and-segmentation-qc.py #73

Open gwaybio opened 3 years ago

gwaybio commented 3 years ago

@ErinWeisbart - I am trying to rerun this step on the recent pooled dataset. It was working smoothly until line 471. I've pasted the error output at the end of this issue (file paths intentionally obscured).

If you look at the "blame", line 471 is my doing. However, in #72 you modified how cp_sat_df is constructed, which likely changed how it should be processed downstream. ("Blame" is a bad technical term, but it is at least descriptive!)

Do you know what's going on? Maybe this is an easy fix 🤷

XXX/recipe/scripts/io_utils.py:9: UserWarning: data/0.site-qc/XXX/figures/plate_layout_cells_count_per_well.png exists, overwriting
XXX/recipe/scripts/io_utils.py:9: UserWarning: data/0.site-qc/XXX/figures/plate_layout_ratios_per_well.png exists, overwriting
XXX/recipe/scripts/io_utils.py:9: UserWarning: data/0.site-qc/XXX/figures/plate_layout_Cells_FinalThreshold_per_well.png exists, overwriting
XXX/recipe/scripts/io_utils.py:9: UserWarning: data/0.site-qc/XXX/figures/plate_layout_Nuclei_FinalThreshold_per_well.png exists, overwriting
XXX/recipe/scripts/io_utils.py:9: UserWarning: data/0.site-qc/XXX/figures/plate_layout_PercentConfluent_per_well.png exists, overwriting
XXX/recipe/scripts/io_utils.py:9: UserWarning: data/0.site-qc/XXX/results/sites_with_confluent_regions.csv exists, overwriting
/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/plotnine/layer.py:401: PlotnineWarning: geom_text : Removed 720 rows containing missing values.
Traceback (most recent call last):
  File "/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3080, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 4554, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 4562, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'level_3'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "recipe/0.preprocess-sites/4.image-and-segmentation-qc.py", line 471, in <module>
    cp_sat_df[["cat", "type", "Ch"]] = cp_sat_df["level_3"].str.split(
  File "/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/core/frame.py", line 3024, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/ubuntu/miniconda3/envs/pooled-cell-painting/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3082, in get_loc
    raise KeyError(key) from err
KeyError: 'level_3'
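
A note on where a name like level_3 can come from, offered as an assumption since this thread does not show how cp_sat_df is built: when a DataFrame is stacked and then passed through reset_index(), pandas labels any unnamed index level as level_<position>. With three metadata columns in the index, the stacked labels land in level_3; with a different number of index columns the name shifts, and looking up "level_3" raises this KeyError. The sketch below uses toy data and a hypothetical helper, stacked_label_column, to show the effect.

```python
import pandas as pd


def stacked_label_column(df, id_cols):
    """Return the name pandas assigns to the stacked-label column.

    Hypothetical helper for illustration only; it is not part of the recipe.
    """
    long_df = df.set_index(id_cols).stack().reset_index()
    # After reset_index(), the former column labels sit right after the id columns.
    return long_df.columns[len(id_cols)]


# Toy data; the column names mirror those quoted in this issue.
toy = pd.DataFrame(
    {
        "Metadata_Well": ["A1", "A2"],
        "Metadata_Site": [1, 2],
        "Metadata_Plate": ["P1", "P1"],
        "ImageQuality_PercentMaximal_CorrDNA": [0.1, 0.3],
        "ImageQuality_StdIntensity_CorrDNA": [0.2, 0.4],
    }
)

# Three id columns in the index: the unnamed stacked level is at position 3.
three = stacked_label_column(
    toy, ["Metadata_Well", "Metadata_Site", "Metadata_Plate"]
)

# Two id columns in the index: the same level is now at position 2.
two = stacked_label_column(
    toy.drop(columns="Metadata_Plate"), ["Metadata_Well", "Metadata_Site"]
)

print(three, two)
```

If the change in #72 left the index with a different number of levels, or the frame is no longer stacked at that point, that would be consistent with the error above, though it has not been confirmed here.
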

ErinWeisbart commented 3 years ago

I don't get any error, but maybe this info helps with debugging:

For my testing I haven't been importing the config but instead just setting the few variables actually needed for this section. What I set is:

cell_count_file = '/Users/eweisbar/Desktop/XX/cell_count.tsv'
input_image_file = '/Users/eweisbar/Desktop/XX/image_metadata.tsv'

sites_per_image_grid_side = 10
cell_filter = ["Perfect", "Great"]
cell_category_order = ["Bad", "Empty", "Great", "Imperfect", "Perfect"]
image_cols = {"well": "Metadata_Well", "site": "Metadata_Site", "plate": "Metadata_Plate"}
barcoding_cycles = 9
barcoding_prefix = "CorrCycle"
painting_image_names = [1, 2, 3, 4, 5]  # only used for length here since I pull actual names from column lists now

output_figuresdir = '/Users/eweisbar/Desktop/XX/'
output_resultsdir = '/Users/eweisbar/Desktop/XX/'

Going into the if statement at line 465, I have:

cp_sat_df_cols = ['Metadata_Well',
 'Metadata_Site',
 'Metadata_Plate',
 'ImageQuality_PercentMaximal_CorrDNA',
 'ImageQuality_PercentMaximal_CorrER',
 'ImageQuality_PercentMaximal_CorrMito',
 'ImageQuality_PercentMaximal_CorrPhalloidin',
 'ImageQuality_PercentMaximal_CorrWGA',
 'ImageQuality_StdIntensity_CorrDNA',
 'ImageQuality_StdIntensity_CorrER',
 'ImageQuality_StdIntensity_CorrMito',
 'ImageQuality_StdIntensity_CorrPhalloidin',
 'ImageQuality_StdIntensity_CorrWGA']
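
Given column names like the ones above, one way to avoid depending on an auto-generated name such as level_3 is to reshape with melt and name the label column explicitly. This is a minimal sketch on toy data, not the recipe's code: the names feature and value, and the choice to split on the first two underscores, are assumptions made for illustration.

```python
import pandas as pd

id_cols = ["Metadata_Well", "Metadata_Site", "Metadata_Plate"]

# Toy wide-format table using a subset of the columns listed above.
wide = pd.DataFrame(
    {
        "Metadata_Well": ["A1", "A2"],
        "Metadata_Site": [1, 2],
        "Metadata_Plate": ["P1", "P1"],
        "ImageQuality_PercentMaximal_CorrDNA": [0.1, 0.3],
        "ImageQuality_PercentMaximal_CorrER": [0.0, 0.2],
        "ImageQuality_StdIntensity_CorrDNA": [0.5, 0.6],
    }
)

# Wide -> long, with an explicit name for the column holding the old headers.
# "feature" and "value" are hypothetical names chosen for this sketch.
long_df = wide.melt(id_vars=id_cols, var_name="feature", value_name="value")

# Split e.g. "ImageQuality_PercentMaximal_CorrDNA" into three parts.
# n=2 limits the split to the first two underscores (assumed separator).
long_df[["cat", "type", "Ch"]] = long_df["feature"].str.split(
    "_", n=2, expand=True
)

print(long_df[["feature", "cat", "type", "Ch", "value"]])
```

Because the label column is named at reshape time, the downstream split does not change if metadata columns are added to or removed from id_cols.
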

cp_sat_df looks like this after line 468:

[Screenshot of cp_sat_df, 2021-06-07; image not available]

gwaybio commented 3 years ago

Weird, I wonder what's going on. I won't have time to dig into this until after our meeting on Friday.

Also, I just realized that I only have one QC plate figure, whereas I probably should have one per plate. Is that right? If so, can you open a new issue to track that? (I don't want to get this issue confused with a separate one!)