broadinstitute / pooled-cell-painting-profiling-recipe

:woman_cook: Recipe repository for image-based profiling of Pooled Cell Painting experiments
BSD 3-Clause "New" or "Revised" License

Different overwrite warning behavior between steps #42

Closed ErinWeisbart closed 2 years ago

ErinWeisbart commented 4 years ago

When files already exist, Step 0./1.process-spots emits an overwrite warning for every file:

Now processing spots for 151B2-B1-2...
/Users/eweisbar/miniconda3/envs/pooled-cp/lib/python3.7/site-packages/ipykernel_launcher.py:51: UserWarning: Output files likely exist, now overwriting...
Now processing spots for 151B2-B1-5...
/Users/eweisbar/miniconda3/envs/pooled-cp/lib/python3.7/site-packages/ipykernel_launcher.py:51: UserWarning: Output files likely exist, now overwriting...
Now processing spots for 151B2-B1-4...
/Users/eweisbar/miniconda3/envs/pooled-cp/lib/python3.7/site-packages/ipykernel_launcher.py:51: UserWarning: Output files likely exist, now overwriting...

Step 0./2.process-cells emits only one warning, at the beginning:

Now processing cells for 151B2-B1-2...
/Users/eweisbar/miniconda3/envs/pooled-cp/lib/python3.7/site-packages/ipykernel_launcher.py:122: UserWarning: Output files likely exist, now overwriting...
Now processing cells for 151B2-B1-5...
Now processing cells for 151B2-B1-4...
Now processing cells for 151B2-B1-3...

I think both steps should behave the same way to prevent confusion. I prefer a warning for every file.
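A minimal sketch of one way to get a warning for every file, independent of the active warning filter. Python's default filter shows an identical warning only once per code location; whether that explains the difference between the two steps has not been established here. The helper names `warn_if_output_exists` and `check_outputs` are hypothetical and are not taken from the recipe code.

```python
import warnings
from pathlib import Path


def warn_if_output_exists(output_file):
    """Warn if output_file already exists (hypothetical helper, not the recipe's code)."""
    output_file = Path(output_file)
    if output_file.exists():
        warnings.warn(f"Output file {output_file} likely exists, now overwriting...")
        return True
    return False


def check_outputs(output_files):
    """Check each output file, forcing one warning per existing file."""
    with warnings.catch_warnings():
        # "always" turns off the once-per-location suppression for UserWarning
        warnings.simplefilter("always", UserWarning)
        return [warn_if_output_exists(f) for f in output_files]
```

The filter change is scoped to the `with` block, so the rest of the notebook keeps whatever warning settings it already had.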

ErinWeisbart commented 4 years ago

The plot thickens: in cell_counts_151B2-B1-2.tsv (made in 0./2.process-cells), site_full is not parsed into plate, well, and site, whereas it is for the other sites. I discovered this because all_cellpainting_cellquality_across_sites_by_well.png (made in 0./3.visualize-cell-summary) shows a third well (labeled nan) that contains only 151B2-B1-2. Inspecting cell_count_df in 0./3.visualize-cell-summary confirms that only that one site failed to parse: cell_count_df[cell_count_df.isnull().any(axis=1)] returns exactly the rows where site_full = 151B2-B1-2.
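A rough sketch of the kind of check that would catch this before plotting. It assumes site_full has the form plate-well-site separated by hyphens (inferred from 151B2-B1-2, not confirmed against the recipe code), and the function names `parse_site_full` and `unparsed_rows` are made up for illustration.

```python
def parse_site_full(site_full):
    """Split an assumed 'plate-well-site' string; fields are None if it does not fit."""
    parts = str(site_full).split("-")
    if len(parts) != 3 or not all(parts):
        return {"site_full": site_full, "plate": None, "well": None, "site": None}
    plate, well, site = parts
    return {"site_full": site_full, "plate": plate, "well": well, "site": site}


def unparsed_rows(rows):
    """Return the rows in which any field is missing."""
    return [row for row in rows if any(value is None for value in row.values())]
```

Running `unparsed_rows` over the parsed records plays the same role as the `isnull().any(axis=1)` filter above: an empty result means every site parsed.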

ErinWeisbart commented 4 years ago

If I purge all the data from the batch and rerun everything starting from 0./0., the parsing happens correctly.

ErinWeisbart commented 2 years ago

I'm not sure what was going on when I originally made this issue, but it looks like, ever since the warning was added to the code, it has been emitted on a per-site basis for 0./2.process-cells as well.

I don't think it's worth it to try to replicate whatever was going on with my "plot thickens" as this hasn't been a problem since. So I'm closing the issue.