broadinstitute / pooled-cell-painting-profiling-recipe

:woman_cook: Recipe repository for image-based profiling of Pooled Cell Painting experiments
BSD 3-Clause "New" or "Revised" License

8_image-and-segmentation-qc #23

Closed ErinWeisbart closed 4 years ago

ErinWeisbart commented 4 years ago

This is a first draft. I need help on a couple things:

gwaybio commented 4 years ago

This looks great - thanks for opening the draft PR! One quick question: are we definitely settling on seaborn for these visualizations? I know we used R in the original repo, and we should probably avoid it here to reduce complexity...

A very good alternative that can visualize this rather easily is plotnine. Below are some examples:

import plotnine as gg

by_well_gg = (
    gg.ggplot(cell_count_totalcells_df, gg.aes(x="x_loc", y="y_loc")) +
    gg.geom_point(gg.aes(fill="total_cell_count"), size=10) +
    gg.geom_text(gg.aes(label="Site"), color="lightgrey") +
    gg.facet_wrap("~Well") +
    gg.coord_fixed() +
    gg.theme_bw() +
    gg.theme(axis_text=gg.element_blank(), 
             axis_title=gg.element_blank(),
             strip_background=gg.element_rect(colour="black", fill="#fdfff4")) +
    gg.scale_fill_cmap(name="magma")
)

by_well_gg.save("insert_something_here.png", dpi=300, verbose=False)
by_well_gg

[image: insert_something_here]

by_quality_gg = (
    gg.ggplot(cell_count_df, gg.aes(x="x_loc", y="y_loc")) +
    gg.geom_point(gg.aes(fill="cell_count"), size=10) +
    gg.geom_text(gg.aes(label="Site"), color="lightgrey") +
    gg.facet_grid("Cell_Quality~Well") +
    gg.coord_fixed() +
    gg.theme_bw() +
    gg.theme(axis_text=gg.element_blank(), 
             axis_title=gg.element_blank(),
             strip_background=gg.element_rect(colour="black", fill="#fdfff4")) +
    gg.scale_fill_cmap(name="magma")
)

by_quality_gg.save("insert_something_else.png", dpi=300, verbose=False)
by_quality_gg

[image: insert_something_else]

If we want to go with seaborn, I will need to do a bit more digging.
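For comparison, a rough seaborn version of the first per-well plot might look like the sketch below. The toy DataFrame (`toy_df`) and its values are made up for illustration; only the column names are taken from the plotnine examples above. This is one possible approach using `relplot`, not something the PR has settled on.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns

# Hypothetical toy data: two wells, four sites each, laid out on a 2x2 grid.
toy_df = pd.DataFrame({
    "Well": ["A1"] * 4 + ["A2"] * 4,
    "Site": [1, 2, 3, 4] * 2,
    "x_loc": [0, 1, 0, 1] * 2,
    "y_loc": [0, 0, 1, 1] * 2,
    "total_cell_count": [120, 95, 143, 110, 80, 101, 97, 132],
})

# One panel per well, points colored by total cell count (magma colormap).
g = sns.relplot(
    data=toy_df, x="x_loc", y="y_loc",
    hue="total_cell_count", palette="magma",
    col="Well", s=400, height=3,
)

# seaborn has no direct geom_text equivalent, so label each site by hand.
for well, ax in zip(g.col_names, g.axes.flat):
    for row in toy_df[toy_df["Well"] == well].itertuples():
        ax.text(row.x_loc, row.y_loc, str(row.Site),
                color="lightgrey", ha="center", va="center")

# Hide axis ticks and titles, as the plotnine theme() call does.
g.set(xticks=[], yticks=[], xlabel="", ylabel="")
```

The manual labeling loop is the kind of extra step that makes seaborn feel heavier than plotnine for faceted, annotated plots like these.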

ErinWeisbart commented 4 years ago

I have no experience in working with Plotnine so I defaulted to Seaborn when changing things from R to Python (since it's Beth's favorite and I have some experience with it). We can certainly use Plotnine if it's your preference. Thanks for those examples!

gwaybio commented 4 years ago

sounds good - I like seaborn too, but I find that it gets pretty clunky for complicated custom plots.

A couple of other initial notes:

ErinWeisbart commented 4 years ago

There are a few glitches here that I'd like your help with, but I'm pretty dang proud of where I got it to! I'm sure some of my DataFrame manipulation is uuuuugly, so feel free to suggest cleaner ways to do things. But it does work, so don't feel like you need to spend your time on functional but ugly things right now :)

The biggest problem is that I can't seem to change the aspect ratio of a few of the graphs so that they are legible. These include: ratios_per_well, PLLS, CP_saturation, and BC_saturation.

Additionally - the PLLS graph should have a separate scale bar for each row, and the BC_saturation wrap needs to be adjusted (or the row order needs to be adjusted).
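One way to approach the legibility issue, assuming these graphs are built with plotnine as in the examples above, is to set the overall figure size explicitly so that every facet panel gets a fixed amount of room. The helper below (`facet_figure_size`, a hypothetical name) and the panel counts are illustrative only; `figure_size` in `gg.theme` and `ncol` in `gg.facet_wrap` are existing plotnine options.

```python
def facet_figure_size(n_cols, n_rows, panel_width=2.0, panel_height=2.0):
    """Return (width, height) in inches so each facet panel has a fixed size."""
    return (n_cols * panel_width, n_rows * panel_height)


# Hypothetical layout: 6 wells across, 2 rows of panels.
size = facet_figure_size(n_cols=6, n_rows=2)
print(size)  # (12.0, 4.0)

# Sketch of how this could be applied to an existing plotnine plot
# (left as comments because it depends on the PR's own data and plot objects):
#
#   import plotnine as gg
#   some_plot = some_plot + gg.theme(figure_size=size)
#
# For wrapped facets, the number of columns can be pinned the same way:
#
#   gg.facet_wrap("~Well", ncol=6)
```

As far as I know, a fill or color scale is shared across all facets of a single plot, so a separate scale bar per row would probably mean building one plot per row and combining the saved images.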

ErinWeisbart commented 4 years ago

FYI, the two reverts are because I accidentally overwrote 7. instead of saving my latest changes as 8.

ErinWeisbart commented 4 years ago

@gwaygenomics Thanks for the code review! I've made changes to address what I can and I need your help for the rest. The way I see it, remaining changes fall into two categories:

1) Changes necessary to make this functional:

2) Changes that will make this better but can be implemented later:

As your push right now is to get the recipe to a 0.1 version, I'd appreciate it if you could help me with category 1 and then we can add the QC yaml file to the backlog for when you/we have time later? (I'd love to follow along as you make the QC yaml file. And if you have time to do it now that's great by me, I'm just assuming you're pretty busy with other things right now.)

gwaybio commented 4 years ago

> I'd appreciate it if you could help me with category 1 and then we can add the QC yaml file to the backlog for when you/we have time later

Sounds like a good plan 👍 I will do this tomorrow, which will help me build momentum to push towards 0.1. Can you add the QC yaml as a new issue (if it's not one already)?

My strategy to accomplish this will be similar to what I did in https://github.com/broadinstitute/pooled-cell-painting-profiling-recipe/pull/23#issuecomment-656336233. Does that sound ok? My thought is that it might be faster than hopping on a call.

> I'd love to follow along as you make the QC yaml file

For sure! No way I can do this step without you :)

ErinWeisbart commented 4 years ago

Happy to add the QC yaml as an issue. Category 1 changes should be pretty minor (the graphs are all built, they're just not as legible as I'd like them to be), so yes, leaving suggested changes in comments is great! It will (or should) be easy for me to understand what you've done without you needing to walk me through step by step.

ErinWeisbart commented 4 years ago

@gwaygenomics Dude, thanks for your help! Latest commit adds in all your suuuuper helpful changes. Thanks! I think this wraps up everything necessary, though I left a comment for you about changing the ratio_gg row labels, which would be prettier if you can help with them.