Closed ErinWeisbart closed 4 years ago
This looks great - thanks for opening the draft PR! One quick question: are we definitely settling on seaborn to perform these visualizations? I know we used R in the original repo, and we should probably avoid it here to reduce complexity...
A very good alternative that can visualize this rather easily is plotnine. Below are some examples:
import plotnine as gg
by_well_gg = (
    gg.ggplot(cell_count_totalcells_df, gg.aes(x="x_loc", y="y_loc")) +
    gg.geom_point(gg.aes(fill="total_cell_count"), size=10) +
    gg.geom_text(gg.aes(label="Site"), color="lightgrey") +
    gg.facet_wrap("~Well") +
    gg.coord_fixed() +
    gg.theme_bw() +
    gg.theme(axis_text=gg.element_blank(),
             axis_title=gg.element_blank(),
             strip_background=gg.element_rect(colour="black", fill="#fdfff4")) +
    gg.scale_fill_cmap(name="magma")
)
by_well_gg.save("insert_something_here.png", dpi=300, verbose=False)
by_well_gg
by_quality_gg = (
    gg.ggplot(cell_count_df, gg.aes(x="x_loc", y="y_loc")) +
    gg.geom_point(gg.aes(fill="cell_count"), size=10) +
    gg.geom_text(gg.aes(label="Site"), color="lightgrey") +
    gg.facet_grid("Cell_Quality~Well") +
    gg.coord_fixed() +
    gg.theme_bw() +
    gg.theme(axis_text=gg.element_blank(),
             axis_title=gg.element_blank(),
             strip_background=gg.element_rect(colour="black", fill="#fdfff4")) +
    gg.scale_fill_cmap(name="magma")
)
by_quality_gg.save("insert_something_else.png", dpi=300, verbose=False)
by_quality_gg
If we want to go with seaborn, I will need to do a bit more digging.
I have no experience working with plotnine, so I defaulted to seaborn when changing things from R to Python (since it's Beth's favorite and I have some experience with it). We can certainly use plotnine if it's your preference. Thanks for those examples!
Sounds good - I like seaborn too, but I find that it gets pretty clunky for complicated custom plots.
A couple of other initial notes:
pd.categorical
There are a few glitches here that I'd like your help with, but I'm pretty dang proud of where I got it to! I'm sure some of my DataFrame manipulation is uuuuugly, so feel free to suggest cleaner ways to do things. But, it does work, so don't feel like you need to spend your time on functional but ugly things right now :)
The biggest problem is that I can't seem to change the aspect ratio of a few of the graphs so that they are legible. These include: ratios_per_well, PLLS, CP_saturation, and BC_saturation.
Additionally - the PLLS graph should have a separate scale bar for each row, and the BC_saturation wrap needs to be adjusted (or the row order needs to be adjusted).
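For the row order, one approach that may help is converting the faceting column to an ordered `pd.Categorical`, since plotnine generally lays out facets and discrete axes in category order rather than alphabetically. A minimal sketch, assuming a hypothetical `demo_df` and made-up quality labels (the real column values in the QC dataframes may differ):

```python
import pandas as pd

# Hypothetical stand-in for one of the QC dataframes
demo_df = pd.DataFrame({
    "Cell_Quality": ["Empty", "Perfect", "Great", "Imperfect", "Bad"],
    "cell_count": [5, 40, 30, 15, 10],
})

# Desired top-to-bottom order of the facet rows (assumed labels)
quality_order = ["Perfect", "Great", "Imperfect", "Bad", "Empty"]

# Replace the plain string column with an ordered categorical
demo_df["Cell_Quality"] = pd.Categorical(
    demo_df["Cell_Quality"], categories=quality_order, ordered=True
)

# Sorting now follows the declared order instead of alphabetical order
sorted_labels = demo_df.sort_values("Cell_Quality")["Cell_Quality"].tolist()
```

Passing the converted dataframe to `gg.ggplot(...)` with a facet on `Cell_Quality` should then follow `quality_order`.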
FYI, the two reverts are because I accidentally overwrote 7. instead of saving my latest changes as 8.
@gwaygenomics Thanks for the code review! I've made changes to address what I can and I need your help for the rest. The way I see it, the remaining changes fall into two categories:
1) Changes necessary to make this functional:
2) Changes that will make this better but can be implemented later:
As your push right now is to get the recipe to a 0.1 version, I'd appreciate it if you could help me with category 1 and then we can add the QC yaml file to the backlog for when you/we have time later? (I'd love to follow along as you make the QC yaml file. And if you have time to do it now that's great by me, I'm just assuming you're pretty busy with other things right now.)
I'd appreciate it if you could help me with category 1 and then we can add the QC yaml file to the backlog for when you/we have time later
Sounds like a good plan 👍 I will do this tomorrow, which will help me build momentum to push towards 0.1. Can you add the QC yaml as a new issue (if it's not already)?
My strategy to accomplish this will be similar to what I did in https://github.com/broadinstitute/pooled-cell-painting-profiling-recipe/pull/23#issuecomment-656336233. Does that sound ok? My thought is that it might be faster than hopping on a call.
I'd love to follow along as you make the QC yaml file
For sure! No way I can do this step without you :)
Happy to add the QC yaml as an issue. Category 1 changes should be pretty minor (the graphs are all built, they're just not as legible as I'd like them), so yes, leaving suggested changes in comments is great! It will (or should) be easy for me to understand what you've done without you needing to walk me through step by step.
@gwaygenomics Dude, thanks for your help! Latest commit adds in all your suuuuper helpful changes. Thanks! I think this wraps up everything necessary, though I left a comment for you about changing ratio_gg row labels that would be prettier if you can help with that.
This is a first draft. I need help on a couple things:
# Plot total number of cells per well
to output graphs per well, though I realize you plotted them in the same graph in the original version, which I can fix after getting help with the following:
# Plot cell number by category per well
generating the 12 separate single_quality_df that we want; I just don't know how to distribute those 12 onto the 12 subplots.
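One common way to spread 12 dataframes over 12 subplots is to create the grid with `plt.subplots` and pair each axis with one dataframe using `zip` over `axes.flat`. The sketch below uses matplotlib directly and makes several assumptions: the data is a made-up stand-in, and splitting by a `Well` column with `groupby` is only a guess at how the 12 `single_quality_df` frames are produced.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in data: 12 wells with 3 sites each
wells = [f"W{i:02d}" for i in range(1, 13)]
cell_count_df = pd.DataFrame({
    "Well": [w for w in wells for _ in range(3)],
    "Site": [1, 2, 3] * 12,
    "cell_count": list(range(36)),
})

# 3 x 4 grid gives 12 panels; sharey keeps the counts comparable across panels
fig, axes = plt.subplots(nrows=3, ncols=4, figsize=(12, 8), sharey=True)

# axes is a 3x4 array; .flat walks it row by row, so each group gets one panel
for ax, (well, single_quality_df) in zip(axes.flat, cell_count_df.groupby("Well")):
    ax.bar(single_quality_df["Site"].astype(str), single_quality_df["cell_count"])
    ax.set_title(well)

fig.tight_layout()
panel_titles = [ax.get_title() for ax in axes.flat]
```

If the plots move to plotnine, the same layout usually falls out of a single dataframe plus `facet_wrap` or `facet_grid`, with no need to build the 12 frames by hand.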