scverse / scanpy-tutorials

Scanpy Tutorials.
https://scanpy-tutorials.readthedocs.io/
189 stars 117 forks source link

An example notebook for plotting with marsilea #97

Closed Mr-Milk closed 7 months ago

Mr-Milk commented 8 months ago

This PR is for scverse/scanpy#2444

Thanks for your interest in Marsilea. I'm not sure if my explanation in the notebook is simple enough. Feel free to modify and comment.

review-notebook-app[bot] commented 8 months ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

ivirshup commented 7 months ago

cc: @grst @flying-sheep

ivirshup commented 7 months ago

Thanks, overall looks very good. I like the slow building up of the plot. Made a few comments on possible simplifications/ deprecations to consider.

I've also opened up a sibling PR into scanpy so we can preview this:

For the example that's built up, what do you think about doing that with an aggregated plot? With the full cell heat map, it takes up so much space that I can't see both the code and the plot at the same time. And, can't even see the whole plot sometimes:

Untitled

But the aggregated plots are much more dense:

Untitled 2
Mr-Milk commented 7 months ago

I can play around with the size to see if I can make the heatmap denser. Otherwise, I will change it to the aggregated example.

ivirshup commented 7 months ago

This may also be more of an API thing, but in the dotplot the sizes seem to provided as counts, but the legend says they are fractions. Should: size = counts / cell_counts? Or is some form of normalization implicit? It's unclear to me how the plot would know what to normalize by.

Mr-Milk commented 7 months ago

This may also be more of an API thing, but in the dotplot the sizes seem to provided as counts, but the legend says they are fractions. Should: size = counts / cell_counts? Or is some form of normalization implicit? It's unclear to me how the plot would know what to normalize by.

Yes, it should be divided by cell counts. I was just copying the legend title of Scanpy's dot plot. There is no normalization in the plot, it simply just maps the data values to the size range of the circle marker. I will also change this part in the notebook.

ivirshup commented 7 months ago

I would also suggest:

agg = sc.get.aggregate(pbmc[:, markers], by="louvain", func=["mean", "count_nonzero"])
...
m = ma.Heatmap(
    agg.to_df(layer="mean"),  # Or agg.layers["mean"]
    height=h / 3,
    width=w / 3,
    cmap="Blues",
    linewidth=0.5,
    linecolor=".8",
    label="Expression",
)
...
m = ma.SizedHeatmap(
    size=agg.to_df(layer="count_nonzero") / adata.obs["louvain"].value_counts(),
    color=agg.to_df(layer="mean"),
    cluster_data=agg.to_df(layer="count_nonzero"),
    height=h / 3,
    width=w / 3,
    edgecolor=".8",
    cmap="Blues",
    size_legend_kws=dict(
        colors="#538bbf",
        title="Fraction of cells\nin groups (%)",
        labels=["20%", "40%", "60%", "80%", "100%"],
        show_at=[0.2, 0.4, 0.6, 0.8, 1.0],
    ),
    color_legend_kws=dict(title="Mean expression\nin group"),
)

Which would remove some of the code needed above.

ivirshup commented 7 months ago

@Mr-Milk please let me know when I should take another look here! I've bumped the build over on scanpy

Mr-Milk commented 7 months ago

@ivirshup, I'm happy with the current notebook. Could you take a look?

For tracks plot and stacked violin, I can implement and include them in Marsilea in the next minor release.

ivirshup commented 7 months ago

Looks pretty good! I may make some minor changes, but otherwise good. What should your name look like in the release notes? "Y Zheng"?

Mr-Milk commented 7 months ago

Sure, that's good for me!

ivirshup commented 7 months ago

Merged via https://github.com/scverse/scanpy-tutorials/pull/98

Had to make a separate PR since I can't push to Mr-Milk:main