gustaveroussy / sopa

Technology-invariant pipeline for spatial omics analysis that scales to millions of cells (Xenium / Visium HD / MERSCOPE / CosMx / PhenoCycler / MACSima / etc)
https://gustaveroussy.github.io/sopa/
BSD 3-Clause "New" or "Revised" License

[Feature] Multiple segmentation rounds of stains or cells + nuclei #8

Closed pakiessling closed 10 months ago

pakiessling commented 11 months ago

Thank you for the tool. It looks really awesome!

What I currently use VPT for is multiple rounds of segmentation of different protein stains that are then "harmonized". In addition, I separately segment nuclei and cell membranes into two sets of polygons that are matched with each other as child <> parent.

I don't think that's currently supported in Sopa?

quentinblampey commented 11 months ago

Hello @pakiessling, I'm glad you are interested in Sopa!

This is fairly easy to do with the CLI alone, but it may take some time to get familiar with its different commands. I'm thinking about implementing one of these two solutions:

What would be your preferred option? Also, have you tried using Baysor? It scales very well with Sopa, and gives much better results on MERSCOPE data compared to the Vizgen segmentation (see our preprint here: https://www.biorxiv.org/content/10.1101/2023.12.22.571863v1). In general, I recommend Baysor for spatial transcriptomics and Cellpose for multiplex imaging technologies.

pakiessling commented 11 months ago

Hi @quentinblampey, I would probably deploy Sopa inside a customized Snakemake pipeline, so the CLI would be fine.

It is awesome that Sopa can already do segmentation harmonization and separate nuclei + membrane segmentation. It would be lovely to replace VPT, which has been somewhat buggy for me.

I tried Baysor extensively. Unfortunately, it does not perform well on my tissue: cell diameters vary a lot and the cells are elongated, which throws Baysor off.

If I recall correctly Sopa supports the implementation of additional segmentation methods? I guess I would just have to write a function that returns a mask for a patch?

quentinblampey commented 11 months ago

Great, then I think I will simply add a short tutorial to explain how to make it work! I'll let you know when it's added, probably in early January.

Okay, I see. Indeed, Baysor may not always be the best choice.

Yes, you can add your own segmentation method. Currently, we support only staining-based methods (no transcript-based ones). As you said, the custom method returns a mask for a patch (see doc for "method" here), and you can use the sopa segmentation generic-staining command of the CLI. Again, I'm happy to hear feedback from you: it will help me improve the documentation, and maybe add a tutorial for custom segmentation if needed!
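To make the "returns a mask for a patch" idea concrete, here is a minimal sketch of what such a function could look like. It is not Sopa's own code: the function name `threshold_segmentation`, its `channel` and `threshold` parameters, and the assumed input layout (a NumPy array of shape `(C, Y, X)` in, an integer mask of shape `(Y, X)` out, with 0 for background) are illustrative assumptions, so check the "method" documentation for the exact contract. The toy method thresholds one channel and labels each connected bright region.

```python
from collections import deque

import numpy as np


def threshold_segmentation(image: np.ndarray, channel: int = 0, threshold: float = 0.5) -> np.ndarray:
    """Hypothetical staining-based method: (C, Y, X) patch in, (Y, X) integer mask out.

    0 is background; each 4-connected bright region gets its own positive label.
    """
    foreground = image[channel] > threshold
    mask = np.zeros(foreground.shape, dtype=np.int32)
    n_rows, n_cols = foreground.shape
    next_label = 0

    # Visit foreground pixels in row-major order and flood-fill unlabelled ones.
    for y, x in zip(*np.nonzero(foreground)):
        if mask[y, x] != 0:
            continue  # already part of a labelled region
        next_label += 1
        mask[y, x] = next_label
        queue = deque([(y, x)])
        while queue:
            cy, cx = queue.popleft()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                inside = 0 <= ny < n_rows and 0 <= nx < n_cols
                if inside and foreground[ny, nx] and mask[ny, nx] == 0:
                    mask[ny, nx] = next_label
                    queue.append((ny, nx))
    return mask


# Toy patch with one channel and two separate bright regions.
patch = np.zeros((1, 6, 6))
patch[0, 0:2, 0:2] = 1.0
patch[0, 4:6, 3:6] = 1.0
mask = threshold_segmentation(patch)
```

A real method would replace the thresholding with a proper model (for elongated cells of varying size, for instance), but the input and output shapes are the part that matters for plugging it into a patch-based pipeline.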

quentinblampey commented 10 months ago

Hello @pakiessling,

I released sopa==1.0.1, in which I improved the CLI to make multiple rounds of segmentation easier. Everything is detailed in this tutorial. You can also check the new "CLI usage" tutorial; I think it will help you understand how to use the CLI overall. Let me know if everything works for you, and if it's clear enough! By the way, I recommend first testing it on the toy dataset, as explained in the tutorial.

Concerning the custom segmentation, it is now detailed in this tutorial!

pakiessling commented 10 months ago

Wow, thanks that must have been a lot of work. I will try it out ASAP.

One thing I didn't see: let's say I segment both nuclei and cells with Cellpose. Is there a way I can have matched nuclei and cell segmentation for every cell, i.e. to know which transcripts are located in the nucleus and which in the cytoplasm?

quentinblampey commented 10 months ago

It should be possible using the sopa aggregate command of the CLI (https://gustaveroussy.github.io/sopa/cli/#sopa-aggregate) if you run it twice: once to count the transcripts inside the nucleus, and once on the entire cell. Then, you can look at which transcripts are in common between the two aggregations!

So it's possible, but not really straightforward... Let me know if you make it work! Otherwise, I may try it myself (but I don't know when).
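As a rough illustration of comparing the two aggregations, the sketch below subtracts nuclear counts from whole-cell counts to estimate cytoplasmic counts. It does not use Sopa at all: the dictionaries `cell_counts` and `nucleus_counts`, the gene names, and the assumption that each nucleus already shares an ID with its parent cell are all hypothetical, and in practice the counts would come from the two aggregation outputs.

```python
from collections import Counter

# Hypothetical per-cell gene counts from two aggregation runs:
# one on whole-cell boundaries, one on nucleus boundaries.
# Assumption: nuclei were already matched to cells and share their IDs.
cell_counts = {
    "cell_1": Counter({"CD3E": 5, "EPCAM": 2}),
    "cell_2": Counter({"CD3E": 1, "KRT8": 4}),
}
nucleus_counts = {
    "cell_1": Counter({"CD3E": 3}),
    "cell_2": Counter({"KRT8": 4}),
}


def cytoplasm_counts(cell_counts, nucleus_counts):
    """Estimate cytoplasmic counts as whole-cell counts minus nuclear counts.

    Counter subtraction drops zero and negative results, so a gene found only
    in the nucleus does not appear in the cytoplasm counts.
    """
    return {
        cell_id: counts - nucleus_counts.get(cell_id, Counter())
        for cell_id, counts in cell_counts.items()
    }


cyto = cytoplasm_counts(cell_counts, nucleus_counts)
```

This only works if the nucleus boundary lies inside its cell boundary; a nucleus that spills outside its cell would be silently clipped by the subtraction.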

quentinblampey commented 10 months ago

Hello @pakiessling, I'm closing this issue because it was originally about the multiple segmentation rounds.

However, I still need to answer your question about counting transcripts in the nucleus and in the cytoplasm, so I have opened a separate issue here: #18! I will work on this at some point, but I still have not found the most efficient/natural way to do it.
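For the nucleus-to-cell matching that this follow-up question depends on, one possible approach (not something Sopa is stated to provide in this thread) is to assign each nucleus to the cell that covers most of its pixels. The sketch assumes both segmentations are available as integer label masks of the same shape, with 0 as background; the function name `match_nuclei_to_cells` and the toy masks are hypothetical.

```python
import numpy as np


def match_nuclei_to_cells(nucleus_mask: np.ndarray, cell_mask: np.ndarray) -> dict:
    """Map each nucleus label to the cell label covering most of its pixels.

    A nucleus that overlaps no cell is mapped to 0 (no parent).
    """
    parents = {}
    for nucleus_id in np.unique(nucleus_mask):
        if nucleus_id == 0:
            continue  # skip background
        # Cell labels found under this nucleus, ignoring background pixels.
        overlapping = cell_mask[nucleus_mask == nucleus_id]
        overlapping = overlapping[overlapping > 0]
        if overlapping.size == 0:
            parents[int(nucleus_id)] = 0
        else:
            labels, counts = np.unique(overlapping, return_counts=True)
            parents[int(nucleus_id)] = int(labels[np.argmax(counts)])
    return parents


# Toy masks: two cells (1 and 2) and three nuclei (5, 7 and 9).
cell_mask = np.array([
    [1, 1, 1, 0, 2, 2],
    [1, 1, 1, 0, 2, 2],
    [1, 1, 1, 0, 0, 0],
])
nucleus_mask = np.array([
    [0, 0, 0, 0, 0, 7],
    [0, 5, 5, 0, 0, 7],
    [0, 5, 5, 9, 0, 0],
])
parents = match_nuclei_to_cells(nucleus_mask, cell_mask)
```

With polygons instead of masks, the same idea applies using intersection areas rather than pixel counts.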