MICS-Lab / scyan

Biology-driven deep generative model for cell-type annotation in cytometry. Scyan is an interpretable model that also corrects batch-effect and can be used for debarcoding or population discovery.
https://mics-lab.github.io/scyan/
BSD 3-Clause "New" or "Revised" License

Spillover correction #29

Closed grst closed 5 months ago

grst commented 6 months ago

Scyan doesn't seem to do spillover correction on FCS import or at any later step, even if a spillover matrix is present in the FCS metadata. Why is this not necessary or recommended in this case?

quentinblampey commented 6 months ago

Spillover correction is generally not needed, except when the spillover is very strong.

I'm using fcsparser to parse the FCS file; it doesn't return the spillover matrix directly, but it does return all the FCS metadata. So it should be pretty easy to parse the spillover matrix from the metadata and add it to varp. I'll work on this; it shouldn't take too long.
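To illustrate the parsing step mentioned above, here is a minimal sketch, not scyan's actual implementation. It assumes the metadata value follows the FCS 3.1 spillover text layout: the number of channels n, then n channel names, then n×n coefficients in row-major order, all comma-separated. The function name `parse_spillover` is hypothetical, and the metadata key that holds this string varies between instruments, so the sketch takes the string value directly.

```python
import numpy as np


def parse_spillover(value):
    """Parse a spillover string of the form 'n,name1,...,namen,f11,f12,...'.

    Hypothetical helper: returns the channel names and an (n, n) matrix.
    """
    tokens = [token.strip() for token in value.split(",")]
    n = int(tokens[0])
    names = tokens[1 : n + 1]
    coefficients = [float(token) for token in tokens[n + 1 :]]
    if len(coefficients) != n * n:
        raise ValueError(
            f"Expected {n * n} coefficients, found {len(coefficients)}"
        )
    # Coefficients are listed row by row
    return names, np.array(coefficients).reshape(n, n)


# Toy value with two channels (made-up numbers, for illustration only)
names, matrix = parse_spillover("2,FITC,PE,1,0.1,0.05,1")
```

The resulting matrix could then be reordered to match the channel order of the data and stored in varp, as discussed in this thread.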

I would be happy to have your opinion on what you think would be the best implementation:

grst commented 6 months ago

I think I would prefer an approach where the spillover matrix is saved in varp and there's a separate function to apply spillover correction to the data matrix (potentially saving the result to a separate layer in AnnData).
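As a rough sketch of what such a separate function could look like (an illustration under stated assumptions, not the code of scyan or pytometry): it assumes the usual convention that the observed intensities equal the true intensities multiplied by the spillover matrix, so compensation amounts to multiplying by the matrix inverse. The name `compensate` and the plain dict standing in for AnnData layers are hypothetical.

```python
import numpy as np


def compensate(X, spillover):
    """Return compensated intensities, assuming X_observed = X_true @ spillover."""
    X = np.asarray(X, dtype=float)
    S = np.asarray(spillover, dtype=float)
    # Solve X_true @ S = X for X_true without forming the explicit inverse
    return np.linalg.solve(S.T, X.T).T


# Toy example with two channels (made-up numbers)
spillover = np.array([[1.0, 0.1], [0.05, 1.0]])
true_signal = np.array([[100.0, 0.0], [0.0, 200.0], [50.0, 50.0]])
observed = true_signal @ spillover

# A plain dict stands in for AnnData layers: the raw matrix stays untouched
layers = {"raw": observed}
layers["compensated"] = compensate(layers["raw"], spillover)
```

Keeping the compensated values in their own layer leaves the raw data available, so the correction can be compared against the uncompensated values or skipped entirely.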

In fact, for my data, I have already been doing that using pytometry and was more wondering if there could be any downsides with using compensated data for scyan.

More generally, I have been wondering if it would make sense to have a scverse core package for flow cytometry that handles fundamental tasks like IO, compensation and normalization, so that ecosystem packages like scyan, CytoPy or pytometry could depend on it instead of reimplementing everything from scratch. What do you think of that idea, and would you potentially be interested in contributing to or co-maintaining such a package? I could then bring it up in one of the scverse core team meetings.

quentinblampey commented 6 months ago

Okay perfect, I also prefer not using the spillover matrix during reading, and having a separate function to apply it.

No, there shouldn't be any downside to using compensated data: if you have a good spillover correction, you can definitely use it and train Scyan afterwards.

Yes, this is really a good idea! Also, the fcsparser library is not maintained (same for FlowUtils and fcswrite, which are two other dependencies of scyan), so it would be really beneficial to have a well-maintained IO/preprocessing library inside scverse. Actually, some months ago I already discussed improving the interoperability of our tools with the authors of Pytometry and CytoPy, but we never really got started. I think this initiative would be a great first step towards a better integration of these tools 🙂 So, yes, I would be really interested in contributing to this!

quentinblampey commented 6 months ago

Concerning your initial question: do you still want scyan to parse the spillover matrix during FCS reading, or should we wait until the new IO package is developed?

grst commented 6 months ago

It's not a priority for me, as it worked quite well to load+compensate the data with pytometry and then switch to Scyan.

quentinblampey commented 6 months ago

Okay! Then I'll keep it as a medium priority (probably a few weeks). I'll keep you updated.

quentinblampey commented 5 months ago

Sorry for the delay, I finally added this in scyan==1.6.1. scyan.read_fcs now adds the matrix under adata.varp["spillover_matrix"], and it can be applied with scyan.preprocess.compensate. Let me know if you have any issue with this!

cstrlln commented 3 months ago

Just to confirm, the data from spectral flow cytometry has to be unmixed but not necessarily compensated?

We always do some manual correction, and lately we have been testing Autospill to help with this. It is integrated in FlowJo, but I believe the algorithm is free to use: https://github.com/carlosproca/autospill Could something like Autospill be integrated in scyan, or do you think it doesn't add much?

quentinblampey commented 3 months ago

Yes, indeed, unmixing is recommended. Compensation is not always needed, so you can start without it.

Concerning Autospill, I think it is outside the scope of Scyan to include it, since Scyan is not designed for pre-processing. Of course it can be used separately, but I have not tried this tool yet, so I don't really know how valuable it is!