scverse / scanpy

Single-cell analysis in Python. Scales to >1M cells.
https://scanpy.readthedocs.io
BSD 3-Clause "New" or "Revised" License

Adding MAGIC to Scanpy #187

Closed: scottgigante closed this issue 6 years ago

scottgigante commented 6 years ago

Hi,

We spoke a while ago about adding PHATE and eventually MAGIC to Scanpy. MAGIC has just been submitted to CRAN and is in a stable state.

How should a tool such as MAGIC interact with Scanpy? Do you currently have any imputation methods included in the package that I can use to model the API?

Thanks, Scott

falexwolf commented 6 years ago

Hi Scott,

Sure, I remember! :smile: For some reason, I forgot to mention you personally in the release notes; this is now fixed. Sorry about that!

You could add MAGIC as a preprocessing step, similar to DCA, in the imputation section: http://scanpy.readthedocs.io/en/latest/api/index.html#preprocessing-pp.

In terms of code, I would also adopt the conventions of DCA: https://github.com/theislab/scanpy/blob/master/scanpy/preprocessing/dca.py

We had some discussions on how to do this best: https://github.com/theislab/scanpy/issues/142 and https://github.com/theislab/scanpy/pull/186.

If you think you have better conventions, happy to adopt these. DCA is also not yet released...

Best, Alex

wangjiawen2013 commented 6 years ago

@falexwolf MAGIC uses a square root transformation rather than the commonly used log transformation, which makes it incompatible with batch correction methods such as CCA and MNN. Is DCA compatible with MNN and CCA?

scottgigante commented 6 years ago

@wangjiawen2013 we recommend using square root transform with MAGIC but it's certainly not incompatible. So long as the inputs have been library size normalized and transformed with any of log, sqrt, arcsinh or some other sublinear transformation, MAGIC will work just fine.
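The preprocessing described above (library size normalization followed by a sublinear transform) can be sketched as follows. This is a minimal, dependency-free illustration and not MAGIC's or Scanpy's actual implementation: the function names `library_size_normalize` and `transform` are hypothetical, scaling to the median library size is an assumption, and the "log" option is taken to mean log(1 + x) so that zero counts stay finite.

```python
import math
import statistics

def library_size_normalize(counts, target=None):
    """Scale each cell (row) so that its total count equals `target`.

    If `target` is None, the median library size across cells is used
    (an assumption made for this sketch).
    """
    sizes = [sum(row) for row in counts]
    if target is None:
        target = statistics.median(sizes)
    return [
        [x * target / size if size > 0 else 0.0 for x in row]
        for row, size in zip(counts, sizes)
    ]

# Sublinear transforms named in the thread; "log" is assumed to be log(1 + x).
TRANSFORMS = {
    "sqrt": math.sqrt,
    "log": math.log1p,
    "arcsinh": math.asinh,
}

def transform(matrix, method="sqrt"):
    """Apply one of the sublinear transforms elementwise."""
    f = TRANSFORMS[method]
    return [[f(x) for x in row] for row in matrix]

# Toy count matrix: 3 cells (rows) x 3 genes (columns).
counts = [
    [4, 0, 12],
    [1, 3, 0],
    [10, 10, 20],
]

normalized = library_size_normalize(counts)
prepared = transform(normalized, method="sqrt")
```

After normalization every cell has the same total count, so the transform is applied to comparable values. Swapping `method` for `"log"` or `"arcsinh"` changes only the elementwise function, which is why the choice of transform does not by itself rule out running an imputation step afterwards.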

scottgigante commented 6 years ago

Also I'm surprised to see I never left a note on your message @falexwolf : thanks! I'm working on the API now, will send in a PR when it's done or leave a note here if I think the DCA API could do with some modification.

wangjiawen2013 commented 6 years ago

I found some negative values in the imputed data. How were they generated?

scottgigante commented 6 years ago

@wangjiawen2013 if you have any questions about MAGIC, I recommend you post them in the MAGIC repo. The negative values are an artifact of the imputation process, but the absolute values of expression are not really important, since normalized scRNAseq data is only really a measure of relative expression anyway.