almaan / stereoscope

Spatial mapping of cell types by integration of transcriptomics data
MIT License
87 stars, 26 forks

H5AD Sparse Matrix Support #8

Status: Closed (nik1sto closed 4 years ago)

nik1sto commented 4 years ago

Hi Alma,

here I am again!

It would be convenient if you could add the functionality to automatically convert sparse matrices from h5ad files to dense ones, since processing them in sparse format doesn't work.

Best regards, Nik

almaan commented 4 years ago

Hello Nik,

Well spotted - I'm on that! I'll let you know when the implementation is in place!

Best Alma

almaan commented 4 years ago

Hi Nik,

I've pushed some updates to the master branch, which should now be able to handle any h5ad file that follows the structure of a proper AnnData object. I've tested the feature on a couple of different datasets, which all seem to work fine (including both dense and sparse count matrices). If this is true for you as well, feel free to close the issue. If not, it would be really great if you could post the error message and I'll try to figure it out from there.

Thanks again, best Alma
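
Editorial sketch of the dense-or-sparse handling discussed above. This is not stereoscope's actual code; `to_dense` is a hypothetical helper, and it assumes the count matrix is either a SciPy sparse matrix or something NumPy can turn into an array:

```python
import numpy as np
from scipy import sparse


def to_dense(x):
    """Return the count matrix as a dense NumPy array.

    Hypothetical helper: accepts either a SciPy sparse matrix
    or an already-dense array-like and always returns an ndarray.
    """
    if sparse.issparse(x):
        # Sparse formats (CSR, CSC, ...) expose toarray() for densifying.
        return x.toarray()
    # Dense input is passed through as an ndarray.
    return np.asarray(x)
```

With the anndata package, this could be applied as `to_dense(anndata.read_h5ad(path).X)`, where `path` stands for your own file. Note that densifying a large count matrix can require a lot of memory.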

nik1sto commented 4 years ago

Yes, works now! Thank you!

Is it intended that all the deconvolution output is saved in visium now, and no longer depends on the input file name?

almaan commented 4 years ago

Great to hear!

I'm not quite sure what you mean by "saved in visium", but as of now the proportion estimates are still saved as tsv files. These should, however, be easy to integrate into the already existing h5ad objects if that is desirable (for example, by storing them in the obsm property).

The rationale for this is that (i) I don't want to modify the user's existing data files (provided as input), and (ii) I don't want to generate multiple copies of large files, which is what would happen if a copy of the input with the estimates included were generated as output. That said, I might add option (i), writing the estimates into the existing input file, as a feature in coming updates; I need to think a bit more about it.

Thanks for all the great feedback!
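
Editorial sketch of the obsm idea mentioned above, not a stereoscope feature. `attach_proportions` and the key `"stereoscope"` are hypothetical names. It assumes the tsv has one row per spot and one column per cell type, with the spot labels in the first column, and that those labels match the object's `obs_names`:

```python
import io
from types import SimpleNamespace

import pandas as pd


def attach_proportions(adata, tsv, key="stereoscope"):
    """Store a spots x cell-types table in adata.obsm[key].

    Hypothetical helper. `adata` only needs `obs_names` and `obsm`,
    so an AnnData object should work; `tsv` is a path or file-like object.
    """
    props = pd.read_csv(tsv, sep="\t", index_col=0)
    props.index = props.index.astype(str)
    # Refuse to continue if any spot lacks an estimate.
    missing = pd.Index(adata.obs_names).difference(props.index)
    if len(missing) > 0:
        raise ValueError(f"{len(missing)} spots have no proportion estimates")
    # Reorder the rows so that they follow the order of the observations.
    adata.obsm[key] = props.loc[list(adata.obs_names)]
    return adata


# Small demo with a stand-in object, so it runs without anndata installed.
demo = SimpleNamespace(obs_names=["s2", "s1"], obsm={})
demo_tsv = io.StringIO("\tA\tB\ns1\t0.2\t0.8\ns2\t0.6\t0.4\n")
attach_proportions(demo, demo_tsv)
print(demo.obsm["stereoscope"])
```

With a real AnnData object, the result could then be saved with `adata.write_h5ad(...)`. Whether to overwrite the input or write a copy is exactly the trade-off described above.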

nik1sto commented 4 years ago

Thank you for your fast responses :) My bad, I had only seen that the tsv is saved in differently named folders than before, but that was due to the design of our pipeline. Directly having it in the h5ad also sounds like a convenient feature!