Open ggruenhagen3 opened 1 year ago
Hi @ggruenhagen3
Great to hear that cell2location helps.
Cell2location already offers similar functionality: https://cell2location.readthedocs.io/en/latest/notebooks/cell2location_tutorial.html#Estimate-cell-type-specific-expression-of-every-gene-in-the-spatial-data-(needed-for-NCEM)
I think it's impossible to say that a location has a discrete number of cells because the data comes from a 2D section - even if you see 2 nuclei it doesn't mean that RNA was isolated from 2 complete cells. This is reflected in cell2location estimates being a continuum of cell abundance/cell density. However, you can estimate the expected number of RNA counts coming from each cell type at every location for every gene (computed in the tutorial above). You can filter {cell_type * location * gene} data to exclude {cell_type * location} pairs where cell abundance is too small and RNA count is too small, and then normalise expected counts by cell abundance (done in the NCEM workflow).
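As a rough illustration of the filter-then-normalise step described above, here is a minimal sketch in plain Python. The data layout, the toy numbers, the function name and both thresholds are hypothetical and are not part of cell2location or NCEM. The sketch also assumes one reading of the filter: a {cell_type * location} pair is dropped if either its abundance or its total expected RNA count falls below the threshold.

```python
# Toy inputs; every name, number and threshold here is hypothetical.
# abundance[(cell_type, location)] -> estimated cell abundance (continuous)
abundance = {
    ("A", "spot1"): 2.1,
    ("B", "spot1"): 0.05,
    ("A", "spot2"): 0.9,
}
# expected_counts[(cell_type, location)] -> {gene: expected RNA counts}
expected_counts = {
    ("A", "spot1"): {"g1": 42.0, "g2": 10.5},
    ("B", "spot1"): {"g1": 0.3, "g2": 0.1},
    ("A", "spot2"): {"g1": 9.0, "g2": 0.0},
}

MIN_ABUNDANCE = 0.5      # assumed cut-off, tune for your data
MIN_TOTAL_COUNTS = 5.0   # assumed cut-off, tune for your data


def per_cell_expression(abundance, expected_counts,
                        min_abundance, min_total_counts):
    """Drop low-confidence pairs, then divide counts by cell abundance."""
    result = {}
    for pair, gene_counts in expected_counts.items():
        n_cells = abundance[pair]
        total = sum(gene_counts.values())
        # Assumption: a pair is excluded if either value is too small.
        if n_cells < min_abundance or total < min_total_counts:
            continue
        # Expected counts per unit of cell abundance for this pair.
        result[pair] = {g: c / n_cells for g, c in gene_counts.items()}
    return result


per_cell = per_cell_expression(
    abundance, expected_counts, MIN_ABUNDANCE, MIN_TOTAL_COUNTS
)
for pair, genes in per_cell.items():
    print(pair, genes)
```

With real data, the same logic would be applied to the cell abundance estimates and the cell-type-specific expected counts computed in the tutorial section linked above.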
I hope this makes sense.
@vitkl that makes sense, I'll give that link you sent a try. Thanks!
Hello,
I have a quick question regarding the extracted cell-type-specific expression. Are these the expected numbers of raw UMIs per cell type in that spot? Would I have to normalize them for spot-specific library size?
> Are these expected number of raw UMIs per cell-type in that spot?
Yes, but the way this is currently computed, the numbers are not integers.
> Would I have to normalize them for spot-specific library-size?
Yes, you can try using adata.uns['mod']['post_sample_means']['detection_y_s'], the per-spot estimate of technical RNA detection sensitivity.
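One way this normalisation might look is sketched below in plain Python. It assumes that "normalising" means dividing each spot's expected counts by that spot's detection sensitivity, which the reply above does not spell out. The function name, data layout and toy values are hypothetical; with real data the sensitivities would be read from adata.uns['mod']['post_sample_means']['detection_y_s'].

```python
# Toy stand-ins; names and numbers are hypothetical.
# detection_by_spot[spot] -> per-spot technical RNA detection sensitivity
detection_by_spot = {"spot1": 2.0, "spot2": 0.5}

# counts_by_spot[spot] -> {gene: expected counts for one cell type}
counts_by_spot = {
    "spot1": {"g1": 40.0, "g2": 10.0},
    "spot2": {"g1": 4.0, "g2": 1.0},
}


def normalise_by_detection(counts_by_spot, detection_by_spot):
    """Divide each spot's expected counts by that spot's sensitivity.

    Assumption: division is the intended correction for spot-specific
    library size; this is a sketch, not the cell2location implementation.
    """
    normalised = {}
    for spot, gene_counts in counts_by_spot.items():
        sensitivity = detection_by_spot[spot]
        normalised[spot] = {g: c / sensitivity for g, c in gene_counts.items()}
    return normalised


normalised = normalise_by_detection(counts_by_spot, detection_by_spot)
for spot, genes in normalised.items():
    print(spot, genes)
```

After this step, spots with different detection sensitivities are on a more comparable scale, although the values are still non-integer expected counts.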
Hi,
Thank you for building this incredible tool! I especially like that cell2location estimates the number of cells from each cell type in a spot, not just the proportion. I think it would be really cool if the tool could also reconstruct a single cell expression matrix. In other words, if spot1 is predicted to have two cells from celltype A and one cell from celltype B, then the reconstructed single cell matrix for spot1 would have three cells with the gene expression profile constructed from spot1 and split according to cell type. This sort of thing is implemented by SpaTalk and I can kind of use the cell2location results in SpaTalk. But I think directly using the models built by cell2location would be more accurate than SpaTalk having to re-estimate things (or whatever method it uses for single cell matrix reconstruction).
Thanks, George