KPMP / Cell-State-Atlas-2022

Code for a multimodal atlas of healthy and injured cell states and niches in the human kidney
MIT License

snCv3 scCv3 : mapping & data integration question #3

Open atul-sethi opened 9 months ago

atul-sethi commented 9 months ago


Thanks for sharing the code and data from the kidney atlas paper for reuse and exploration. I have been trying to map the single-cell data onto the single-nucleus data for integration, UMAP projection, and annotation. I am using data from the paper itself: GSE183277 as the reference data (snCv3) and GSE183276 as the query data (scCv3).

From SourceByTechnology/snCv3_scCv3_SNARE2/, I am using snCv3_Clustering_Annotation.R to run pagoda on the reference data counts, and snCv3-scCv3_Data_Integration.R to map the scCv3 data onto snCv3 and annotate it.

The mapping is not working as expected. This is the UMAP plot (query vs. reference split) I get after running the analyses: plot2

What I expected to see was the following plot (created from the snCv3 and scCv3 Seurat objects you shared on GEO): plot1

Further, the annotation (class / subclass.l1 ....) I get for the query data (scCv3 counts) is also substantially different from the annotation in the scCv3 Seurat object you shared on GEO.
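To put a number on "substantially different", the two sets of labels can be cross-tabulated on the cells they share. The following is only a minimal sketch in plain Python with toy data: the `label_agreement` helper, the barcodes, and the labels are hypothetical and are not taken from the repository or the GEO objects.

```python
from collections import Counter

def label_agreement(mine, published):
    """Compare two barcode -> label mappings on their shared barcodes.

    Returns the fraction of shared barcodes with identical labels and a
    Counter of (published_label, my_label) pairs.
    """
    shared = sorted(set(mine) & set(published))
    pairs = Counter((published[b], mine[b]) for b in shared)
    n_match = sum(n for (ref, new), n in pairs.items() if ref == new)
    frac = n_match / len(shared) if shared else float("nan")
    return frac, pairs

# Toy data (hypothetical barcodes and labels, for illustration only).
published = {"c1": "PT", "c2": "PT", "c3": "TAL", "c4": "IMM"}
mine = {"c1": "PT", "c2": "TAL", "c3": "TAL", "c5": "EC"}

frac, pairs = label_agreement(mine, published)
print(f"agreement on shared cells: {frac:.2f}")
for (ref, new), n in sorted(pairs.items()):
    print(f"{ref} -> {new}: {n}")
```

Off-diagonal pairs that dominate the table would show which published labels my run is reassigning, which may help narrow down where the mapping goes wrong.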

Given that I get a fairly similar UMAP for the reference data (snCv3) by running pagoda on the count data myself and using its principal components to recompute the UMAP, I think the issue might be on the query (scCv3) side.

Is the "Premiere_LD_RawCounts.RDS" data that you read in snCv3-scCv3_Data_Integration.R different from the count tables you provide in GSE183276?
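If both count tables were available, a direct comparison would settle this. Below is a minimal sketch of such a check in plain Python, assuming each table has been exported as a sparse mapping from (gene, barcode) to count. The `compare_counts` helper and the toy values are hypothetical; nothing here reads the actual files.

```python
def compare_counts(a, b):
    """Summarize differences between two sparse count tables.

    a, b: dicts mapping (gene, barcode) -> integer count, zero entries omitted.
    Genes or cells with only zero counts are therefore invisible to this check.
    """
    genes_a, cells_a = {g for g, _ in a}, {c for _, c in a}
    genes_b, cells_b = {g for g, _ in b}, {c for _, c in b}
    shared_genes = genes_a & genes_b
    shared_cells = cells_a & cells_b
    # Restrict the entry-wise comparison to the gene/cell overlap.
    keys = {k for k in set(a) | set(b)
            if k[0] in shared_genes and k[1] in shared_cells}
    n_diff = sum(1 for k in keys if a.get(k, 0) != b.get(k, 0))
    return {
        "genes_only_in_a": len(genes_a - genes_b),
        "genes_only_in_b": len(genes_b - genes_a),
        "cells_only_in_a": len(cells_a - cells_b),
        "cells_only_in_b": len(cells_b - cells_a),
        "shared_entries_differing": n_diff,
    }

# Toy tables (hypothetical values, for illustration only).
geo_counts = {("UMOD", "c1"): 5, ("UMOD", "c2"): 1,
              ("LRP2", "c1"): 2, ("NPHS1", "c3"): 4}
script_counts = {("UMOD", "c1"): 5, ("UMOD", "c2"): 3, ("LRP2", "c1"): 2}

summary = compare_counts(geo_counts, script_counts)
for name, value in summary.items():
    print(f"{name}: {value}")
```

Cells or genes present in only one table would point to a filtering difference, while differing entries on the shared overlap would point to different underlying counts.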

winfrees commented 6 months ago

Thank you for your interest in the data and code from our atlas paper. I would also like to apologize for the delay in response. The best person to address this question is the author Blue Lake (@b1lake). I've tagged him here. I will follow up in a week to see if he has responded. Thanks.