hms-dbmi / dseqr

single-cell and bulk RNA-seq analyses from counts → pathways → drug candidates.
https://docs.dseqr.com

Seurat integration can improve and deteriorate clustering #74

Closed alexvpickering closed 3 years ago

alexvpickering commented 5 years ago

Here are the original labels for the healthy and diseased SJIA lung samples, plotted with t-SNE coordinates computed after integration:

image

And here are the clusters I labelled after integration:

image

For the most part, the Seurat integration seems to improve the clustering of the samples. For example, some diseased cells previously labelled as B-cells seem better labelled as RBCs (note that the expression values shown are pre-integration):

image

In contrast, I believe that the distinct Alveolar Epithelium clusters are an artefact: there are no Smooth Muscle cells in the healthy sample, so the integration algorithm matches the healthy Alveolar Epithelium cluster with the diseased Smooth Muscle cluster. Some top marker genes share expression between healthy and diseased cells within this cluster:

image

There are other top marker genes that are very distinct between healthy and diseased cells in this Alveolar Epithelium cluster:

image
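
The contrast between shared and distinct marker genes within one cluster can be checked numerically as well as visually. The following is a minimal sketch, not the code used for these reports: it assumes pre-integration expression values for the cells of a single cluster, and every name in it (`condition_log2fc`, `distinct_markers`, `geneA`, `geneB`, the threshold of 1.0) is hypothetical, with made-up toy values.

```python
import math
from statistics import mean


def condition_log2fc(expr, conditions, pseudocount=1.0):
    """Per-gene log2 fold-change of diseased over healthy mean expression.

    expr: dict mapping gene name -> list of per-cell expression values.
    conditions: list of "healthy"/"diseased" labels, one per cell.
    Both conditions must be present; statistics.mean raises otherwise.
    """
    result = {}
    for gene, values in expr.items():
        healthy = [v for v, c in zip(values, conditions) if c == "healthy"]
        diseased = [v for v, c in zip(values, conditions) if c == "diseased"]
        ratio = (mean(diseased) + pseudocount) / (mean(healthy) + pseudocount)
        result[gene] = math.log2(ratio)
    return result


def distinct_markers(expr, conditions, min_abs_log2fc=1.0):
    """Genes whose within-cluster expression differs strongly by condition."""
    fold_changes = condition_log2fc(expr, conditions)
    return sorted(g for g, fc in fold_changes.items() if abs(fc) >= min_abs_log2fc)


# Toy data (hypothetical): four cells from one post-integration cluster.
conditions = ["healthy", "healthy", "diseased", "diseased"]
expr = {
    "geneA": [5, 5, 5, 5],  # shared between conditions
    "geneB": [0, 0, 7, 7],  # expressed only in diseased cells
}

fold_changes = condition_log2fc(expr, conditions)
flagged = distinct_markers(expr, conditions)
```

A cluster in which many top markers are flagged this way is a candidate for two cell types having been merged by the integration, as suspected here for Alveolar Epithelium.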

Additionally, if the Smooth Muscle cluster is excluded prior to integration, the Alveolar Epithelium cells do not form two distinct clusters after integration. Here is the full PDF for the combined reports:

lung_combined_markers.pdf
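
The exclusion test described above can be sketched as follows. This is an illustration under stated assumptions rather than the code behind these reports: it assumes each cell has a pre-integration label and a sample of origin, and the function names (`unmatched_labels`, `drop_labels`) and toy cell IDs are hypothetical. It finds labels that are missing from at least one sample and drops those cells before the samples are passed to integration.

```python
from collections import defaultdict


def unmatched_labels(labels, samples):
    """Return labels that do not occur in every sample."""
    seen = defaultdict(set)
    for label, sample in zip(labels, samples):
        seen[label].add(sample)
    all_samples = set(samples)
    return sorted(label for label, found in seen.items() if found != all_samples)


def drop_labels(cells, labels, to_drop):
    """Keep only the cells whose label is not in to_drop."""
    drop = set(to_drop)
    return [cell for cell, label in zip(cells, labels) if label not in drop]


# Toy data (hypothetical cell IDs); Smooth Muscle occurs only in the diseased sample.
cells = ["c1", "c2", "c3", "c4", "c5"]
labels = ["Alveolar Epithelium", "Alveolar Epithelium", "Smooth Muscle",
          "B-cells", "B-cells"]
samples = ["healthy", "diseased", "diseased", "healthy", "diseased"]

missing = unmatched_labels(labels, samples)
kept = drop_labels(cells, labels, missing)
```

Whether dropping an unmatched cell type is appropriate depends on the analysis; here it serves only as a check on whether the split Alveolar Epithelium clusters persist.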

And for documentation, commits related to generating these reports include: