AlexsLemonade / alsf-scpca

Management and analysis tools for ALSF Single-cell Pediatric Cancer Atlas data.
BSD 3-Clause "New" or "Revised" License

Add linear regression analysis of mean gene expression across tools #124

Closed allyhawkins closed 3 years ago

allyhawkins commented 3 years ago

Closes #123. This PR adds a small section to the notebook that compares gene expression across tools. The comment in https://github.com/AlexsLemonade/alsf-scpca/pull/116#pullrequestreview-718525671 suggests that we should look not only at genes that are unique to each tool, but also at how similar gene expression is across the tools. To do this, I added a linear regression analysis of mean gene expression between cellranger and each alevin-fry configuration (cr-like and cr-like-em) and examined the residuals for each gene.

If a tool is over- or underestimating gene expression, the residuals will reflect that. I plotted the residuals for each gene in each sample against the mean gene expression and labeled the genes with abs(residual) > 1 to identify genes that may be consistently over- or underrepresented in alevin-fry compared to cellranger. Most of the labeled genes were ribosomal genes. Interestingly, there were also far fewer outliers in the single-nuclei samples than in the single-cell samples.
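For readers who want to see the residual check in concrete terms, here is a minimal sketch in plain Python. It is not the notebook's code. The function names, the gene labels, and the toy values are all hypothetical, and it assumes the regression is an ordinary least squares fit of alevin-fry mean expression on cellranger mean expression for a single sample. Only the abs(residual) > 1 cutoff is taken from the description above.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def flag_outlier_genes(genes, ref_mean, test_mean, threshold=1.0):
    """Return {gene: residual} for genes with abs(residual) > threshold.

    ref_mean:  per-gene mean expression from the reference tool (e.g. cellranger)
    test_mean: per-gene mean expression from the tool being compared
    """
    slope, intercept = fit_line(ref_mean, test_mean)
    flagged = {}
    for gene, ref, test in zip(genes, ref_mean, test_mean):
        # Residual = observed value minus the value predicted by the fitted line.
        residual = test - (slope * ref + intercept)
        if abs(residual) > threshold:
            flagged[gene] = residual
    return flagged


# Toy data (hypothetical): ten genes that agree across tools, except gene_4,
# which the compared tool reports 5 units higher than the reference.
genes = [f"gene_{i}" for i in range(10)]
ref_mean = [float(i) for i in range(10)]
test_mean = [r + 5.0 if g == "gene_4" else r for g, r in zip(genes, ref_mean)]

flagged = flag_outlier_genes(genes, ref_mean, test_mean)
print(flagged)
```

In this toy example only gene_4 exceeds the cutoff, with a positive residual, meaning it is higher in the compared tool than the fitted line predicts. A negative residual would indicate the opposite.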

I believe I have addressed the feedback in https://github.com/AlexsLemonade/alsf-scpca/pull/116#pullrequestreview-718525671, but please let me know if there were other comparisons that you had in mind.