AlexsLemonade / sc-data-integration


Use marker genes from defined cell types as reference for CellAssign #223

Closed allyhawkins closed 1 year ago

allyhawkins commented 1 year ago

Stacked on #221

This PR continues the analysis started in #221. Here I split out the comparison that uses the top marker genes as a reference for cell assignment into a separate notebook. Before filing this, I addressed some of the concerns raised in the initial review:

For the "circular" bit: how did you decide on 10 markers for each cell type? I also wonder if we might want to use "any" for these markers, or something in between: I kind of assume that most marker lists are some kind of a mix, where genes that mark a set of related types might be included in all of their marker lists. The fact that CellAssign uses a binary matrix certainly implies that it would expect some of that kind of data. (But does CellAssign require upregulation for a marker? I kind of assume so, but it is worth checking.)

Here's a copy of the completed notebook: cell-assign-sarcoma-marker-genes.nb.html.zip

allyhawkins commented 1 year ago

Thanks for taking a look at this @jashapiro. I added some explanatory text around the analysis that was already here, but I didn't update any plots or any of the actual analysis. I will note that I did try "any" rather than "all" for the FindMarkers strategy and there was no noticeable difference, which is why I kept using "all". I do think a logFC cutoff is a good idea, but I'm not sure we need it at this stage.
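
For reference, here is a minimal sketch of how a top-N marker list with a logFC cutoff could be turned into the kind of binary gene-by-cell-type matrix discussed above. It is written in Python with pandas purely for illustration and is not the code used in the notebook. The function name `build_marker_matrix`, the column names (`cell_type`, `gene`, `logfc`), the cutoff values, and the toy logFC numbers are all assumptions made up for this example.

```python
import pandas as pd


def build_marker_matrix(markers, n_top=10, min_logfc=0.0):
    """Build a binary gene x cell type matrix from a long table of marker stats.

    `markers` is a hypothetical table with one row per (cell type, gene) pair
    and the columns `cell_type`, `gene`, and `logfc`.
    """
    # Keep only upregulated genes that pass the logFC cutoff.
    kept = markers[markers["logfc"] > min_logfc]

    # Rank by logFC and keep at most n_top genes for each cell type.
    top = (
        kept.sort_values("logfc", ascending=False)
        .groupby("cell_type")
        .head(n_top)
    )

    # Cross-tabulate genes against cell types; clip so every entry is 0 or 1.
    matrix = pd.crosstab(top["gene"], top["cell_type"]).clip(upper=1)
    return matrix.astype(int)


# Toy input with made-up logFC values, for illustration only.
markers = pd.DataFrame(
    {
        "cell_type": ["T cell", "T cell", "T cell", "B cell", "B cell", "B cell"],
        "gene": ["CD3E", "PTPRC", "MS4A1", "MS4A1", "PTPRC", "CD3E"],
        "logfc": [2.5, 1.2, -1.0, 3.0, 1.1, -0.8],
    }
)

marker_matrix = build_marker_matrix(markers, n_top=2, min_logfc=0.5)
print(marker_matrix)
```

In this toy input, PTPRC passes the cutoff for both cell types and so is marked in both columns, which is the "mix" case raised in the review: a gene shared by related cell types can appear in more than one marker list. Raising `min_logfc` or lowering `n_top` makes each list more specific at the cost of fewer markers per cell type.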