RGLab / FAUST

Full annotation using shape-constrained trees
GNU General Public License v3.0

Linking expression of additional markers to annotation clusters #3

Closed: onecarbon closed this issue 4 years ago

onecarbon commented 4 years ago

Thanks for this software!

I am hoping to use it to define cell annotations and then compare the expression of a marker (one not used in annotation) across the annotation clusters. I am having trouble linking the per-cell FAUST annotation to markers that were not used in the annotation strategy. For instance, levelExprs.rds in the levelData outputs only includes the markers used in annotation (and appears to be identical to exprsMat.rds in sampleData?).

Is there an easy way to pull this data from the FAUST outputs? Thanks!

evangreene commented 4 years ago

Hi,

There are two possible ways to do this. In either case, you should first look up the cluster annotation you'd like to study in the faustData/metaData/colNameMap.rds file. You can use this file to link phenotypes reported in faustData/faustCountMatrix.rds (stored in the newColNames column) to their internal representation (stored in the faustColNames column).
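As a rough illustration of that lookup step, here is a minimal sketch in R. It assumes colNameMap.rds loads (via readRDS) as a data frame with the two columns named above; the helper `lookup_internal_name` and the toy phenotype strings are made up for the example and are not part of FAUST.

```r
# Toy stand-in for readRDS("faustData/metaData/colNameMap.rds").
# The two column names come from the description above; the values are
# made up, and the data-frame shape is an assumption.
col_name_map <- data.frame(
  faustColNames = c("CD4~2~2~CD8~1~2~", "CD4~1~2~CD8~2~2~"),
  newColNames   = c("CD4+CD8-", "CD4-CD8+"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a phenotype as reported in faustCountMatrix.rds
# to its internal representation.
lookup_internal_name <- function(col_name_map, phenotype) {
  hit <- col_name_map$faustColNames[col_name_map$newColNames == phenotype]
  if (length(hit) != 1L) {
    stop("expected exactly one match for phenotype: ", phenotype)
  }
  hit
}

internal_name <- lookup_internal_name(col_name_map, "CD4+CD8-")
```

Failing loudly on zero or multiple matches is a design choice here, so a mistyped phenotype does not silently select no cells later on.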

Once you have the internal representation you're interested in, you can find cells with that annotation in the files faustData/sampleData/[sampleName]/faustAnnotation.csv. Each row of this csv corresponds to the cell in the same row of the expression matrix faustData/sampleData/[sampleName]/exprsMat.rds. If the marker you wish to study was set as one of the activeChannels in the faust call, you can get the expression data sample by sample: find the rows of faustAnnotation.csv with the desired internal representation, take the same rows of exprsMat.rds, and extract the relevant marker column.
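The row-wise link described above might look something like the following sketch for a single sample. It uses toy in-memory data in place of the real files; the helper `marker_for_cluster` is hypothetical, and how faustAnnotation.csv should be read (separator, header) is not stated in this thread, so check an actual file before loading it.

```r
# Toy stand-ins for one sample's files (all values are made up):
#   exprs_mat  ~ readRDS("faustData/sampleData/[sampleName]/exprsMat.rds")
#   annotation ~ the annotation column of faustAnnotation.csv, one entry per cell
exprs_mat <- matrix(
  c(1.2, 0.3, 2.5, 0.1,   # CD4 column
    0.4, 1.9, 0.2, 2.2),  # CD8 column
  ncol = 2,
  dimnames = list(NULL, c("CD4", "CD8"))
)
annotation <- c("CD4~2~2~CD8~1~2~", "CD4~1~2~CD8~2~2~",
                "CD4~2~2~CD8~1~2~", "CD4~1~2~CD8~2~2~")

# Hypothetical helper: expression of one marker for the cells that carry
# the given internal annotation. Relies on row i of the annotation
# describing row i of the expression matrix.
marker_for_cluster <- function(exprs_mat, annotation, internal_name, marker) {
  stopifnot(nrow(exprs_mat) == length(annotation))
  stopifnot(marker %in% colnames(exprs_mat))
  exprs_mat[annotation == internal_name, marker]
}

cd4_in_cluster <- marker_for_cluster(
  exprs_mat, annotation, "CD4~2~2~CD8~1~2~", "CD4"
)
```

The length check guards the one assumption the whole approach depends on: that the annotation file and the expression matrix describe the same cells in the same order.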

On the other hand, if you did not set the marker you wish to study as one of the markers in the activeChannels vector, you can extract the marker data directly from the gating set for each sample, and then subset the extracted marker data to the rows of faustAnnotation.csv that contain the relevant internal faust representation from colNameMap.rds.
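A sketch of this second route, under stated assumptions: the helper `collect_marker` and its two accessor arguments are hypothetical, and the toy lists stand in for real data. With a real gating set, the marker accessor would presumably pull the channel from the same starting population that was given to faust (for example with flowWorkspace's `gh_pop_get_data()` followed by flowCore's `exprs()`), so that the extracted rows line up with faustAnnotation.csv. That alignment is an assumption to verify, for instance by comparing cell counts per sample.

```r
# Hypothetical helper: collect one marker for one annotation across samples.
# get_annotation(sample_name)    -> per-cell internal annotations for the sample
# get_marker_values(sample_name) -> the marker for every cell of the sample,
#                                   in the same order as the annotation rows
collect_marker <- function(sample_names, get_annotation,
                           get_marker_values, internal_name) {
  per_sample <- lapply(sample_names, function(sample_name) {
    annotation <- get_annotation(sample_name)
    values <- get_marker_values(sample_name)
    # Cell counts must agree, otherwise the rows cannot be matched up.
    stopifnot(length(annotation) == length(values))
    keep <- annotation == internal_name
    data.frame(
      sample = rep(sample_name, sum(keep)),
      value = values[keep],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, per_sample)
}

# Toy data standing in for two samples (all values are made up).
toy_annotation <- list(s1 = c("A", "B", "A"), s2 = c("B", "A"))
toy_marker <- list(s1 = c(10, 20, 30), s2 = c(40, 50))

result <- collect_marker(
  sample_names = c("s1", "s2"),
  get_annotation = function(s) toy_annotation[[s]],
  get_marker_values = function(s) toy_marker[[s]],
  internal_name = "A"
)
```

Passing the accessors in as functions keeps the subsetting logic independent of how the gating set is actually read, which is the part most likely to differ between setups.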

Hope this helps. Please let me know if this does not resolve your issue.

Thanks, Evan

onecarbon commented 4 years ago

Thanks!

That is helpful. In the interim, I ran the analysis twice, with and without the marker I was interested in among the activeChannels. I could then merge the two outputs to get the annotation (independent of the marker) and the marker expression for each cell.

I will try it again with your suggestion as I become more familiar with these data structures.