carmonalab / ProjecTILs

Interpretation of cell states using reference single-cell maps
GNU General Public License v3.0

Question regarding min.conf threshold #76

Closed gsoriaa closed 2 months ago

gsoriaa commented 6 months ago

Hello,

First of all, congrats on the great tool and thanks a lot for the really well-detailed vignettes.

I'm working with some T-cell data and I wondered why min.conf is set to 0.2 by default, as it seems quite low at first glance. I looked through the vignettes and the raw code for advice on what to consider when changing this parameter, but I couldn't find any.

I have tried different thresholds, but I would like to be as confident as possible in my annotation. I tried values of min.conf from 0.2 to 0.5 and found that with min.conf=0.5 I lose almost half of my T cells, which could not be properly annotated (they are labeled as NA).

Is min.conf = 0.2 already strict enough? Is min.conf=0.5 too restrictive? Is it an issue with my data?

PS: I have set filter.cells = FALSE because I had already filtered and subsetted my full assay to CD4 T cells. I don't know whether that matters here.

mass-a commented 5 months ago

Hello! If you are unsure about the right confidence threshold, I would recommend running the classifier with min.conf=0. That will allow all cells to be labeled, while returning their confidence score in the metadata column functional.cluster.conf. You can then analyze the distribution of these confidence scores, e.g. to see which kinds of cells receive low scores and whether low scores are associated with specific cell types. That may give you some hints on where to place the threshold on the confidence score. Does that make sense?
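
As a rough illustration of that kind of inspection, here is a minimal sketch in plain Python. It assumes the predicted labels and the functional.cluster.conf scores have been exported from the object metadata after a run with min.conf=0. The `cells` values, the label names, and both helper functions are made up for the example and are not part of ProjecTILs. The `>=` comparison is also an assumption about how the threshold is applied.

```python
from collections import defaultdict
from statistics import median

# Hypothetical (label, confidence) pairs standing in for the predicted label
# and the functional.cluster.conf column exported after running with min.conf = 0.
cells = [
    ("CD4_Naive", 0.92), ("CD4_Naive", 0.85), ("CD4_Naive", 0.64),
    ("Treg", 0.78), ("Treg", 0.55), ("Treg", 0.31),
    ("Tfh", 0.48), ("Tfh", 0.42), ("Tfh", 0.27), ("Tfh", 0.18),
]

def fraction_retained(cells, threshold):
    """Fraction of cells whose confidence is at or above the threshold."""
    kept = sum(1 for _, conf in cells if conf >= threshold)
    return kept / len(cells)

def median_conf_by_label(cells):
    """Median confidence per predicted label, to spot low-scoring cell types."""
    by_label = defaultdict(list)
    for label, conf in cells:
        by_label[label].append(conf)
    return {label: median(scores) for label, scores in by_label.items()}

# How many cells would survive each candidate threshold?
for t in (0.2, 0.3, 0.4, 0.5):
    print(f"min.conf={t}: {fraction_retained(cells, t):.0%} of cells kept")

# Which labels tend to receive low scores?
for label, med in median_conf_by_label(cells).items():
    print(f"{label}: median confidence {med:.3f}")
```

If the cells lost at a higher threshold come mostly from one or two labels, that points to specific states being hard to assign in the data rather than to the threshold being too strict overall.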