rnabioco / djvdj

An R package to analyze single-cell V(D)J data
https://rnabioco.github.io/djvdj

ROC analysis of CITE-seq / AVID-seq reagents #90

Open jayhesselberth opened 2 years ago

jayhesselberth commented 2 years ago

Thoughts on ROC analysis of protein-DNA tags as classifiers.

The question is how well a given reagent performs as a classifier relative to gene expression classifications (i.e., treating these as the "gold standard"). AUC values could indicate reagent quality and can be compared across reagents, batches, etc.

A function roc_analysis() would take as input an so (Seurat object) or sce (SingleCellExperiment) with:

  1. Cell type classifications based on gene expression (e.g. based on clustifyr)
  2. Raw or normalized counts of protein-DNA tags (CITE-seq antibodies, AVID-tags, antigen-DNA tags, etc)

For a comparison, assume two possible states (e.g., B vs. T cell, or B cell vs. all other cells). Then step through the range of recovered protein-DNA tag signal, treating each value as a threshold for calling a cell positive, and calculate:

  1. True positive rate (TP / (TP + FN)). TP = number of B cells scoring positive, FN = number of B cells scoring negative.
  2. False positive rate (FP / (FP + TN)). FP = number of T cells scoring positive, TN = number of T cells scoring negative.
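
As a rough sketch of the threshold sweep described above (base R only, not tied to any existing djvdj code): the helper name `roc_points()` and its arguments are hypothetical, the toy signal values are made up, and a cell is assumed to score positive when its tag signal is at or above the threshold.

```r
# Hypothetical helper: ROC points for a single tag.
# signal    - numeric vector of tag counts (raw or normalized), one per cell
# cell_type - gene-expression-based labels, treated as the gold standard
# positive  - label of the positive class (e.g. "B")
roc_points <- function(signal, cell_type, positive) {
  is_pos <- cell_type == positive

  # Use every observed signal value as a threshold, highest first.
  # Inf is prepended so the curve starts at (FPR = 0, TPR = 0).
  thresholds <- c(Inf, sort(unique(signal), decreasing = TRUE))

  res <- lapply(thresholds, function(th) {
    called <- signal >= th          # assumption: positive if signal >= threshold
    tp <- sum(called & is_pos)
    fn <- sum(!called & is_pos)
    fp <- sum(called & !is_pos)
    tn <- sum(!called & !is_pos)

    data.frame(
      threshold = th,
      tpr = tp / (tp + fn),
      fpr = fp / (fp + tn)
    )
  })

  do.call(rbind, res)
}

# Toy example (values invented for illustration)
signal    <- c(9, 8, 7, 3, 6, 2, 1, 0)
cell_type <- c("B", "B", "B", "B", "T", "T", "T", "T")

roc <- roc_points(signal, cell_type, positive = "B")
```

For a "B cell vs all other cells" comparison, the same helper would apply unchanged, since every label other than `positive` is treated as negative.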

plot_roc() would plot TPR vs FPR for each of the ranked detection values, and roc_auc() would provide the AUC value from the data.
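
One possible shape for roc_auc(), sketched under assumptions: the issue only names the function, so the signature here (vectors of FPR and TPR values) is hypothetical, and the trapezoidal rule is one common way to get the area, not necessarily the approach that would be adopted. The example points are made up.

```r
# Hypothetical signature: area under the ROC curve via the trapezoidal rule.
# fpr, tpr - numeric vectors of equal length, one entry per threshold
roc_auc <- function(fpr, tpr) {
  # Order points left to right along the FPR axis (ties broken by TPR)
  ord <- order(fpr, tpr)
  x <- fpr[ord]
  y <- tpr[ord]

  n <- length(x)

  # Sum of trapezoid areas between consecutive points
  sum(diff(x) * (y[-1] + y[-n]) / 2)
}

# Toy ROC points (invented for illustration), including (0, 0) and (1, 1)
fpr <- c(0, 0,    0,   0,    0.25, 0.25, 0.5, 0.75, 1)
tpr <- c(0, 0.25, 0.5, 0.75, 0.75, 1,    1,   1,    1)

auc <- roc_auc(fpr, tpr)
auc
# 0.9375
```

An AUC near 1 would suggest the tag separates the two classes well, while a value near 0.5 would suggest it performs no better than chance.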

cc @catherinenicholas

jayhesselberth commented 2 years ago

https://rviews.rstudio.com/2019/03/01/some-r-packages-for-roc-curves/

Maybe build on pROC or ROCR. Base R plots make me sad.

jayhesselberth commented 2 years ago

https://github.com/dariyasydykova/tidyroc/