Hoohm / CITE-seq-Count

A tool for getting UMI counts from a single-cell protein assay
https://hoohm.github.io/CITE-seq-Count/
MIT License

Proper Saturation Rate Calculation #160

Open brindavjk opened 3 years ago

brindavjk commented 3 years ago

Hi, I wanted to confirm that we are calculating the saturation rate correctly for our CITE-seq data. We have a run_report.yaml output as follows:

    Date: 2020-10-29
    Running time: 4.0 hours, 43.0 minutes, 25.25 seconds
    CITE-seq-Count Version: 1.4.3
    Reads processed: 25782377
    Percentage mapped: 30
    Percentage unmapped: 70
    Uncorrected cells: 2
    Correction:
        Cell barcodes collapsing threshold: 1
        Cell barcodes corrected: 401137
        UMI collapsing threshold: 2
        UMIs corrected: 2118655
    Run parameters:
        Read1_paths: S1_FBC_S2_L001_R1_001.fastq.gz
        Read2_paths: S1_FBC_S2_L001_R2_001.fastq.gz
        Cell barcode:
            First position: 1
            Last position: 16
        UMI barcode:
            First position: 17
            Last position: 28
        Expected cells: 50000
        Tags max errors: 2
        Start trim: 0

So would our saturation rate be 1 - (UMIs corrected / reads processed) = 1 - 2118655/25782377 ≈ 91.8%? I just want to make sure I did that properly.

Thank you!

Hoohm commented 3 years ago

Hey @brindavjk, that isn't what you're looking for. The saturation rate answers the question: how many more molecules (UMIs) would I get if I sequenced deeper? You should calculate total_umis/total_reads*100.

The "UMIs corrected" value in the report is only the number of UMIs that were corrected in your data, not the total number of UMIs.
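
As a rough illustration of the total_umis/total_reads*100 calculation, here is a minimal Python sketch. The thread does not say where total_umis and total_reads come from. Summing a UMI count matrix and a read count matrix (tags by cells) is an assumption made here for illustration, and the function names and toy numbers are hypothetical, not real data.

```python
def sum_counts(matrix):
    """Sum every entry of a tags-by-cells count matrix given as nested lists."""
    return sum(sum(row) for row in matrix)


def umi_read_percentage(total_umis, total_reads):
    """Return total_umis / total_reads * 100, the ratio suggested in the thread."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return total_umis / total_reads * 100


# Hypothetical toy matrices (rows = tags, columns = cells); not real data.
# Assumption: UMI counts and read counts are available per tag and per cell.
toy_umi_counts = [[3, 0, 5], [1, 2, 0]]
toy_read_counts = [[9, 0, 20], [2, 4, 0]]

total_umis = sum_counts(toy_umi_counts)    # 3 + 0 + 5 + 1 + 2 + 0 = 11
total_reads = sum_counts(toy_read_counts)  # 9 + 0 + 20 + 2 + 4 + 0 = 35

print(f"{umi_read_percentage(total_umis, total_reads):.1f}%")
```

With real data, the two totals would have to be computed from the same set of reads (for example, only the mapped reads), otherwise the ratio is not comparable between runs.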