ohsu-comp-bio / tcrseq_normalization


count.spikes.R QC Check #20

Closed weshorton closed 8 years ago

weshorton commented 8 years ago

Summary

We're seeing a fair number of spikes in samples that should not have any spikes (H2O only samples as well as samples with no spikes added). Are these spikes being mis-identified by our spike-finding script count.spikes.R? Are they contamination?

Work Plan

  1. Obtain TCR seq fastq files that do not come from our lab and do not have spikes in them
  2. Run 9-bp and 25-bp spike finder on fastqs
  3. Determine extent of false spike identification
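
As a rough illustration of step 2, the sketch below counts reads that contain a spike search sequence at two lengths. It is not the logic of count.spikes.R. The barcode, the flanking bases, and the assumption that the 25-bp search is a 9-bp barcode plus 8 bp of flank on each side are all hypothetical, chosen only to show why a short search sequence can match reads that carry no real spike.

```python
from typing import Iterable, Iterator, Sequence, Tuple


def read_sequences(fastq_lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line (2nd line of every 4-line FASTQ record)."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip().upper()


def count_spiked_reads(fastq_lines: Iterable[str],
                       search_seqs: Sequence[str]) -> Tuple[int, int]:
    """Return (total_reads, spiked_reads).

    A read counts as spiked if it contains any of the search sequences
    as an exact substring.
    """
    total = spiked = 0
    for seq in read_sequences(fastq_lines):
        total += 1
        if any(s in seq for s in search_seqs):
            spiked += 1
    return total, spiked


# Hypothetical spike layout (assumption, not taken from the real spike set):
# a 9-bp barcode with 8 bp of flanking sequence on each side, 25 bp in total.
BARCODE_9BP = "TAAGTCGAC"
SPIKE_25BP = "ACGTACGT" + BARCODE_9BP + "TGCATGCA"

# Tiny in-memory FASTQ: one real spike, one chance barcode match, one clean read.
example_fastq = [
    "@read1", "GG" + SPIKE_25BP + "CC", "+", "I" * 29,
    "@read2", "GGGG" + BARCODE_9BP + "CCCC", "+", "I" * 17,
    "@read3", "ACACACACACACACACAC", "+", "I" * 18,
]

counts_9bp = count_spiked_reads(example_fastq, [BARCODE_9BP])
counts_25bp = count_spiked_reads(example_fastq, [SPIKE_25BP])
print("9-bp :", counts_9bp)
print("25-bp:", counts_25bp)
```

In this toy input the 9-bp search counts the chance match in read2 as spiked, while the 25-bp search counts only read1.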

Results

Used files from Kami in the Paul Spellman group (/home/exacloud/lustre1/CompBio/data/tcrseq/misc/Kami_test_files/)

9-bp

| Sample | Total Reads | Spiked Reads | Spiked Reads (%) |
|---|---|---|---|
| CO506alpha10.fastq | 1846127 | 1211 | 0.065596787 |
| CO506CO477beta10.fastq | 429811 | 448 | 0.10423186 |
| CO506CO509beta10.fastq | 489204 | 178 | 0.036385639 |
| CO506CO510beta10.fastq | 558680 | 367 | 0.065690556 |
| CO506CO511beta10.fastq | 331910 | 214 | 0.06447531 |
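
The last column of the 9-bp table is a percentage, i.e. 100 × spiked reads / total reads. A minimal sketch that recomputes it from the counts above (the function and variable names are illustrative only):

```python
def pct_spiked(spiked_reads: int, total_reads: int) -> float:
    """Percentage of reads flagged as spiked."""
    return 100.0 * spiked_reads / total_reads


# (sample, total reads, spiked reads) copied from the 9-bp results
rows = [
    ("CO506alpha10.fastq", 1846127, 1211),
    ("CO506CO477beta10.fastq", 429811, 448),
    ("CO506CO509beta10.fastq", 489204, 178),
    ("CO506CO510beta10.fastq", 558680, 367),
    ("CO506CO511beta10.fastq", 331910, 214),
]

computed = {sample: pct_spiked(spiked, total) for sample, total, spiked in rows}
for sample, pct in computed.items():
    print(f"{sample}\t{pct:.6f}")
```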

25-bp

| Sample | Total Reads | Spiked Reads | Spiked Reads (%) |
|---|---|---|---|
| CO506alpha10.fastq | 1846127 | 0 | 0 |
| CO506CO477beta10.fastq | 429811 | 0 | 0 |
| CO506CO509beta10.fastq | 489204 | 0 | 0 |
| CO506CO510beta10.fastq | 558680 | 0 | 0 |
| CO506CO511beta10.fastq | 331910 | 0 | 0 |

weshorton commented 8 years ago

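Doesn't seem like our script is adding spikes where there aren't any: the 25-bp search found no spiked reads in any of the five files, and the 9-bp search flagged only about 0.04% to 0.10% of reads. Still need to look into other sources of faulty spikes. See Issue #21.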