mkiyer / oncoseq

Automatically exported from code.google.com/p/oncoseq

files with variable length reads may not be handled correctly #7

GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
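Some of the TCGA files have quality-trimmed reads, and some of those reads are 
extremely short, like 0 or 1 bp. TopHat detects the read length using the first 
read in a FASTQ file. This is too fragile: if the first read happens to be a 
trimmed one, the detected read length is wrong for the rest of the file.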

Original issue reported on code.google.com by matthew....@gmail.com on 10 Feb 2013 at 7:25

GoogleCodeExporter commented 8 years ago
To fix this, we now parse the FastQC report and determine the most common read 
length in the FASTQ file. We use this as the read length when TopHat determines 
the insert size.
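
For anyone hitting the same problem, here is a minimal sketch of the idea, not the actual oncoseq code. It assumes the report is FastQC's plain-text `fastqc_data.txt` and reads the "Sequence Length Distribution" module from it. The function name `most_common_read_length` is made up for illustration, and taking the lower bound of a binned row such as `30-34` is an assumption, since FastQC may group lengths into ranges.

```python
from typing import Iterable


def most_common_read_length(fastqc_lines: Iterable[str]) -> int:
    """Return the modal read length from the lines of a fastqc_data.txt report.

    Hypothetical helper: scans the "Sequence Length Distribution" module
    and returns the length with the highest read count.
    """
    in_module = False
    best_length, best_count = None, -1.0
    for raw in fastqc_lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">>Sequence Length Distribution"):
            in_module = True
            continue
        if not in_module:
            continue
        if line.startswith(">>END_MODULE"):
            break
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "#Length<TAB>Count" header
        length_field, count_field = line.split("\t")[:2]
        # Assumption: for a binned row like "30-34" use the lower bound.
        length = int(length_field.split("-")[0])
        count = float(count_field)
        if count > best_count:
            best_length, best_count = length, count
    if best_length is None:
        raise ValueError("no Sequence Length Distribution data found")
    return best_length


# Made-up sample data: a few trimmed 1 bp reads among mostly 76 bp reads.
example_report = [
    ">>Sequence Length Distribution\twarn",
    "#Length\tCount",
    "1\t12.0",
    "50\t3.0",
    "76\t985.0",
    ">>END_MODULE",
]
print(most_common_read_length(example_report))
```

With a real report you would pass an open file handle for `fastqc_data.txt` instead of the list. The point is that a handful of 0 or 1 bp reads at the top of the file no longer affects the read length that is used.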

Original comment by matthew....@gmail.com on 10 Feb 2013 at 7:25

GoogleCodeExporter commented 8 years ago

Original comment by matthew....@gmail.com on 10 Feb 2013 at 7:28