JaneliaSciComp / msg

Multiplexed Shotgun Genotyping
http://genomics.princeton.edu/AndolfattoLab/MSG.html

Remove poor quality reads from fastq file right away using TQSfastq.py #26

Closed by gregpinero 12 years ago

gregpinero commented 12 years ago

Run the reads file through TQSfastq.py. Make this optional and configurable.

Peter or David:

TQSfastq.py has these options. What do you think the default values should be?

Phred quality threshold: Base intensity threshold value (Phred quality scores 0 to 40, default=10)

consec: Minimum number of consecutive bases passing threshold values (default=20)

ASCII encoding type: Type of ASCII encoding: 33 (standard) or 64 (illumina) (default=64)

gregpinero commented 12 years ago

Peter recommends these defaults:

I recommend that we go with QV>=20 and demand 30 consecutive bases. I have seen systematic problems using the 20 consecutive bp default.
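To make the two recommended settings concrete, here is a rough Python sketch of the idea: decode the quality string, find the longest run of consecutive bases with QV >= threshold, and keep the read only if that run is at least `consec` bases long. This is not the actual TQSfastq.py code; the function names are made up for illustration, and trimming the read down to the passing run is an assumption based on the `.trim.fastq` output name.

```python
def decode_quals(qual_string, offset=64):
    """Convert an ASCII quality string to Phred scores (offset 33 or 64)."""
    return [ord(c) - offset for c in qual_string]


def longest_passing_run(quals, threshold):
    """Return (start, length) of the longest run of scores >= threshold."""
    best_start, best_len = 0, 0
    run_start, run_len = 0, 0
    for i, q in enumerate(quals):
        if q >= threshold:
            if run_len == 0:
                run_start = i  # a new passing run begins here
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0  # run broken by a low-quality base
    return best_start, best_len


def trim_read(seq, qual_string, threshold=20, consec=30, offset=64):
    """Hypothetical filter: return the trimmed (seq, qual) pair, or None
    if no run of `consec` consecutive bases reaches `threshold`.

    Assumption: the read is cut down to its longest passing run.
    """
    quals = decode_quals(qual_string, offset)
    start, length = longest_passing_run(quals, threshold)
    if length < consec:
        return None
    end = start + length
    return seq[start:end], qual_string[start:end]
```

With these defaults, a read whose best stretch is 29 good bases is dropped, while one with 32 good bases in a row is kept and trimmed to that stretch.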

gregpinero commented 12 years ago

Question: do we want to filter only the reads.fq file, or the parent reads as well? I'm assuming all of them.

gregpinero commented 12 years ago

Note to self:

Running

msg/TQSfastq.py -f reads.fq -t 20 -c 30 -q -z -o reads.fq

creates reads.fq.trim.fastq.
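One possible way to make the step optional and configurable is to build that command line from a small settings dictionary. The sketch below is only an assumption about how this could be wired up: `build_tqs_command` and the setting names (`enabled`, `threshold`, `consec`) are hypothetical, and the `-q` and `-z` flags are copied verbatim from the command in the note above without being interpreted.

```python
def build_tqs_command(reads_path, settings):
    """Return the TQSfastq.py argument list, or None if filtering is off.

    `settings` is a hypothetical dict; missing keys fall back to the
    defaults recommended in this thread (QV >= 20, 30 consecutive bases).
    """
    if not settings.get("enabled", True):
        return None  # quality filtering disabled by configuration
    return [
        "msg/TQSfastq.py",
        "-f", reads_path,
        "-t", str(settings.get("threshold", 20)),
        "-c", str(settings.get("consec", 30)),
        "-q", "-z",        # flags taken as-is from the note above
        "-o", reads_path,  # per the note, output is <reads_path>.trim.fastq
    ]
```

The returned list could be handed to `subprocess.check_call`, and the same function could be called once per file if the parent reads are filtered too.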