chapmanb / bcbb

Incubator for useful bioinformatics code, primarily in Python and R
http://bcbio.wordpress.com

Fixes to Tophat/Bowtie2 #67

Closed roryk closed 11 years ago

roryk commented 11 years ago

1) Look for the bowtie path as "bowtie", not "bowtie2", in program: when using bowtie2.
2) Fixed a problem where a dictionary was getting mutated when adding options.
3) Bowtie2 does not seem to respect the -X parameter:

bowtie2  -X 2000 -x test/data/bowtie2/e_coli -1 test/data/s_1_1_10k.fq -2 test/data/s_1_2_10k.fq -S wtf.sam
SLXA-EAS1_89:1:1:672:654        99      Chromosome      3919889 42      35M     =       3920066 212     GCTACGGAATAAAACCAGGAACAACAGACCCAGCA     cccccccccccccccccccc]c``cVcZccbSYbY     AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:35 YS:i:0  YT:Z:CP
SLXA-EAS1_89:1:1:672:654        147     Chromosome      3920066 42      35M     =       3919889 -212    TGAAGCCATGATGCCTTTTACCCTTTGTTGTTAAT     Z````[[`[b`bb^bb[^[`bbb`Ubbb`bbbbbb     AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:35 YS:i:0  YT:Z:CP
SLXA-EAS1_89:1:1:968:480    65  Chromosome  6795    42  35M =   2854300 2847540 AACACCAGATCGCTTTAGGGTTGTTCAGGCGTAAA cccccccccccccccccccccc`cccc``c```^` AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:35 YS:i:0  YT:Z:DP
SLXA-EAS1_89:1:1:968:480    129 Chromosome  2854300 42  35M =   6795    -2847540    GTTGATTGTGCGCGTGCTGAAAGAAACCAACGGCG cccccccccccccccccc]cccccccccccccccc AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:35 YS:i:0  YT:Z:DP

For the time being, I changed the inner distance calculation to use the median and to remove any fragments > 3x the standard deviation. Now that I write this, though, I realize I should just apply the -X cutoff manually when making the calculation.
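A minimal sketch of applying the -X cutoff manually, not the code from this pull request. The function names, the read_length parameter, and the definition of inner distance as fragment length minus two read lengths are assumptions made for illustration. It reads the TLEN column (field 9) of tab-separated SAM records and counts each pair once through the mate with a positive TLEN.

```python
import statistics


def insert_sizes(sam_lines, max_fragment=2000):
    """Collect fragment lengths (SAM TLEN) that pass a manual -X style cutoff.

    Hypothetical helper: max_fragment plays the role of bowtie2's -X value.
    """
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        tlen = int(fields[8])  # TLEN is the 9th mandatory SAM field
        # Only the mate with a positive TLEN is counted, so each pair
        # contributes a single value; unpaired records have TLEN 0.
        if 0 < tlen <= max_fragment:
            sizes.append(tlen)
    return sizes


def median_inner_distance(sam_lines, read_length, max_fragment=2000):
    """Median fragment length minus both read lengths (assumed definition).

    Raises statistics.StatisticsError if no pair passes the cutoff.
    """
    sizes = insert_sizes(sam_lines, max_fragment)
    return statistics.median(sizes) - 2 * read_length
```

With the four records shown above and a cutoff of 2000, only the first pair would contribute, since the second pair reports a TLEN of 2847540.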

chapmanb commented 11 years ago

Rory; Thanks for these, great stuff. A couple of quick thoughts:

chapmanb commented 11 years ago

Rory; Sorry, missed the bowtie2 updates here. For the -X issue, it looks like bowtie2 also has this option: are you finding it doesn't work correctly? If so, I can merge away with your fixes.

roryk commented 11 years ago

Hi Brad,

Don't merge yet. Bowtie2 is working correctly. The behavior changed in Bowtie2: reads failing -X do not get filtered out. Instead, they get an optional flag set marking them as discordant (YT:Z:DP) or unpaired (YT:Z:UP, if they are on different chromosomes). So filtering the list on that flag before calculating the insert sizes should fix the problem. I haven't gotten around to implementing that yet.
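The filtering described here could be sketched roughly as follows. This is an illustration rather than the eventual implementation: the function names are hypothetical, and the sketch assumes tab-separated SAM records in which bowtie2 has written the optional YT:Z tag. It keeps only pairs tagged as concordant (YT:Z:CP) and counts each pair once through the mate with a positive TLEN.

```python
import statistics


def pair_type(fields):
    """Return the value of the YT:Z optional tag, or None if it is absent."""
    for tag in fields[11:]:  # optional tags follow the 11 mandatory fields
        if tag.startswith("YT:Z:"):
            return tag[len("YT:Z:"):]
    return None


def concordant_insert_sizes(sam_lines):
    """Collect TLEN values from pairs tagged as concordant (YT:Z:CP)."""
    sizes = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        tlen = int(fields[8])  # TLEN is the 9th mandatory SAM field
        # Discordant (DP) and other non-concordant records are dropped here.
        if pair_type(fields) == "CP" and tlen > 0:
            sizes.append(tlen)
    return sizes


def median_insert_size(sam_lines):
    """Median of the concordant insert sizes; errors if none are found."""
    return statistics.median(concordant_insert_sizes(sam_lines))
```

Filtering on the tag rather than on a numeric cutoff means the insert size estimate follows whatever -X value was passed to bowtie2, without repeating that value in the calculation.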

I'll close this pull request and open a clean one since most of the changes to it have been reverted.

Rory

On Jan 30, 2013, at 5:30 AM, Brad Chapman wrote:

Rory; Sorry, missed the bowtie2 updates here. For the -X issue, it looks like bowtie2 also has this option: are you finding it doesn't work correctly? If so, I can merge away with your fixes.
