djhn75 / RNAEditor

Why BWA and not spliced mapper? #5

Closed solyris closed 6 years ago

solyris commented 7 years ago

Hi, I am just wondering why BWA was picked as the aligner in this package rather than a splice-aware aligner like STAR or TopHat?

djhn75 commented 7 years ago

Hi, we are currently working on integrating STAR into the pipeline. In the current pipeline, however, it is OK to use a non-splice-aware mapper, because we perform a local realignment afterwards, which reduces splicing errors.

David

-- David John Institute of Cardiovascular Regeneration Theodor-Stern-Kai 7 D-60590 Frankfurt am Main +49 69 6301 87952 John@med.uni-frankfurt.de

solyris commented 7 years ago

Hi David,

Thanks for the prompt reply. I am having problems with RNAEditor related to the BWA step in the pipeline. I think the BWA mode used in your package is meant only for reads of 30-45 bp. My project is RNA-seq with 100 bp reads, and the mapping efficiency is abysmal: only 1% of reads map. I previously mapped the same data set with STAR and got over 90% mappable reads. I think the issue is caused by the BWA aln mode used in your pipeline; later versions of BWA support longer reads through the BWA mem mode instead. I hope you can consider updating the pipeline to support long reads as well as short ones.

I am also trying to work around this by changing the MapFastq.py file. May I know if that is the right direction to take? Any advice on other files that need updating would be highly appreciated as well.

djhn75 commented 7 years ago

Dear Solyris, yes, MapFastq.py is the correct file for changing the mapper. You have to adjust lines 85, 90, and 95 for paired-end and lines 100 and 106 for single-end. But please make sure to include the following read group: "@RG\tID:bwa\tSM:A\tPL:ILLUMINA\tPU:HiSEQ2000"

If you succeed, please let me know; I would then be happy to integrate it as a new branch.
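
For anyone attempting the change discussed above, here is a minimal sketch of how a `bwa mem` call carrying the requested read group could be assembled in Python. It is not the actual code from MapFastq.py: the function name `build_bwa_mem_command`, the default thread count, and the file names are hypothetical and serve only as illustration. The `-t` (threads) and `-R` (read group header line) options are standard `bwa mem` flags.

```python
import shlex

# Read group requested in the thread. The raw string keeps "\t" as a literal
# backslash followed by "t"; bwa converts these to tabs in the SAM header.
READ_GROUP = r"@RG\tID:bwa\tSM:A\tPL:ILLUMINA\tPU:HiSEQ2000"


def build_bwa_mem_command(ref_fasta, fastq1, fastq2=None, threads=4):
    """Return the argument list for a `bwa mem` run.

    Hypothetical helper, not part of RNAEditor. Pass fastq2 for paired-end
    data and leave it as None for single-end data.
    """
    cmd = ["bwa", "mem", "-t", str(threads), "-R", READ_GROUP, ref_fasta, fastq1]
    if fastq2 is not None:
        cmd.append(fastq2)
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_bwa_mem_command("genome.fa", "sample_1.fastq", "sample_2.fastq")
    print(" ".join(shlex.quote(part) for part in cmd))
```

Unlike `bwa aln`, which needs a follow-up `bwa samse` or `bwa sampe` step, `bwa mem` writes SAM directly to standard output, so the command list can be handed to `subprocess.run(cmd, stdout=out_file, check=True)` with an open output file.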