bcgsc / transabyss

de novo assembly of RNA-seq data using ABySS

How to use trans-abyss to deal with FR reads? #5

Closed zhaoletian closed 8 years ago

zhaoletian commented 8 years ago

Hello,

My RNA-seq data is strand-specific and in FR orientation, but the trans-abyss (or abyss) option "--SS" expects RF reads (that is, the /1 read is reverse and the /2 read is forward). Is there an existing method to handle this? I've searched forums such as Biostars but found nothing helpful.

At the moment, I only want to assemble the RNA-seq data de novo, so what I care about is whether using "--SS" with FR input data is appropriate for the assembly.

kmnip commented 8 years ago

Sorry, there is no existing method to handle it... You can work around this by swapping the /1 and /2 suffixes of the read names in the FASTQ files, so that your /1 reads become /2 reads and your /2 reads become /1 reads.
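
The suffix swap described above could be sketched roughly as follows. This is only an illustration, not a Trans-ABySS tool: the function names are hypothetical, and it assumes plain-text FASTQ with exactly four lines per record and read names that end in /1 or /2 (headers in other styles are left unchanged).

```python
import re

# Assumption: the mate is encoded as a trailing /1 or /2 on the read-name
# token, i.e. the part of the header before any whitespace.
_SUFFIX = re.compile(r"^(\S+)/([12])(\s.*)?$")


def swap_mate_suffix(header: str) -> str:
    """Return the header with /1 turned into /2 and vice versa."""
    m = _SUFFIX.match(header)
    if m is None:
        return header  # no /1 or /2 suffix found: leave the line untouched
    name, mate, rest = m.group(1), m.group(2), m.group(3) or ""
    return f"{name}/{'2' if mate == '1' else '1'}{rest}"


def swap_fastq_records(lines):
    """Yield FASTQ lines with the mate suffix swapped on each header line.

    Lines are classified by position (4 lines per record), so a quality
    string that happens to start with '@' is never mistaken for a header.
    """
    for i, line in enumerate(lines):
        text = line.rstrip("\n")
        is_header = i % 4 == 0
        # The '+' separator line may optionally repeat the read name.
        is_named_plus = i % 4 == 2 and len(text) > 1
        if is_header or is_named_plus:
            text = swap_mate_suffix(text)
        yield text + "\n"


def swap_fastq_file(in_path: str, out_path: str) -> None:
    """Write a copy of in_path to out_path with swapped mate suffixes."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(swap_fastq_records(src))
```

For example, `swap_fastq_file("reads_1.fq", "reads_1.swapped.fq")` (hypothetical file names) would be run once per input file; compressed input would need to be opened with `gzip.open` in text mode instead.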

zhaoletian commented 8 years ago

OK, thanks~

sjackman commented 8 years ago

@kmnip I thought typical strand-specific data was FR like paired-end data, not RF like mate-pair data. What's typical?

kmnip commented 8 years ago

@sjackman As far as I know, it is typically RF, like we illustrated on our wiki: https://github.com/bcgsc/transabyss/wiki#17-strand-specific-assembly

The strand-specific reads from the GSC and the ENCODE project are in this orientation.

sjackman commented 8 years ago

Ah, I see, we're just using different notations. Thanks for the clarification. In genomic reads, RF refers to outie-pointing reads <--- ---> and FR to innie-pointing reads ---> <---. Both of the read formats that you're describing are innie-pointing FR reads; it's just a question of whether it's F1R2 or F2R1.

kmnip commented 8 years ago

Yes, I like your notation better! ABySS (1.5.*) and Trans-ABySS only support F2R1 when strand-specific mode is turned on. The assembled contigs would need to be reverse-complemented for F1R2 reads.
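
If F1R2 reads are assembled as-is in strand-specific mode, the reverse-complement step mentioned above could look something like this minimal sketch. The function names are hypothetical, the input is assumed to be a plain FASTA file of contigs, and only A/C/G/T/N (upper or lower case) are complemented; any other character is passed through unchanged.

```python
# Translation table for the DNA alphabet; characters outside it are kept as-is.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_fasta(lines):
    """Yield (header, reverse-complemented sequence) for each FASTA record.

    Multi-line sequences are joined before being reverse-complemented.
    """
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, reverse_complement("".join(chunks))
            header, chunks = line, []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, reverse_complement("".join(chunks))
```

A typical use would be to open the contig FASTA, iterate over `reverse_complement_fasta(handle)`, and write each header and sequence back out to a new file.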