Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Meaning of --stranded #101

Closed: jasonsydes closed this issue 4 years ago

jasonsydes commented 4 years ago

Hi there!

We're interested in using BRAKER to annotate UTRs. I'm trying to understand how the --stranded option works. I've seen issues #7 and #60, as well as your recommendation of https://www.biostars.org/p/92935/ on how to separate stranded RNA-seq into stranded BAMs. I've also looked through the code a bit.

(UPDATE: I should note that everything I say below is with respect to the strand on the genome; I am not referring to the mRNA strand below at all.)

It seems, after looking at your code, that it does not matter if read pairs are kept together or are separated. Is that true? (In particular, I was looking at function bam2stranded_wig() in braker.pl, where "plus", "minus" and "unstranded" are sorted and stored separately.) For example, you could have a different number of "plus" reads than "minus" reads. Is that true?

Moving on, say you had separated out the reads per https://www.biostars.org/p/92935/. If one were to specify the following:

--stranded=+,- --bam=readsA.bam,readsB.bam

then one should interpret the above to mean "BRAKER treats everything in readsA.bam as originating from the '+' strand (regardless of the strand specified in FLAG for each read), and similarly treats everything in readsB.bam as originating from the '-' strand". Is that correct?
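To make the interpretation above concrete, here is a minimal sketch of pairing the two comma-separated lists by position. It is not BRAKER's code: the function name `pair_bams_with_strands` is hypothetical, and accepting `.` for an unstranded library is an assumption based on the "unstranded" category mentioned earlier.

```python
def pair_bams_with_strands(bam_arg, stranded_arg):
    """Pair each BAM file with the strand label at the same position.

    Hypothetical helper, not part of BRAKER. It only illustrates the
    reading that the i-th entry of --stranded applies to every read in
    the i-th entry of --bam, regardless of per-read FLAG values.
    """
    bams = bam_arg.split(",")
    strands = stranded_arg.split(",")
    if len(bams) != len(strands):
        raise ValueError("need exactly one strand label per BAM file")
    # "." for unstranded libraries is an assumption of this sketch.
    allowed = {"+", "-", "."}
    for strand in strands:
        if strand not in allowed:
            raise ValueError(f"unexpected strand label: {strand!r}")
    return list(zip(bams, strands))


pairs = pair_bams_with_strands("readsA.bam,readsB.bam", "+,-")
for bam, strand in pairs:
    print(f"{bam}: all reads treated as '{strand}' strand")
```

Under this reading, swapping the order of either list swaps the strand assigned to each file, so the two lists have to be kept in sync.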

Thanks! Jason

KatharinaHoff commented 4 years ago

The stranded feature hasn't really been tested much... and it's very messy.

But you understand the intention correctly.

What we need is a compilation of all reads that produce coverage for mRNAs on one of the strands (well, and another file with those for the other strand). I do not check whether there are more reads on one strand than on the other.
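As a rough illustration of that per-strand compilation, the sketch below assigns an alignment to a strand from its SAM FLAG, following the rule in the Biostars post linked above as I understand it: second-in-pair reads mapped forward and first-in-pair reads mapped reverse go to one file, and the opposite combinations go to the other. It assumes a paired-end stranded library in which read 2 has the orientation of the transcript; for the opposite protocol the labels would be swapped. The function name `strand_from_flag` is hypothetical and is not part of BRAKER.

```python
# Bit values defined by the SAM specification.
FLAG_UNMAPPED = 0x4   # read is unmapped
FLAG_REVERSE = 0x10   # read is aligned to the reverse strand
FLAG_READ1 = 0x40     # first read in the pair
FLAG_READ2 = 0x80     # second read in the pair


def strand_from_flag(flag):
    """Return '+', '-', or None for a single alignment's SAM FLAG.

    Assumption of this sketch: read 2 is in the same orientation as the
    transcript. None is returned for unmapped reads and for reads that
    are marked as neither read 1 nor read 2 (e.g. single-end data).
    """
    if flag & FLAG_UNMAPPED:
        return None
    is_reverse = bool(flag & FLAG_REVERSE)
    if flag & FLAG_READ2:
        return "-" if is_reverse else "+"
    if flag & FLAG_READ1:
        return "+" if is_reverse else "-"
    return None


# Each alignment is classified on its own, so the two groups
# are not required to end up the same size.
flags = [163, 83, 99, 147, 4]
groups = {"+": [], "-": []}
for f in flags:
    strand = strand_from_flag(f)
    if strand is not None:
        groups[strand].append(f)
print(groups)
```

Because the decision is made per alignment, nothing here requires mates to be kept together or the plus and minus sets to have equal counts, which matches the answer above.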

jasonsydes commented 4 years ago

Thank you very much for your feedback @KatharinaHoff, this is very helpful!! Thank you again for BRAKER!!