Braker with RNA-seq: soft- vs hardmasking

Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Dear authors,

I am trying to use Braker on RNA-seq data, and I have a question about your recommendation to use a softmasked genome.

In your tutorial for Augustus (https://github.com/Gaius-Augustus/Augustus/blob/master/docs/tutorial2018/index.html) you mapped RNA-seq reads with STAR against the hardmasked genome, but then used the softmasked reference version for Augustus and Braker.

Should the same be done when using Braker2? If so, could you please explain the reason for doing it this way, and its advantage over using the softmasked reference throughout (i.e. for both the STAR mapping and the BRAKER run)?

I am using RepeatModeler2 to identify repeats, RepeatMasker for masking, and STAR to align paired-end RNA-seq reads.
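
To make the distinction concrete, here is a minimal Python sketch of how a hardmasked sequence could be derived from a softmasked one. It assumes the usual convention that softmasking marks repeats in lowercase and hardmasking replaces them with N. The function names and the example record are hypothetical and not part of RepeatMasker, STAR, or BRAKER.

```python
def hardmask(softmasked_seq: str) -> str:
    """Replace every lowercase (softmasked) base with N; uppercase bases are kept."""
    return "".join("N" if base.islower() else base for base in softmasked_seq)


def hardmask_fasta(lines):
    """Yield FASTA lines with sequence lines hardmasked; header lines pass through unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        yield line if line.startswith(">") else hardmask(line)


# Hypothetical record: the lowercase stretch stands in for a masked repeat.
example = [">chr1 hypothetical example record", "ACGTacgtacgtACGT"]
print("\n".join(hardmask_fasta(example)))
```

The softmasked version keeps the repeat sequence available to any tool that reads it, while the hardmasked version discards it, which is why the choice of reference can matter at each step.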

Many thanks, Daniel

Braker with RNA-seq: soft- vs hardmasking #188