Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Can BRAKER3 delete the .sam file once the samtools sort step completes? #778

Open changchuanjun opened 3 months ago

changchuanjun commented 3 months ago

Hello, and thanks to your team for developing the BRAKER pipeline, which helps the community complete gene structure annotation. When I ran the BRAKER3(v) pipeline to annotate my genome, I found that it does not delete the .sam file once the samtools sort step, which converts the SAM file to a BAM file, has completed. The SAM file is very large and takes up too much disk space (dozens of GB).

KatharinaHoff commented 3 months ago

If you deal with huge amounts of RNA-Seq, you can align, convert, merge, and sort before providing the merged BAM file to BRAKER. That's how I usually do it.

Note that adding more and more RNA-Seq does not necessarily improve results. I sometimes alternatively skip technical and biological replicates...
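The align, convert, sort, and merge workflow described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not name an aligner, so HISAT2 is assumed here, and the function name `alignment_commands`, the sample names, and all file paths are hypothetical. The function only assembles shell command strings and does not run anything. Piping the aligner output directly into `samtools sort` means no intermediate .sam file is written to disk.

```python
import shlex


def alignment_commands(index, samples, merged_bam, threads=4):
    """Return shell commands that align each sample and merge the sorted BAMs.

    `samples` maps a sample name to a (reads_1, reads_2) pair of FASTQ paths.
    Nothing is executed; the commands are only assembled as strings.
    HISAT2 is an assumption, any aligner that writes SAM to stdout would do.
    """
    commands = []
    sorted_bams = []
    for name, (reads_1, reads_2) in samples.items():
        bam = f"{name}.sorted.bam"
        align = shlex.join(
            ["hisat2", "-p", str(threads), "-x", index, "-1", reads_1, "-2", reads_2]
        )
        # "-" tells samtools sort to read the alignment stream from stdin,
        # so the SAM output is never stored as a file.
        sort = shlex.join(["samtools", "sort", "-@", str(threads), "-o", bam, "-"])
        commands.append(f"{align} | {sort}")
        sorted_bams.append(bam)
    # Merge the per-sample BAM files into a single file, then index it.
    commands.append(
        shlex.join(["samtools", "merge", "-@", str(threads), merged_bam, *sorted_bams])
    )
    commands.append(shlex.join(["samtools", "index", merged_bam]))
    return commands


if __name__ == "__main__":
    # Hypothetical sample names and paths, for illustration only.
    example = {
        "leaf": ("leaf_1.fq.gz", "leaf_2.fq.gz"),
        "root": ("root_1.fq.gz", "root_2.fq.gz"),
    }
    for command in alignment_commands("genome_index", example, "merged.bam"):
        print(command)
```

The resulting merged BAM file can then be passed to BRAKER through its `--bam` option instead of letting the pipeline process the raw reads.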

changchuanjun commented 3 months ago

@KatharinaHoff, thank you for your response. When I aligned my RNA-Seq data to the soft-masked genome, I noticed only a slight difference compared to the hard-masked genome. Is that expected?

KatharinaHoff commented 3 months ago

Yes, the results will differ, and they may even differ drastically. Aligners usually ignore soft-masking, so a soft-masked genome is treated like an unmasked one.
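To make the distinction concrete, here is a minimal sketch of the two masking conventions, assuming the usual FASTA practice in which soft-masked repeats are written in lowercase and hard-masked repeats are replaced by `N`. The function names `hard_mask` and `unmask` are hypothetical and not part of BRAKER or any aligner.

```python
import re


def hard_mask(soft_masked_seq):
    """Replace every lowercase (soft-masked) base with 'N'."""
    return re.sub(r"[a-z]", "N", soft_masked_seq)


def unmask(soft_masked_seq):
    """Uppercase everything, which is effectively what a tool that
    ignores soft-masking sees."""
    return soft_masked_seq.upper()


if __name__ == "__main__":
    seq = "ACGTacgtacgtTTGA"  # toy sequence with a lowercase repeat region
    print(hard_mask(seq))
    print(unmask(seq))
```

In the hard-masked version the repeat sequence is gone, so reads originating from it have nothing to align to, whereas in the soft-masked version the bases are still present.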

changchuanjun commented 3 months ago

In my results, the BRAKER3 prediction (the number of genes) differs only slightly between the hard-masked genome and the soft-masked genome. Is that expected?

KatharinaHoff commented 3 months ago

It depends on the genome. The differences can be small or huge.