Gaius-Augustus / BRAKER

BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes

Braker 1 and Braker 2 #858

Closed RacheliHadjez closed 1 month ago

RacheliHadjez commented 2 months ago

Hello, I am currently working on the genome annotation of a myxozoan species and would like to use BRAKER for this purpose. Based on a BRAKER tutorial, I understand that the optimal approach for my species is to run BRAKER1 with RNA-seq data and BRAKER2 with protein data, and then use TSEBRA to combine the results from both runs.

However, I noticed that on the Galaxy server, only BRAKER2 and BRAKER3 are available, and BRAKER2 has an option to input both RNA-seq and protein data. My question is: can I use BRAKER2 with both RNA-seq and protein data instead of running BRAKER1 separately, and then use TSEBRA on the results? Or does it specifically need to be BRAKER1, requiring me to run it on a Linux server?

Thank you in advance for your help! Rachel

KatharinaHoff commented 2 months ago

This is very confusing. BRAKER3 works with both RNA-Seq data AND a protein database. BRAKER2 works with proteins only. However, I have seen the same confusion elsewhere, earlier this week. Is it possibly labelled incorrectly in Galaxy? If that is the case, please let me know (we do not maintain this ourselves, but we can contact the authors of the Galaxy workflow).

Use BRAKER3 if you have both RNA-Seq and protein data. (If it is labelled incorrectly in Galaxy, use whichever tool allows you to upload both data types, for now.) Neither BRAKER1 alone nor the BRAKER1/BRAKER2/TSEBRA approach is as accurate as BRAKER3. See https://genome.cshlp.org/content/early/2024/05/28/gr.278090.123.abstract
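For readers skimming the thread, the choice described above can be sketched as a small Python helper. This is only an illustration of the rule of thumb from this comment, not part of BRAKER itself: the function names and the file names `genome.fa`, `rnaseq.bam`, and `proteins.fa` are hypothetical, and the `braker.pl` options shown (`--genome`, `--bam`, `--prot_seq`) should be checked against the README of the version you have installed.

```python
import shlex
from typing import Optional, Tuple


def choose_braker_mode(has_rnaseq: bool, has_proteins: bool) -> str:
    """Map the available evidence to the BRAKER mode discussed in this thread."""
    if has_rnaseq and has_proteins:
        return "BRAKER3"  # RNA-Seq plus protein database
    if has_proteins:
        return "BRAKER2"  # proteins only
    if has_rnaseq:
        return "BRAKER1"  # RNA-Seq only
    raise ValueError("this sketch expects RNA-Seq and/or protein evidence")


def build_command(genome: str,
                  bam: Optional[str] = None,
                  proteins: Optional[str] = None) -> Tuple[str, str]:
    """Return (mode, command line); all paths are hypothetical placeholders."""
    mode = choose_braker_mode(bam is not None, proteins is not None)
    args = ["braker.pl", f"--genome={genome}"]
    if bam is not None:
        args.append(f"--bam={bam}")  # assumed: aligned RNA-Seq reads
    if proteins is not None:
        args.append(f"--prot_seq={proteins}")  # assumed: protein FASTA
    return mode, shlex.join(args)


if __name__ == "__main__":
    mode, cmd = build_command("genome.fa", bam="rnaseq.bam", proteins="proteins.fa")
    print(mode)
    print(cmd)
```

The helper only prints the command rather than running it, so it can be adapted before anything is launched on a server.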

RacheliHadjez commented 2 months ago

Thank you for your response! I'm new to gene annotation, so I may have been a bit confused. I saw in the BRAKER tutorial that BRAKER3 is recommended for large genome sizes and the genome I'm working on is ~60Mb. I will try BRAKER3 then. Thank you once again! Rachel

KatharinaHoff commented 2 months ago

BRAKER1 and BRAKER2 do not work well on large genomes. BRAKER3 does not have this issue; it should do OK on genomes of all sizes.

RacheliHadjez commented 2 months ago

Okay thank you! I also wanted to ask: is it an issue to use BRAKER3 if my data consists of PacBio long reads? Both my DNA and RNA data are from PacBio.

KatharinaHoff commented 2 months ago

There is a poster in the docs folder that contains instructions for PacBio (only) transcriptome data.

RacheliHadjez commented 2 months ago

Thank you very much!!!