BRAKER is a pipeline for fully automated prediction of protein coding gene structures with GeneMark-ES/ET/EP/ETP and AUGUSTUS in novel eukaryotic genomes
BRAKER manual confusion about --UTR and --addUTR #370
I'm copying the relevant manual text below for ease of reference. Comments describing my confusion are added in square brackets.
//
--UTR=on
Generate UTR training examples for AUGUSTUS from RNA-Seq coverage information, train AUGUSTUS UTR parameters and predict genes with AUGUSTUS and UTRs, including coverage information for RNA-Seq as evidence. This flag only works if --softmasking is also enabled. This is an experimental feature!
[I understand this to mean that --UTR is for predicting UTRs as part of a standard BRAKER run]
[However, this next line seems to be about adding UTRs to an existing output -- a function I thought --addUTR was for. And indeed it shows the --addUTR option]
If you performed a BRAKER run without --UTR=on, you can add UTR parameter training and gene prediction with UTR parameters (and only RNA-Seq hints) with the following command:
[The command above is identical to the --addUTR command shown below]
Modify augustus.hints.gtf to point to the AUGUSTUS predictions with hints from previous BRAKER run; modify flaning_DNA value to the flanking region from the log file of your previous BRAKER run; modify some_new_working_directory to the location where BRAKER should store results of the additional BRAKER run; modify somespecies to the species name used in your previous BRAKER run.
['flaning_DNA' is a typo for --flanking_DNA but in any case I do not see the option included in either the command for --UTR above or --addUTR below]
--addUTR=on
Add UTRs from RNA-Seq coverage information to AUGUSTUS gene predictions using GUSHR. No training of UTR parameters and no gene prediction with UTR parameters is performed.
If you performed a BRAKER run without --addUTR=on, you can add UTRs results of a previous BRAKER run with the following command:
[this command is identical to the one given for --UTR above]
Modify augustus.hints.gtf to point to the AUGUSTUS predictions with hints from previous BRAKER run; modify some_new_workingdirectory to the location where BRAKER should store results of the additional BRAKER run; this run will not modify AUGUSTUS parameters. We recommend that you specify the original species of the original run with --species=somespecies. Otherwise, BRAKER will create an unneeded species parameters directory Sp*.
[is it crucial to use the original species of the original run, or does it not really matter, if space for the 'unneeded directory Sp_' is not an issue?]
//
[In sum I am confused as to when to use --UTR versus --addUTR, what is the proper command option set for each, which one requires --flanking_DNA, and when the original species is required]
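To make the question concrete, this is roughly how I would guess the two follow-up invocations differ. It is only a sketch of my current reading, not something the manual confirms: every path, the species name, and the flanking value of 1000 are placeholders, and putting --flanking_DNA on the --UTR run only is my assumption and exactly what I would like clarified. The script just prints the two command lines and never calls braker.pl.

```sh
#!/bin/sh
# Sketch only: prints two candidate command lines, does not run BRAKER.
# All values below are hypothetical placeholders.
GENOME=genome.fa
BAM=RNAseq.bam
PREDS=augustus.hints.gtf   # AUGUSTUS predictions with hints from the previous run
SPECIES=somespecies        # species name used in the previous run
FLANK=1000                 # placeholder; the real value would come from the old log
WD=some_new_working_directory

# Guess for a follow-up --UTR=on run: it trains UTR parameters, so I assume
# this is the one that needs --flanking_DNA. The quoted manual text says
# --UTR only works together with --softmasking.
utr_cmd() {
    printf '%s\n' "braker.pl --genome=$GENOME --bam=$BAM --softmasking --UTR=on --AUGUSTUS_hints_preds=$PREDS --flanking_DNA=$FLANK --species=$SPECIES --workingdir=$WD"
}

# Guess for a follow-up --addUTR=on run: no training is performed, so I
# assume --flanking_DNA is not needed here.
add_utr_cmd() {
    printf '%s\n' "braker.pl --genome=$GENOME --bam=$BAM --addUTR=on --AUGUSTUS_hints_preds=$PREDS --species=$SPECIES --workingdir=$WD"
}

utr_cmd
add_utr_cmd
```

If this guess is wrong in either direction, a corrected pair of commands in the manual would resolve most of my confusion.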