nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

funannotate-predict.py: error: unrecognized arguments: --stopCodonExcludedFromCDS=False #1064

Open wangpeng-design opened 2 weeks ago

wangpeng-design commented 2 weeks ago

Are you using the latest release?
If you are not using the latest release of funannotate, please upgrade; if the bug persists, then report it here.
v1.8.13

Describe the bug
A clear and concise description of what the bug is.
funannotate-predict.py: error: unrecognized arguments: --stopCodonExcludedFromCDS=False

What command did you issue?
Copy/paste the command used.
funannotate predict -i PGChrGenome.softmask.fasta --species "Pleurotus ostreatus" --transcript_alignments transcript_alignments.gff3:8 --protein_alignments protein_alignments.gff:4 --augustus_gff gene_predictions.gff:1 --trnascan tRNA.out -o output_folder --stopCodonExcludedFromCDS=False

Logfiles
Please provide relevant log files of the error.

OS/Install Information

You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
threads: 2.15
threads::shared: 1.56
ERROR: local::lib not installed, install with cpanm local::lib

Checking Environmental Variables...
$FUNANNOTATE_DB=/public/home/bs20233171040/Genomic_data/funannotate_db
$PASAHOME=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/opt/pasa-2.4.1
$TRINITY_HOME=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/config/
$GENEMARK_PATH=/public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/gmes_petap.pl
All 6 environmental variables are set

Checking external dependencies...
samtools: /public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/../lib/libtinfow.so.6: no version information available (required by samtools)
samtools: /public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
samtools: /public/home/bs20233171040/$/public/home/bs20233171040/software/anaconda3/envs/funannotate/bin/../lib/libncursesw.so.6: no version information available (required by samtools)
PASA: 2.4.1
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.4.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.1.8
emapper.py: 2.1.12
ete3: 3.1.3
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2017-11-15
gmes_petap.pl: 4.33
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.9.1-internal
kallisto: 0.46.1
mafft: v7.525 (2024/Mar/13)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.28-r1209
pigz: pigz 2.8
proteinortho: 6.0.34
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.15.1
signalp: environment.
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 31
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
All 37 external dependencies are installed

hyphaltip commented 2 weeks ago

That's an Augustus parameter, not a funannotate parameter, so you should not pass it to funannotate predict.

The directions that mention that command-line parameter are telling you how to run Augustus OUTSIDE of funannotate and then provide the resulting GFF file to funannotate, if you want to do it your own way. However, if you are running funannotate normally, where it trains and runs Augustus for you automatically, then this parameter is already sent to Augustus.
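
To make the split concrete, here is a rough dry-run sketch of the two steps: the flag goes on the standalone Augustus call, and funannotate predict only receives the resulting GFF. It only prints the commands and runs nothing. The species identifier `my_trained_species` is a hypothetical placeholder for whatever Augustus species config you actually have, and adding `--gff3=on` is an assumption based on the help text asking for GFF3 input; the file names are taken from the command in the original report.

```shell
#!/bin/sh
# Dry run: builds and prints the two commands; nothing is executed.
set -eu

GENOME="PGChrGenome.softmask.fasta"
AUGUSTUS_SPECIES="my_trained_species"   # hypothetical placeholder, use your own Augustus species
AUGUSTUS_GFF="gene_predictions.gff"

# Step 1: run Augustus yourself; the flag belongs on this command.
# --gff3=on is an assumption (funannotate's help asks for GFF3 results).
AUGUSTUS_CMD="augustus --species=${AUGUSTUS_SPECIES} --gff3=on"
AUGUSTUS_CMD="${AUGUSTUS_CMD} --stopCodonExcludedFromCDS=False ${GENOME} > ${AUGUSTUS_GFF}"

# Step 2: hand the pre-computed GFF to funannotate predict.
# Same command as in the report, minus the Augustus-only flag.
PREDICT_CMD="funannotate predict -i ${GENOME} --species \"Pleurotus ostreatus\""
PREDICT_CMD="${PREDICT_CMD} --transcript_alignments transcript_alignments.gff3:8"
PREDICT_CMD="${PREDICT_CMD} --protein_alignments protein_alignments.gff:4"
PREDICT_CMD="${PREDICT_CMD} --augustus_gff ${AUGUSTUS_GFF}:1"
PREDICT_CMD="${PREDICT_CMD} --trnascan tRNA.out -o output_folder"

printf '%s\n' "$AUGUSTUS_CMD" "$PREDICT_CMD"
```

If gene_predictions.gff was already produced by an Augustus run that used that flag, only the second command is needed.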

hyphaltip commented 2 weeks ago

Here are all the command-line options for predict:

funannotate predict

Usage:       funannotate predict <arguments>
version:     1.8.17

Description: Script takes genome multi-fasta file and a variety of inputs to do a comprehensive whole
             genome gene prediction.  Uses AUGUSTUS, GeneMark, Snap, GlimmerHMM, BUSCO, EVidence Modeler,
             tbl2asn, tRNAScan-SE, Exonerate, minimap2.
Required:
  -i, --input              Genome multi-FASTA file (softmasked repeats)
  -o, --out                Output folder name
  -s, --species            Species name, use quotes for binomial, e.g. "Aspergillus fumigatus"

Optional:
  -p, --parameters         Ab intio parameters JSON file to use for gene predictors
  --isolate                Isolate name, e.g. Af293
  --strain                 Strain name, e.g. FGSCA4
  --name                   Locus tag name (assigned by NCBI?). Default: FUN_
  --numbering              Specify where gene numbering starts. Default: 1
  --maker_gff              MAKER2 GFF file. Parse results directly to EVM.
  --pasa_gff               PASA generated gene models. filename:weight
  --other_gff              Annotation pass-through to EVM. filename:weight
  --rna_bam                RNA-seq mapped to genome to train Augustus/GeneMark-ET
  --stringtie              StringTie GTF result
  -w, --weights            Ab-initio predictor and EVM weight. Example: augustus:2 or pasa:10
  --augustus_species       Augustus species config. Default: uses species name
  --min_training_models    Minimum number of models to train Augustus. Default: 200
  --genemark_mode          GeneMark mode. Default: ES [ES,ET]
  --genemark_mod           GeneMark ini mod file
  --busco_seed_species     Augustus pre-trained species to start BUSCO. Default: anidulans
  --optimize_augustus      Run 'optimze_augustus.pl' to refine training (long runtime)
  --busco_db               BUSCO models. Default: dikarya. `funannotate outgroups --show_buscos`
  --organism               Fungal-specific options. Default: fungus. [fungus,other]
  --ploidy                 Ploidy of assembly. Default: 1
  -t, --tbl2asn            Assembly parameters for tbl2asn. Default: "-l paired-ends"
  -d, --database           Path to funannotate database. Default: $FUNANNOTATE_DB

  --protein_evidence       Proteins to map to genome (prot1.fa prot2.fa uniprot.fa). Default: uniprot.fa
  --protein_alignments     Pre-computed protein alignments in GFF3 format
  --p2g_pident             Exonerate percent identity. Default: 80
  --p2g_diamond_db         Premade diamond genome database for protein2genome mapping
  --p2g_prefilter          Pre-filter hits software selection. Default: diamond [tblastn]
  --transcript_evidence    mRNA/ESTs to align to genome (trans1.fa ests.fa trinity.fa). Default: none
  --transcript_alignments  Pre-computed transcript alignments in GFF3 format
  --augustus_gff           Pre-computed AUGUSTUS GFF3 results (must use --stopCodonExcludedFromCDS=False)
  --genemark_gtf           Pre-computed GeneMark GTF results
  --trnascan               Pre-computed tRNAscanSE results

  --min_intronlen          Minimum intron length. Default: 10
  --max_intronlen          Maximum intron length. Default: 3000
  --soft_mask              Softmasked length threshold for GeneMark. Default: 2000
  --min_protlen            Minimum protein length. Default: 50
  --repeats2evm            Use repeats in EVM consensus model building
  --keep_evm               Keep existing EVM results (for rerunning pipeline)
  --evm-partition-interval Min length between genes to make a partition: Default: 1500
  --no-evm-partitions      Do not split contigs into partitions
  --repeat_filter          Repetitive gene model filtering. Default: overlap blast [overlap,blast,none]
  --keep_no_stops          Keep gene models without valid stops
  --SeqCenter              Sequencing facilty for NCBI tbl file. Default: CFMR
  --SeqAccession           Sequence accession number for NCBI tbl file. Default: 12345
  --force                  Annotated unmasked genome
  --cpus                   Number of CPUs to use. Default: 2
  --no-progress            Do not print progress to stdout for long sub jobs
  --tmpdir                 Volume/location to write temporary files. Default: /tmp
  --header_length          Maximum length of FASTA headers. Default: 16

ENV Vars:  If not specified at runtime, will be loaded from your $PATH
  --EVM_HOME
  --AUGUSTUS_CONFIG_PATH
  --GENEMARK_PATH
  --BAMTOOLS_PATH

wangpeng-design commented 2 weeks ago

Thank you!