Open libradaatencio opened 2 years ago
As far as I know you don't need a valid species name; anything inside the quotes should work. Can you try single quotes? The name of the file doesn't matter.
If you are submitting to a cluster sometimes quotes are stripped. On our old cluster I think I had to do something like: -s '"'Genus species'"'
Yeah, if you are running it through a job script you might have to experiment with the quoting. This works on our Slurm cluster: https://github.com/stajichlab/funannotate_template/blob/main/pipeline/03_predict.sh
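To illustrate why the nested quoting suggested above can help, here is a minimal sketch. It assumes the cluster wrapper joins the already-parsed arguments back into one string and parses them a second time; that is only a guess at what "quotes are stripped" means on a given system. `parse_once` and `parse_twice` are hypothetical helper names, and Python's `shlex` stands in for the shell.

```python
import shlex


def parse_once(command_line):
    """Split a command line with POSIX shell quoting rules (quotes are removed)."""
    return shlex.split(command_line)


def parse_twice(command_line):
    """Simulate a wrapper that re-joins the arguments and parses them again.

    Assumption: the job submission layer behaves like this; real schedulers differ.
    """
    return shlex.split(" ".join(shlex.split(command_line)))


plain = "-s 'Genus species'"
nested = """-s '"'Genus species'"'"""

print(parse_once(plain))    # ['-s', 'Genus species']
print(parse_twice(plain))   # ['-s', 'Genus', 'species']
print(parse_once(nested))   # ['-s', '"Genus', 'species"']
print(parse_twice(nested))  # ['-s', 'Genus species']
```

Under that assumption, the plain quotes are gone after the first pass, so the second pass splits the name in two. The nested form leaves literal double quotes behind for the second pass to consume.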
Hello, thanks for your help. Here I share the command used for funannotate predict (on a cluster) and the log file. I am working with a fungal genome assembly; the genome was sequenced using Oxford Nanopore.
[Apr 19 10:00 AM]: OS: CentOS Linux 7, 20 cores, ~ 197 GB RAM. Python: 3.7.11
[Apr 19 10:00 AM]: Running funannotate v1.8.9
[Apr 19 10:00 AM]: Skipping CodingQuarry as $QUARRY_PATH not found as ENV
[Apr 19 10:00 AM]: Parsed training data, run ab-initio gene predictors as follows:
Program      Training-Method
augustus     busco
genemark     selftraining
glimmerhmm   busco
snap         busco
[Apr 19 10:00 AM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Apr 19 10:00 AM]: Genome loaded: 137 scaffolds; 58,238,831 bp; 8.11% repeats masked
[Apr 19 10:00 AM]: Mapping 553,202 proteins to genome using diamond and exonerate
[Apr 19 10:21 AM]: Found 529,006 preliminary alignments --> aligning with exonerate
Progress: 1.22%
Progress: 2.46%
Progress: 3.74%
Progress: 4.51%
Progress: 5.65%
Progress: 6.61%
Progress: 7.83%
Progress: 8.84%
Progress: 10.11%
Progress: 10.88%
Progress: 12.10%
Progress: 13.14%
Progress: 14.36%
Progress: 16.83%
Progress: 19.32%
Progress: 21.76%
Progress: 24.96%
Progress: 26.44%
Progress: 28.67%
Progress: 31.19%
Progress: 33.49%
Progress: 39.19%
Progress: 41.78%
Progress: 44.39%
Progress: 47.01%
Progress: 49.18%
Progress: 51.45%
Progress: 54.06%
Progress: 56.34%
Progress: 58.76%
Progress: 61.45%
Progress: 66.86%
Progress: 69.28%
Progress: 71.72%
Progress: 74.16%
Progress: 76.80%
Progress: 79.37%
Progress: 81.48%
Progress: 83.55%
Progress: 85.66%
Progress: 88.21%
Progress: 93.97%
Progress: 96.23%
Progress: 98.50%
finished: found 1,485 alignments
[Apr 19 11:05 AM]: Running GeneMark-ES on assembly
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings: LANGUAGE = (unset), LC_ALL = (unset), LC_CTYPE = "UTF-8", LANG = "en_US.UTF-8" are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
[Apr 19 11:24 AM]: 14,395 predictions from GeneMark
[Apr 19 11:24 AM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Apr 19 11:33 AM]: 11 valid BUSCO predictions found, validating protein sequences
[Apr 19 11:33 AM]: 11 BUSCO predictions validated
[Apr 19 11:33 AM]: Not enough gene models 11 to train Augustus (200 required), exiting
augustus: 3.4.0 is incompatible with the internal BUSCO in funannotate. Downgrade augustus to a version below 3.4.
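A small sketch of how that version constraint could be checked against the pasted `funannotate check --show-versions` text. The "below 3.4" limit is taken from the reply above; `augustus_version` and `is_busco_compatible` are hypothetical helpers written for illustration and are not part of funannotate.

```python
import re


def augustus_version(check_output):
    """Extract the Augustus version as a tuple of ints from the dependency listing.

    Returns None when no 'augustus: X.Y.Z' entry is present.
    """
    match = re.search(r"augustus:\s*(\d+(?:\.\d+)*)", check_output)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


def is_busco_compatible(version, limit=(3, 4)):
    """True when the version is strictly below the limit (3.4 per the reply above)."""
    return version is not None and version < limit


sample = "CodingQuarry: 2.0 Trinity: 2.8.6 augustus: 3.4.0 bamtools: bamtools 2.5.2"
found = augustus_version(sample)
print(found, is_busco_compatible(found))  # (3, 4, 0) False
```

Tuple comparison is used so that 3.4.0 counts as not below 3.4, while something like 3.3.3 would pass.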
Are you using the latest release?
I am using funannotate v1.8.9 for structural and functional annotation of an endophytic fungus. I am following the tutorial for genome-only annotation. The assembly was cleaned, sorted and masked; I did not have problems running that part of the pipeline.
What command did you issue?
funannotate predict -i LCM1078_masked.fasta -o fun --species “Menisporopsis coffea” --strain LCM1078 --busco_seed_species neurospora_crassa --cpus 12
Describe the bug
usage: funannotate-predict.py [options] -i genome.fasta
funannotate-predict.py: error: unrecognized arguments: coffea”
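One possible reading of this error: the quotes around the species name in the pasted command are typographic (“ ”) rather than straight, and a POSIX-style parser does not treat typographic quotes as quoting characters. Whether they were in the command actually run or were introduced when pasting is not known. The sketch below uses Python's `shlex` as a stand-in for the shell and a toy `argparse` parser (`toy-predict` is hypothetical, not funannotate's real parser) to show how the second word ends up as a stray argument.

```python
import argparse
import shlex


def parse_species(command_line):
    """Return (species, leftover_args) for a funannotate-like option string."""
    parser = argparse.ArgumentParser(prog="toy-predict")  # hypothetical stand-in
    parser.add_argument("-s", "--species")
    parser.add_argument("--strain")
    # parse_known_args collects unrecognized tokens instead of exiting with an error
    args, extra = parser.parse_known_args(shlex.split(command_line))
    return args.species, extra


curly = "--species “Menisporopsis coffea” --strain LCM1078"
straight = '--species "Menisporopsis coffea" --strain LCM1078'

print(parse_species(curly))     # ('“Menisporopsis', ['coffea”'])
print(parse_species(straight))  # ('Menisporopsis coffea', [])
```

With straight quotes the two words stay together as one value; with typographic quotes the leftover token matches the one named in the error message.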
Questions: I don’t have a specific species name. What can I do if the strain does not have a species name? I also tried “Menisporopsis sp”. Should I try a species name that already exists?
My input file is LCM1078_masked.fasta (the masked assembly). Should I rename it to genome.fasta so that the command recognizes it?
OS/Install Information
funannotate check --show-versions
Checking dependencies for 1.8.9
You are running Python v 3.7.11. Now checking python packages...
biopython: 1.79
goatools: 1.1.12
matplotlib: 3.5.1
natsort: 8.1.0
numpy: 1.21.5
pandas: 1.3.5
psutil: 5.9.0
requests: 2.27.1
scikit-learn: 1.0.2
scipy: 1.7.3
seaborn: 0.11.2
All 11 python packages installed
You are running Perl v b'5.016003'. Now checking perl modules...
Bio::Perl: 1.7.4
Carp: 1.26
Clone: 0.45
DBD::SQLite: 1.39
DBD::mysql: 4.023
DBI: 1.627
DB_File: 1.83
Data::Dumper: 2.145
File::Basename: 2.84
File::Which: 1.27
Getopt::Long: 2.4
Hash::Merge: 0.302
JSON: 2.59
LWP::UserAgent: 6.05
Logger::Simple: 2.0
POSIX: 1.30
Parallel::ForkManager: 2.02
Pod::Usage: 1.63
Scalar::Util::Numeric: 0.40
Storable: 2.45
Text::Soundex: 3.04
Thread::Queue: 3.02
Tie::File: 0.98
URI::Escape: 3.31
YAML: 0.84
threads: 1.87
threads::shared: 1.43
All 27 Perl modules installed
Checking Environmental Variables...
$FUNANNOTATE_DB=/data/funannotate_db
$TRINITY_HOME=/opt/trinityrnaseq-v2.8.6
$EVM_HOME=/opt/EVidenceModeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/opt/Augustus/config
$GENEMARK_PATH=/opt/gmes_linux_64
ERROR: PASAHOME not set. export PASAHOME=/path/to/dir
Checking external dependencies...
CodingQuarry: 2.0
Trinity: 2.8.6
augustus: 3.4.0
bamtools: bamtools 2.5.2
bedtools: bedtools v2.30.0
blat: BLAT v37x1
diamond: 2.0.13
emapper.py: 2.1.6-43-gd6e6cdf
ete3: 3.1.2
exonerate: exonerate 2.2.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2021-12-17
gmes_petap.pl: 4.68_lic
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 1.8.0_302
kallisto: 0.46.1
mafft: v7.490 (2021/Oct/30)
makeblastdb: makeblastdb 2.11.0+
minimap2: 2.24-r1122
proteinortho: 6.0.33
pslCDnaFilter: no way to determine
salmon: salmon 1.6.0
samtools: samtools 1.11
signalp: seqfile
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 26
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.11.0+
trimal: trimAl 1.2rev59
ERROR: trimmomatic not installed
Thanks for your help. Librada Atencio