IanDMedeiros opened 2 years ago
This seems related to these errors, which suggest that BUSCO did not parse the output properly:
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092603EH.faa.1):
INFO [hmmersearch] Line 2: illegal character %
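One way to confirm that the extracted protein files really contain stray characters is to scan them before HMMER does. The following is a minimal sketch, not part of funannotate or BUSCO: the function name `find_illegal_characters` and the `ALLOWED` alphabet are assumptions made for illustration, and HMMER's actual accepted alphabet may differ.

```python
# Hypothetical helper (not a funannotate/BUSCO function): report characters in
# protein FASTA text that fall outside an assumed amino-acid alphabet.
# ALLOWED is an assumption for illustration, not HMMER's exact symbol table.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ*-")


def find_illegal_characters(fasta_text):
    """Return (line_number, character) pairs for unexpected sequence characters."""
    problems = []
    for line_number, line in enumerate(fasta_text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith(">"):
            continue  # skip blank lines and FASTA headers
        for character in stripped:
            if character.upper() not in ALLOWED:
                problems.append((line_number, character))
    return problems


if __name__ == "__main__":
    # Made-up record that mimics the "Line 2: illegal character %" report.
    example = ">g1.t1\nMKV%LLA\n"
    print(find_illegal_characters(example))  # [(2, '%')]
```

Running this over the contents of a file such as EOG092603EH.faa.1 should point at the same line HMMER complains about, which helps tell a malformed Augustus output apart from a HMMER problem.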
This seems like the error you get when you use Augustus 3.4.
So it seems the Augustus config path environment variable is still pointing to the conda install ($AUGUSTUS_CONFIG_PATH=/hpc/home/idm7/miniconda3/envs/annotate/config/), so how did you "replace" augustus? You should probably uninstall the conda version, i.e. I think you can use conda remove, but you need --force to delete only that package. After you remove the conda augustus, you can manually set AUGUSTUS_CONFIG_PATH back to your system location. I'd first try to fix that and then rerun funannotate setup, because it copies the Augustus parameter files from that location. Maybe the parameter files for the version from mamba are incompatible with your v3.3.3?
The other thing to try would be to downgrade hmmer to a previous version, as maybe that is the reason for the failure? I've never seen this be the problem before, though; it is most likely augustus (it nearly always is an augustus issue). The other culprit for a while was multithreaded tblastn, but it seems like you have an older version, so that shouldn't be an issue.
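To see which Augustus and HMMER installs a given environment actually resolves, a small check like the one below may help. It is only a sketch: `describe_augustus_setup` is a hypothetical helper, and the assumption that a usable config directory contains a `species` subdirectory is based on the usual Augustus layout rather than on anything funannotate enforces.

```python
import os
import shutil


def describe_augustus_setup(environ=None, which=shutil.which):
    """Summarise which binaries and config directory the environment resolves.

    Hypothetical helper for illustration; `environ` and `which` are injectable
    so the function can be exercised without a real install.
    """
    environ = os.environ if environ is None else environ
    config_path = environ.get("AUGUSTUS_CONFIG_PATH")
    # Assumption: a valid Augustus config directory has a "species" subfolder.
    has_species = bool(config_path) and os.path.isdir(
        os.path.join(config_path, "species")
    )
    return {
        "augustus_binary": which("augustus"),
        "hmmsearch_binary": which("hmmsearch"),
        "config_path": config_path,
        "config_has_species_dir": has_species,
    }


if __name__ == "__main__":
    for key, value in describe_augustus_setup().items():
        print(f"{key}: {value}")
```

If the reported binary and the config path come from different installs (for example a system augustus with a conda config directory), that mismatch is worth fixing before rerunning funannotate setup.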
Thanks for the quick reply. I think the piece I was missing was re-running funannotate setup with the new augustus. (/hpc/home/idm7/miniconda3/envs/annotate/config/ is actually the config directory associated with the new augustus.) With that correction, funannotate predict successfully completed in the test (it is still throwing an error that there are not enough gene models to train augustus, but that is now coming at the very end, after funannotate predict completes). I am now trying the pipeline with actual data.
[editing my comment, I now see that HMMER is there as hmmscan and hmmsearch]
#########################################################
Running `funannotate clean` unit testing: minimap2 mediated assembly duplications
Downloading: https://osf.io/8pjbe/download?version=1 Bytes: 252076
[DOWNLOAD PROGRESS OMITTED FOR SPACE]
6 input contigs, 6 larger than 500 bp, N50 is 427,039 bp
Checking duplication of 6 contigs
6 input contigs; 6 larger than 500 bp; 3 duplicated; 3 written to file
CMD: funannotate clean -i test.clean.fa -o test.exhaustive.fa --exhaustive
#########################################################
#########################################################
SUCCESS: `funannotate clean` test complete.
#########################################################
Running `funannotate mask` unit testing: RepeatModeler --> RepeatMasker
Downloading: https://osf.io/hbryz/download?version=1 Bytes: 375687
[DOWNLOAD PROGRESS OMITTED FOR SPACE]
[Jun 30 01:06 AM]: OS: CentOS Stream 8, 46 cores, ~ 230 GB RAM. Python: 3.8.12
[Jun 30 01:06 AM]: Running funanotate v1.8.11
[Jun 30 01:06 AM]: Soft-masking simple repeats with tantan
[Jun 30 01:06 AM]: Repeat soft-masking finished:
Masked genome: /hpc/group/bio1/ian/envs/test-mask_e3d485dd-d34b-4a2b-b684-f7b62bc53bba/test.masked.fa
num scaffolds: 2
assembly size: 1,216,048 bp
masked repeats: 50,965 bp (4.19%)
CMD: funannotate mask -i test.fa -o test.masked.fa --cpus 16
#########################################################
#########################################################
SUCCESS: `funannotate mask` test complete.
#########################################################
Running `funannotate predict` unit testing
Downloading: https://osf.io/te2pf/download?version=1 Bytes: 1489808
[DOWNLOAD PROGRESS OMITTED FOR SPACE]
-------------------------------------------------------
[Jun 30 01:06 AM]: OS: CentOS Stream 8, 46 cores, ~ 230 GB RAM. Python: 3.8.12
[Jun 30 01:06 AM]: Running funannotate v1.8.11
[Jun 30 01:06 AM]: Skipping CodingQuarry as no --rna_bam passed
[Jun 30 01:06 AM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus pretrained
Run InterProScan (manual install): funannotate iprscan -i annotate -c 16
Run antiSMASH (optional): funannotate remote -i annotate -m antismash -e youremail@server.edu
[Jun 30 01:18 AM]: Training parameters file saved: annotate/predict_results/saccharomyces.parameters.json
[Jun 30 01:18 AM]: Add species parameters to database:
funannotate species -s saccharomyces -a annotate/predict_results/saccharomyces.parameters.json
[Jun 30 01:19 AM]: OS: CentOS Stream 8, 46 cores, ~ 230 GB RAM. Python: 3.8.12
[Jun 30 01:19 AM]: Running funannotate v1.8.11
[Jun 30 01:19 AM]: Skipping CodingQuarry as no --rna_bam passed
[Jun 30 01:19 AM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus busco
glimmerhmm busco
snap busco
[Jun 30 01:19 AM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Jun 30 01:19 AM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
[Jun 30 01:19 AM]: Mapping 1,065 proteins to genome using diamond and exonerate
[Jun 30 01:19 AM]: Found 1,505 preliminary alignments with diamond in 0:00:01 --> generated FASTA files for exonerate in 0:00:00
[PROGRESS UPDATES OMITTED FOR SPACE]
Progress: 99.93%
[Jun 30 01:19 AM]: Exonerate finished in 0:00:19: found 1,270 alignments
[Jun 30 01:19 AM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Jun 30 01:24 AM]: 175 valid BUSCO predictions found, validating protein sequences
[Jun 30 01:24 AM]: 175 BUSCO predictions validated
[Jun 30 01:24 AM]: Not enough gene models 175 to train Augustus (200 required), exiting
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 16 --species Awesome testicus
#########################################################
#########################################################
SUCCESS: `funannotate predict` test complete.
#########################################################
#########################################################
Running `funannotate predict` BUSCO-mediated training unit testing
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --cpus 16 --species Awesome busco
#########################################################
#########################################################
Traceback (most recent call last):
File "/hpc/home/idm7/miniconda3/envs/annotate/bin/funannotate", line 10, in
No, still failing on real data...
[Jun 30 01:32 AM]: OS: CentOS Stream 8, 12 cores, ~ 33 GB RAM. Python: 3.8.12
[Jun 30 01:32 AM]: Running funannotate v1.8.11
[Jun 30 01:32 AM]: Skipping CodingQuarry as no --rna_bam passed
[Jun 30 01:32 AM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus busco
glimmerhmm busco
snap busco
[Jun 30 01:32 AM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Jun 30 01:32 AM]: Genome loaded: 74 scaffolds; 26,734,419 bp; 2.37% repeats masked
[Jun 30 01:32 AM]: Mapping 554,696 proteins to genome using diamond and exonerate
[Jun 30 01:35 AM]: Found 293,585 preliminary alignments with diamond in 0:01:35 --> generated FASTA files for exonerate in 0:00:58
[PROGRESS OMITTED FOR SPACE]
[Jun 30 02:34 AM]: Exonerate finished in 0:59:08: found 1,406 alignments
[Jun 30 02:35 AM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Jun 30 02:45 AM]: 0 valid BUSCO predictions found, validating protein sequences
Traceback (most recent call last):
File "/hpc/home/idm7/miniconda3/envs/annotate/bin/funannotate", line 10, in
The test is still somewhat failing for the same reason -- a successful test should produce > 200 models from BUSCO. So I think it is still related to augustus.
FYI you can use --no-progress to suppress that annoying progress meter on HPC.
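To compare runs quickly, the BUSCO counts can be pulled out of the funannotate log and checked against the 200-model requirement quoted in the log. This is a rough sketch under stated assumptions: `busco_training_counts` is a hypothetical helper, the regular expressions are matched only against the message wording shown in this thread, and treating exactly 200 as sufficient is a guess.

```python
import re

REQUIRED_MODELS = 200  # taken from the "(200 required)" log message

# Patterns based only on the log wording quoted in this thread.
VALID_RE = re.compile(r"([\d,]+) valid BUSCO predictions found")
VALIDATED_RE = re.compile(r"([\d,]+) BUSCO predictions validated")


def busco_training_counts(log_text):
    """Extract BUSCO prediction counts from funannotate predict log text."""

    def first_count(pattern):
        match = pattern.search(log_text)
        return int(match.group(1).replace(",", "")) if match else None

    found = first_count(VALID_RE)
    validated = first_count(VALIDATED_RE)
    usable = validated if validated is not None else found
    return {
        "found": found,
        "validated": validated,
        # Assumption: meeting the threshold exactly counts as enough.
        "enough_for_augustus": usable is not None and usable >= REQUIRED_MODELS,
    }


if __name__ == "__main__":
    failing = (
        "[Jun 30 01:24 AM]: 175 valid BUSCO predictions found, validating protein sequences\n"
        "[Jun 30 01:24 AM]: 175 BUSCO predictions validated\n"
    )
    print(busco_training_counts(failing))
```

On the failing test log this reports 175 found and 175 validated, which is below the threshold; on the real-data run it would report 0 found and nothing validated.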
Here is my successful test:
$ ./funannotate_dev/funannotate-docker test -t predict --cpus 4
#########################################################
Running `funannotate predict` unit testing
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 4 --species Awesome testicus
#########################################################
-------------------------------------------------------
[Jun 30 09:01 AM]: OS: Debian GNU/Linux 10, 4 cores, ~ 8 GB RAM. Python: 3.8.13
[Jun 30 09:01 AM]: Running funannotate v1.8.12
[Jun 30 09:01 AM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Jun 30 09:01 AM]: Skipping CodingQuarry as no --rna_bam passed
[Jun 30 09:01 AM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus pretrained
glimmerhmm busco
snap busco
[Jun 30 09:01 AM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Jun 30 09:01 AM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
[Jun 30 09:01 AM]: Mapping 1,065 proteins to genome using diamond and exonerate
[Jun 30 09:01 AM]: Found 1,505 preliminary alignments with diamond in 0:00:02 --> generated FASTA files for exonerate in 0:00:00
[Jun 30 09:02 AM]: Exonerate finished in 0:00:34: found 1,270 alignments
[Jun 30 09:02 AM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Jun 30 09:14 AM]: 373 valid BUSCO predictions found, validating protein sequences
[Jun 30 09:15 AM]: 370 BUSCO predictions validated
[Jun 30 09:15 AM]: Running Augustus gene prediction using saccharomyces parameters
[Jun 30 09:18 AM]: 1,485 predictions from Augustus
[Jun 30 09:18 AM]: Pulling out high quality Augustus predictions
[Jun 30 09:18 AM]: Found 371 high quality predictions from Augustus (>90% exon evidence)
[Jun 30 09:18 AM]: Running SNAP gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[Jun 30 09:20 AM]: 1,362 predictions from SNAP
[Jun 30 09:20 AM]: Running GlimmerHMM gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[Jun 30 09:23 AM]: 1,769 predictions from GlimmerHMM
[Jun 30 09:23 AM]: Summary of gene models passed to EVM (weights):
Source Weight Count
Augustus 1 1325
Augustus HiQ 2 372
GlimmerHMM 1 1769
snap 1 1362
Total - 4828
[Jun 30 09:23 AM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
[Jun 30 09:34 AM]: Converting to GFF3 and collecting all EVM results
[Jun 30 09:34 AM]: 1,695 total gene models from EVM
[Jun 30 09:34 AM]: Generating protein fasta files from 1,695 EVM models
[Jun 30 09:34 AM]: now filtering out bad gene models (< 50 aa in length, transposable elements, etc).
[Jun 30 09:34 AM]: Found 137 gene models to remove: 0 too short; 0 span gaps; 137 transposable elements
[Jun 30 09:34 AM]: 1,558 gene models remaining
[Jun 30 09:34 AM]: Predicting tRNAs
[Jun 30 09:35 AM]: 112 tRNAscan models are valid (non-overlapping)
[Jun 30 09:35 AM]: Generating GenBank tbl annotation file
[Jun 30 09:35 AM]: Collecting final annotation files for 1,670 total gene models
[Jun 30 09:35 AM]: Converting to final Genbank format
[Jun 30 09:35 AM]: Funannotate predict is finished, output files are in the annotate/predict_results folder
[Jun 30 09:35 AM]: Your next step might be functional annotation, suggested commands:
-------------------------------------------------------
Run InterProScan (manual install):
funannotate iprscan -i annotate -c 4
Run antiSMASH (optional):
funannotate remote -i annotate -m antismash -e youremail@server.edu
Annotate Genome:
funannotate annotate -i annotate --cpus 4 --sbt yourSBTfile.txt
-------------------------------------------------------
[Jun 30 09:35 AM]: Training parameters file saved: annotate/predict_results/saccharomyces.parameters.json
[Jun 30 09:35 AM]: Add species parameters to database:
funannotate species -s saccharomyces -a annotate/predict_results/saccharomyces.parameters.json
#########################################################
SUCCESS: `funannotate predict` test complete.
#########################################################
And then the busco.log:
INFO ****************** Start a BUSCO 2.0 analysis, current time: 06/30/2022 09:02:33 ******************
INFO The lineage dataset is: dikarya_odb9 (eukaryota)
INFO Mode is: genome
INFO Maximum number of regions limited to: 3
INFO To reproduce this run: python /venv/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/genome.softmasked.fa -o saccharomyces -l /opt/databases/dikarya/ -m genome -c 4 -sp anidulans
INFO Check dependencies...
INFO Check input file...
INFO Temp directory is ./tmp/
INFO ****** Phase 1 of 2, initial predictions ******
INFO ****** Step 1/3, current time: 06/30/2022 09:02:33 ******
INFO Create blast database...
INFO [makeblastdb] Building a new DB, current time: 06/30/2022 09:02:33
INFO [makeblastdb] New DB name: /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco/tmp/saccharomyces_3039243825
INFO [makeblastdb] New DB title: /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/genome.softmasked.fa
INFO [makeblastdb] Sequence type: Nucleotide
INFO [makeblastdb] Keep Linkouts: T
INFO [makeblastdb] Keep MBits: T
INFO [makeblastdb] Maximum file size: 1000000000B
INFO [makeblastdb] Adding sequences from FASTA; added 6 sequences in 0.0503011 seconds.
INFO Running tblastn, writing output to /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco/run_saccharomyces/blast_output/tblastn_saccharomyces.tsv...
INFO ****** Step 2/3, current time: 06/30/2022 09:02:42 ******
INFO Getting coordinates for candidate regions...
INFO Pre-Augustus scaffold extraction...
INFO Running Augustus prediction using anidulans as species:
INFO [augustus] Please find all logs related to Augustus here: /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco/run_saccharomyces/augustus_output/augustus.log
INFO 06/30/2022 09:02:42 => 0% of predictions performed (743 to be done)
INFO 06/30/2022 09:04:21 => 10% of predictions performed (75/743 candidate regions)
INFO 06/30/2022 09:05:44 => 20% of predictions performed (149/743 candidate regions)
INFO 06/30/2022 09:07:25 => 30% of predictions performed (223/743 candidate regions)
INFO 06/30/2022 09:08:50 => 40% of predictions performed (298/743 candidate regions)
INFO 06/30/2022 09:09:50 => 50% of predictions performed (372/743 candidate regions)
INFO 06/30/2022 09:10:44 => 60% of predictions performed (446/743 candidate regions)
INFO 06/30/2022 09:11:28 => 70% of predictions performed (521/743 candidate regions)
INFO 06/30/2022 09:12:15 => 80% of predictions performed (595/743 candidate regions)
INFO 06/30/2022 09:12:51 => 90% of predictions performed (669/743 candidate regions)
INFO 06/30/2022 09:13:22 => 100% of predictions performed
INFO Extracting predicted proteins...
INFO ****** Step 3/3, current time: 06/30/2022 09:13:42 ******
INFO Running HMMER to confirm orthology of predicted proteins:
INFO 06/30/2022 09:13:42 => 0% of predictions performed (686 to be done)
INFO 06/30/2022 09:13:43 => 10% of predictions performed (69/686 candidate proteins)
INFO 06/30/2022 09:13:44 => 20% of predictions performed (139/686 candidate proteins)
INFO 06/30/2022 09:13:45 => 30% of predictions performed (208/686 candidate proteins)
INFO 06/30/2022 09:13:46 => 40% of predictions performed (277/686 candidate proteins)
INFO 06/30/2022 09:13:48 => 50% of predictions performed (346/686 candidate proteins)
INFO 06/30/2022 09:13:49 => 60% of predictions performed (412/686 candidate proteins)
INFO 06/30/2022 09:13:51 => 70% of predictions performed (481/686 candidate proteins)
INFO 06/30/2022 09:13:53 => 80% of predictions performed (549/686 candidate proteins)
INFO 06/30/2022 09:13:55 => 90% of predictions performed (619/686 candidate proteins)
INFO 06/30/2022 09:13:57 => 100% of predictions performed
INFO Results:
INFO C:28.9%[S:28.4%,D:0.5%],F:0.8%,M:70.3%,n:1312
INFO 380 Complete BUSCOs (C)
INFO 373 Complete and single-copy BUSCOs (S)
INFO 7 Complete and duplicated BUSCOs (D)
INFO 10 Fragmented BUSCOs (F)
INFO 922 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched
INFO ****** Phase 2 of 2, predictions using species specific training ******
INFO ****** Step 1/3, current time: 06/30/2022 09:13:58 ******
INFO Extracting missing and fragmented buscos from the ancestral_variants file...
WARNING The busco id(s) ['EOG092647L7', 'EOG09264L0C', 'EOG09262X7T', 'EOG09264RR2', 'EOG09264RIE', 'EOG09265SHM', 'EOG09262TO9', 'EOG0926448Q', 'EOG092641G3', 'EOG09262O0R', 'EOG092610VI', 'EOG09261V2P', 'EOG0926400M', 'EOG09261DG0', 'EOG09264881', 'EOG0926457R', 'EOG09264331', 'EOG09260LRX', 'EOG092643IE', 'EOG09260JNY', 'EOG09260Z5E', 'EOG0926458I', 'EOG092657UN', 'EOG09264Z3D', 'EOG092629RT', 'EOG092612XD', 'EOG09261QR8', 'EOG09265FTY', 'EOG09260JO5', 'EOG09262V8E', 'EOG092630YS', 'EOG092644ZU', 'EOG092621CK', 'EOG09262FE3', 'EOG092624X0', 'EOG09261NLY', 'EOG09260WCZ', 'EOG09260WGT', 'EOG09262KUJ', 'EOG09262N0U', 'EOG09262SI7', 'EOG092645OU', 'EOG092632TF', 'EOG09261JCQ', 'EOG09264XTW', 'EOG09261XNJ', 'EOG09264XT5', 'EOG09263IT0', 'EOG09264U78', 'EOG09262NB1', 'EOG092610TN', 'EOG09264ABT', 'EOG092610ZY', 'EOG09263Q8J', 'EOG09264FXB', 'EOG09261W2O', 'EOG09263G4R', 'EOG0926510L', 'EOG09264OYZ', 'EOG092640WA', 'EOG092602FH', 'EOG09262UTQ', 'EOG092608WI', 'EOG09260EQD', 'EOG09263WM5', 'EOG09263FAK', 'EOG092605ZA', 'EOG09263FKC', 'EOG09263YBT', 'EOG09263R4M', 'EOG092659DX', 'EOG09261QXC', 'EOG092605VL', 'EOG09263ZSC', 'EOG09262WXK', 'EOG092644DY', 'EOG09260LVD', 'EOG092608AV', 'EOG09261727', 'EOG09261OXD', 'EOG09260EE7', 'EOG092611HB', 'EOG09262N3C', 'EOG09264DIM', 'EOG092604MJ', 'EOG09264HOY', 'EOG09264XPY', 'EOG092626HU', 'EOG092650I8', 'EOG0926315C', 'EOG09260DXP', 'EOG09263BG5', 'EOG092616QN', 'EOG09264OBA', 'EOG09264BIX', 'EOG092644N1', 'EOG09264DT4', 'EOG09262N47', 'EOG09263CLY', 'EOG09265I7S', 'EOG09262W7C', 'EOG092654LJ', 'EOG09260K29', 'EOG09264LH2', 'EOG09265822', 'EOG09263BE5', 'EOG09261OKK', 'EOG09260NE0', 'EOG09262D0D', 'EOG09262SWJ', 'EOG09261P7G', 'EOG0926306O', 'EOG0926140Q', 'EOG09261H5E', 'EOG092615IE', 'EOG092650VI', 'EOG09260BFE', 'EOG09261VI3', 'EOG092635DF', 'EOG09264SUZ', 'EOG09260VYK', 'EOG09264L06', 'EOG09264904', 'EOG09261S7X', 'EOG092631MU', 'EOG09261I0F', 'EOG09260LJ8', 'EOG09262MXH', 'EOG09263RW3', 'EOG09260HPO', 'EOG092658ZO', 
'EOG092610KH', 'EOG09263BDA', 'EOG09260OZU', 'EOG09263760', 'EOG09263KVG', 'EOG092615SM', 'EOG092629WA', 'EOG092652Y6', 'EOG09263Z8I', 'EOG0926115P', 'EOG09264398', 'EOG09262J7K', 'EOG09263TQ5', 'EOG09260W8D', 'EOG09260J97', 'EOG092600W1', 'EOG092604ZZ', 'EOG09262D4G', 'EOG09261UPM', 'EOG09264873', 'EOG09261LV7', 'EOG092605T6', 'EOG09260NAN', 'EOG09263EQZ', 'EOG09261V87', 'EOG09260EAZ', 'EOG09262528', 'EOG09260VEY', 'EOG09260MRU', 'EOG09260RRC', 'EOG09260APA', 'EOG092653VU', 'EOG09262YP5', 'EOG09261KRX', 'EOG092628SP', 'EOG09262YQG', 'EOG09263KDI', 'EOG092634ZL', 'EOG09264HU0', 'EOG0926357F', 'EOG09261TEQ', 'EOG09260AQB', 'EOG09265040', 'EOG092629ZN', 'EOG092605KN', 'EOG09262K8C', 'EOG09260OLB', 'EOG09261660', 'EOG092628HC', 'EOG09264G7L', 'EOG0926248P', 'EOG0926112A', 'EOG09260QNB', 'EOG09264ZDZ', 'EOG09262KVB', 'EOG09263SZM', 'EOG092656RK', 'EOG09263RY2', 'EOG0926158Y', 'EOG092654XA', 'EOG0926436T', 'EOG09264W71', 'EOG092619MJ', 'EOG09260TWS', 'EOG09260BYL', 'EOG09262GVX', 'EOG09264PE5', 'EOG09261CXQ', 'EOG09265GXF', 'EOG09260E2O', 'EOG09261476', 'EOG092649VG', 'EOG09262QTY', 'EOG092645U1', 'EOG092652ZZ', 'EOG09260JTZ', 'EOG092625AX', 'EOG09263CUQ', 'EOG09261Y04', 'EOG09263D2P', 'EOG0926049A', 'EOG09265G5K', 'EOG09264EGS', 'EOG09265BG5', 'EOG092619GP', 'EOG09264J8E', 'EOG09260RVQ', 'EOG09262GWQ', 'EOG09261801', 'EOG092658SK', 'EOG092644X6', 'EOG09264JQ1', 'EOG09265HEP', 'EOG092604KQ', 'EOG09260XQV', 'EOG09262ZZ8', 'EOG09265DWT', 'EOG09264U81', 'EOG09260BSW', 'EOG09260OCI', 'EOG09264W1U', 'EOG09262C5Z', 'EOG09262B1U', 'EOG092654VM', 'EOG09262MJW', 'EOG09261404', 'EOG09261I1I', 'EOG09261JR0', 'EOG0926025H', 'EOG092648LP', 'EOG09264R4M', 'EOG09264KTU', 'EOG09264T1U', 'EOG09264O2D', 'EOG09263EDC', 'EOG09264GMT', 'EOG092629FB', 'EOG092643Y5', 'EOG09264I6B', 'EOG092600SK', 'EOG09265KNL', 'EOG09264RJ7', 'EOG09264CA0', 'EOG09262645', 'EOG092617S2', 'EOG09260V5Q', 'EOG09260Z3X', 'EOG09262UB3', 'EOG092603SM', 'EOG0926195C', 'EOG09261QR5', 'EOG0926307V', 'EOG092609O9', 
'EOG09264OM7', 'EOG092654KW', 'EOG092634B5', 'EOG092658WY', 'EOG09262POL', 'EOG0926229Z', 'EOG09260274', 'EOG09264UJF', 'EOG09264L3W', 'EOG09260KM4', 'EOG09260JPV', 'EOG0926514P', 'EOG09260XH5', 'EOG09261RWU', 'EOG0926079Q', 'EOG092647CM', 'EOG09265552', 'EOG09262SR7', 'EOG092619RJ', 'EOG092652TN', 'EOG09263A3Y', 'EOG09260H6E', 'EOG092602OP', 'EOG09260RS7', 'EOG09260RGH', 'EOG09263LNF', 'EOG092645LS', 'EOG09263C55', 'EOG09262CBI', 'EOG09263FR7', 'EOG092620IA', 'EOG092608AE', 'EOG09260SJV', 'EOG092600SD', 'EOG09264OXC', 'EOG09260W52', 'EOG09264USX', 'EOG092605VU', 'EOG09264XYD', 'EOG09260DUR', 'EOG092604A0', 'EOG092649QJ', 'EOG092605FC', 'EOG09265DRM', 'EOG09261OXV', 'EOG092659OC', 'EOG092644WX', 'EOG09262U7S', 'EOG092618M2', 'EOG09264P74', 'EOG09262GNE', 'EOG09265F2Y', 'EOG09264LC7', 'EOG09263ULA', 'EOG09261PUF', 'EOG092638XA', 'EOG09264IDN', 'EOG09261HB6', 'EOG09265K4K', 'EOG092608ZS', 'EOG09260KUC', 'EOG092615CC', 'EOG09265QTV', 'EOG0926248W', 'EOG09262WHU', 'EOG0926213Q', 'EOG09264TQ5', 'EOG09260EPS', 'EOG09262GLP', 'EOG092646PE', 'EOG092624RX', 'EOG0926073O', 'EOG09264GQZ', 'EOG092638RC', 'EOG092634MM', 'EOG09261NW2', 'EOG09262R8O', 'EOG09264OML', 'EOG0926506Z', 'EOG09264G1I', 'EOG09260FL2', 'EOG09265CN0', 'EOG092654QZ', 'EOG09261DW8', 'EOG09260XSR', 'EOG0926133I', 'EOG09265G9U', 'EOG09260FZZ', 'EOG092645L9', 'EOG09262Z2S', 'EOG09264ZQC', 'EOG09262OX9', 'EOG09263QUM', 'EOG092619NK', 'EOG09264CP8', 'EOG09260RRN', 'EOG09260GR8', 'EOG09262VOC', 'EOG09264FVQ', 'EOG09260DUW', 'EOG09264BOA', 'EOG09262E4T', 'EOG09265BE5', 'EOG09261OSU', 'EOG0926049S', 'EOG09263UN3', 'EOG092603D0', 'EOG09264WF4', 'EOG092631IR', 'EOG09262BVA', 'EOG092631QQ', 'EOG09263I7I', 'EOG09264C3V', 'EOG09264ZDJ', 'EOG09260NZ8', 'EOG092653LT', 'EOG09260K24', 'EOG09262WSH', 'EOG09264C3N', 'EOG09263EVJ', 'EOG09260PI1', 'EOG0926131E', 'EOG092653YS', 'EOG092656JA', 'EOG09262VVF', 'EOG09260A98', 'EOG09261WVT', 'EOG09263720', 'EOG09260CKC', 'EOG092629WJ', 'EOG09261127', 'EOG092602MO', 'EOG09261UOJ', 
'EOG09264W46', 'EOG09264NJ1', 'EOG092610QT', 'EOG0926273Q', 'EOG09265FCK', 'EOG09265ER6', 'EOG09260GKG', 'EOG092608T8', 'EOG09261FM4', 'EOG09262A6N', 'EOG09263CGP', 'EOG092621CP', 'EOG09260779', 'EOG09265B95', 'EOG09261NLR', 'EOG09265RGI', 'EOG09261ZXZ', 'EOG09261K9J', 'EOG09263MR4', 'EOG09263E87', 'EOG09260WG2', 'EOG09262G8Y', 'EOG09260FMW', 'EOG09263G3M', 'EOG09260PJ9', 'EOG092657YR', 'EOG092646WF', 'EOG09264DY4', 'EOG09263HED', 'EOG09260931', 'EOG092636T6', 'EOG09260U6R', 'EOG09263MGE', 'EOG092619L1', 'EOG09260N2T', 'EOG09264IG5', 'EOG09264O9F', 'EOG09265H9T', 'EOG09260TDY', 'EOG09262X74', 'EOG0926213Z', 'EOG09261I8J', 'EOG09263GSR', 'EOG09262KJA', 'EOG09264O6J', 'EOG09262P1W', 'EOG09261N20', 'EOG09262QS5', 'EOG092603EH', 'EOG09260Y2Q', 'EOG09260XL0', 'EOG09263KZJ', 'EOG09264A2D', 'EOG09263K05', 'EOG092604S1', 'EOG09265M98', 'EOG09265PWR', 'EOG09260EOI', 'EOG09265LEG', 'EOG09264272', 'EOG09261G1Y', 'EOG09261666', 'EOG09264T0J', 'EOG09261IEV', 'EOG092605QM', 'EOG09261Q18', 'EOG092606WZ', 'EOG09261JVS', 'EOG09265CCT', 'EOG09261EY9', 'EOG09262QJW', 'EOG09260UXC', 'EOG0926077L', 'EOG09265KQ4', 'EOG09264AWW', 'EOG092600T9', 'EOG0926539T', 'EOG09262K67', 'EOG09263CG4', 'EOG09263DQH', 'EOG09265D7J', 'EOG09262BCZ', 'EOG092614UB', 'EOG09261EM7', 'EOG09264OQ8', 'EOG0926423H', 'EOG09264NEF', 'EOG092620CR', 'EOG092630MJ', 'EOG092651HW', 'EOG09262SMG', 'EOG09264E6Z', 'EOG092641M3', 'EOG09262A8G', 'EOG092601KZ', 'EOG092642I5', 'EOG09265DDU', 'EOG09263JTO', 'EOG09263OZR', 'EOG09262XRU', 'EOG0926534P', 'EOG09260DFH', 'EOG09260VCG', 'EOG09260VTN', 'EOG092618J9', 'EOG09264RJL', 'EOG09262IY3', 'EOG09264S3E', 'EOG09264V2U', 'EOG09262NNS', 'EOG09260E8K', 'EOG09261MOX', 'EOG09260M87', 'EOG09261CM0', 'EOG09262BHE', 'EOG09261W90', 'EOG09260WUA', 'EOG09260S2Z', 'EOG09264TJN', 'EOG092646CB', 'EOG09263RDD', 'EOG09264RBX', 'EOG09263OCO', 'EOG09263F11', 'EOG092648VW', 'EOG09264VRO', 'EOG09261ACJ', 'EOG09261OLD', 'EOG09264ENO', 'EOG09262PPU', 'EOG092612LP', 'EOG09263Z41', 'EOG09263X22', 
'EOG09261N64', 'EOG09261OIA', 'EOG09262VPD', 'EOG09262LQT', 'EOG092609YT', 'EOG092627F1', 'EOG09260FFP', 'EOG092602I6', 'EOG0926477X', 'EOG09261JWS', 'EOG09265GYD', 'EOG09264XUV', 'EOG092655L0', 'EOG092608L0', 'EOG0926419M', 'EOG09264PD5', 'EOG09260LI6', 'EOG09261IEH', 'EOG09261UWT', 'EOG09262CXO', 'EOG09262LI4', 'EOG09265KPR', 'EOG09264T3S', 'EOG092628FW', 'EOG092639H5', 'EOG09261FH7', 'EOG09262Q6S', 'EOG0926505R', 'EOG09264B3O', 'EOG092653SU', 'EOG09260S5R', 'EOG0926004Z', 'EOG092603KJ', 'EOG09263OWL', 'EOG09264DJ8', 'EOG09264T8I', 'EOG09264SQJ', 'EOG09262TEV', 'EOG092653NM', 'EOG09260VTA', 'EOG0926092K', 'EOG09262X01', 'EOG092649XV', 'EOG09262V3O', 'EOG092635YY', 'EOG09261YLQ', 'EOG09262FJB', 'EOG0926499W', 'EOG09260JJW', 'EOG092607OQ', 'EOG09260ERO', 'EOG09262V9N', 'EOG09260B3X', 'EOG09265I60', 'EOG09264VC6', 'EOG09261FAX', 'EOG09262XMN', 'EOG09264R0D', 'EOG092638CT', 'EOG09265EKJ', 'EOG09261SS1', 'EOG092602UY', 'EOG092616YZ', 'EOG09263RVR', 'EOG09261ICI', 'EOG09263UWJ', 'EOG09262N10', 'EOG09261ZJR', 'EOG09260KCB', 'EOG09263J3H', 'EOG09263FGN', 'EOG09264ZWF', 'EOG09264W7W', 'EOG09262UAS', 'EOG0926310O', 'EOG09261LPY', 'EOG0926071Q', 'EOG09262E3Q', 'EOG09262DPL', 'EOG09260AZK', 'EOG09264KDO', 'EOG092617RY', 'EOG09264HN5', 'EOG09260QVP', 'EOG09260375', 'EOG09264X31', 'EOG09264G04', 'EOG092609JB', 'EOG092606AJ', 'EOG092606AD', 'EOG09260NNR', 'EOG09261W1K', 'EOG092613UB', 'EOG09264I14', 'EOG09260H81', 'EOG09261XNU', 'EOG09260NWN', 'EOG09264IQ7', 'EOG09264CP4', 'EOG09265RGS', 'EOG09260GIX', 'EOG092658X5', 'EOG0926251E', 'EOG092621GA', 'EOG09263LR1', 'EOG09263E9V', 'EOG09262KZ3', 'EOG09262E7W', 'EOG092600NM', 'EOG09260RNZ', 'EOG09264IV9', 'EOG09262PMC', 'EOG09260R9L', 'EOG092631UM', 'EOG09260NC6', 'EOG09264P4J', 'EOG09260KGS', 'EOG09261B3Q', 'EOG09264ZXA', 'EOG09265B1X', 'EOG09262F22', 'EOG09263OAE', 'EOG09261V9P', 'EOG09261Q5L', 'EOG09264MGU', 'EOG09264KK7', 'EOG092611G7', 'EOG092644TW', 'EOG09261D4D', 'EOG09265PQX', 'EOG09263X1F', 'EOG09260KDB', 'EOG09263ZW6', 
'EOG09263690', 'EOG09260NHB', 'EOG09261J0P', 'EOG09262JZK', 'EOG09263L9T', 'EOG092656CM', 'EOG09263GIG', 'EOG09262516', 'EOG09263IMF', 'EOG09265FTN', 'EOG09262WQX', 'EOG09265K60', 'EOG09264SSI', 'EOG09263OD3', 'EOG09265GGX', 'EOG09264D7Y', 'EOG09260T4S', 'EOG09260WUS', 'EOG09261KHB', 'EOG09261EMF', 'EOG092617AN', 'EOG09265PUI', 'EOG09263K45', 'EOG0926489S', 'EOG09262HP3', 'EOG09263ZBJ', 'EOG092619VG', 'EOG09261T98', 'EOG09260NXC', 'EOG09261ZFN', 'EOG09260MBW', 'EOG09265313', 'EOG09265NHW', 'EOG09260289', 'EOG092624KK', 'EOG09263S2P', 'EOG09264719', 'EOG092624UF', 'EOG09261DRB', 'EOG09261MMR', 'EOG09261VD2', 'EOG09265JA7', 'EOG092613R2', 'EOG09262914', 'EOG09260KNI', 'EOG0926369X', 'EOG09265A08', 'EOG09264A8D', 'EOG09260J8F', 'EOG09260KNR', 'EOG09262TUR', 'EOG09264NC7', 'EOG09264XKX', 'EOG092621ZV', 'EOG092600S9', 'EOG09263CAC', 'EOG09262JRP', 'EOG09260NHN', 'EOG09260B65', 'EOG092643NE', 'EOG09262KXK', 'EOG09265AT5', 'EOG0926431P', 'EOG092620E4', 'EOG092605OK', 'EOG09260JDM', 'EOG092652KR', 'EOG09260LS9', 'EOG09260UA2', 'EOG092641A6', 'EOG09261RWJ', 'EOG09264VZ7', 'EOG09260WU6', 'EOG092641UM', 'EOG0926354S', 'EOG09263M8W', 'EOG09263PWF', 'EOG092658NW', 'EOG092612MY', 'EOG092632WW', 'EOG0926390Q', 'EOG09263QPR', 'EOG09265HP0', 'EOG09263OQH', 'EOG092628LW', 'EOG09263U08', 'EOG092612CC', 'EOG09263AZP', 'EOG092620U5', 'EOG09265OQH', 'EOG09261I1G', 'EOG09260SAH', 'EOG092643QM', 'EOG09263KRO', 'EOG09263817', 'EOG09263MEM', 'EOG09265BJ3', 'EOG09264BWL', 'EOG09263WB5', 'EOG09263X0V', 'EOG09262QRH', 'EOG09264YEG', 'EOG09264PDD', 'EOG092651BA', 'EOG092608WU', 'EOG09264MN3', 'EOG092643JW', 'EOG09262N5O', 'EOG092646VF', 'EOG09261YRA', 'EOG092633QB', 'EOG09261IOS', 'EOG09260C2V', 'EOG09263U71', 'EOG09261225', 'EOG092617RN', 'EOG09261AH9', 'EOG09263SFX', 'EOG0926129I', 'EOG09264JHE', 'EOG09262A65', 'EOG092619EA', 'EOG092648O6', 'EOG092643VB', 'EOG09260DP1', 'EOG09264YKY', 'EOG09263EBB', 'EOG092641K1', 'EOG0926388H', 'EOG09260GF5', 'EOG09263HD8', 'EOG09262PZ9', 'EOG092634G9', 
'EOG09265E8A', 'EOG092644O2', 'EOG09264PK5', 'EOG092652YI', 'EOG09264FYY', 'EOG09262387', 'EOG09265IT6', 'EOG09262X8R', 'EOG09264JK1', 'EOG09262M0W', 'EOG092645G0', 'EOG092648K5', 'EOG09264V30', 'EOG09262MFL', 'EOG092657H8', 'EOG09260SRF', 'EOG09265BTA', 'EOG09260TPT', 'EOG09264JW6', 'EOG09264K2W', 'EOG092629U5', 'EOG09262E98', 'EOG09265ANI', 'EOG0926142Y', 'EOG092645QN', 'EOG09262ILV', 'EOG09261B6Y', 'EOG09261ZPW', 'EOG09264SET', 'EOG092646C6', 'EOG09265JNA', 'EOG092636Y6', 'EOG092631ML', 'EOG09263W7L', 'EOG092627XA', 'EOG09261A3K', 'EOG092638EN', 'EOG09264RW6', 'EOG0926009O', 'EOG092651FJ', 'EOG09260HMA', 'EOG092648XW', 'EOG092646EZ', 'EOG092655SO', 'EOG09261YV6', 'EOG092634B1', 'EOG09264IIZ', 'EOG09260075', 'EOG092653KS', 'EOG09263MNN', 'EOG09263Y3L', 'EOG09260S3L', 'EOG09262PAY', 'EOG09261VUC', 'EOG09262M7B', 'EOG09262H1X', 'EOG09262YAU', 'EOG09262DC1', 'EOG09264HX6', 'EOG09261G92', 'EOG09263GUT', 'EOG0926506U', 'EOG09261RFF', 'EOG092624SJ', 'EOG092621F2', 'EOG09260N53', 'EOG09263J6Z', 'EOG092656IY', 'EOG09261PJZ', 'EOG09265JVH', 'EOG09260DBG', 'EOG09262341', 'EOG09263E5F', 'EOG09261V03', 'EOG09264XVU', 'EOG09263KB4', 'EOG09261HQU', 'EOG09262JWJ', 'EOG0926074Y', 'EOG092635ST', 'EOG0926115V', 'EOG09265EOF', 'EOG092635SS', 'EOG092614DJ', 'EOG09264DOU', 'EOG092600T4', 'EOG092626EQ', 'EOG09264LKR', 'EOG09262PIH', 'EOG09260ETR', 'EOG09264KIV', 'EOG09260OE9', 'EOG09261WJ8', 'EOG09265C25', 'EOG09264YIJ', 'EOG09263WZ2', 'EOG09262M2J', 'EOG09263RF8', 'EOG092603YJ', 'EOG092610LQ', 'EOG09261FAB', 'EOG09260KWP', 'EOG09262IP2', 'EOG0926137U', 'EOG09264F1U', 'EOG092648Q0', 'EOG09261G4Z', 'EOG09261B18', 'EOG09261N2L', 'EOG09260HS3', 'EOG092649VA', 'EOG0926591L', 'EOG09264XJC', 'EOG09264B2P', 'EOG09264DMU', 'EOG09264NNY', 'EOG092653O3', 'EOG092608RH', 'EOG092640BS', 'EOG09263JW5', 'EOG09262CDO', 'EOG092655M5', 'EOG09260EPQ', 'EOG09264CND', 'EOG09261F73', 'EOG09264LJU', 'EOG09263A5D', 'EOG09263DFA', 'EOG0926312D', 'EOG09260FKU', 'EOG09263FTE', 'EOG0926347W', 'EOG09264441', 
'EOG09262CUO', 'EOG09261MPU', 'EOG0926587S'] were not found in the ancestral_variants file
INFO Running tblastn, writing output to /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco/run_saccharomyces/blast_output/tblastn_saccharomyces_missing_and_frag_rerun.tsv...
INFO [tblastn] Warning: [tblastn] Query is Empty!
INFO Getting coordinates for candidate regions...
INFO ****** Step 2/3, current time: 06/30/2022 09:13:58 ******
INFO Training Augustus using Single-Copy Complete BUSCOs:
INFO 06/30/2022 09:13:58 => Converting predicted genes to short genbank files...
INFO 06/30/2022 09:14:04 => All files converted to short genbank files, now running the training scripts...
INFO Pre-Augustus scaffold extraction...
INFO Re-running Augustus with the new metaparameters, number of target BUSCOs: 932
INFO 06/30/2022 09:14:21 => 0% of predictions performed (0 to be done)
INFO 06/30/2022 09:14:21 => 100% of predictions performed
INFO Extracting predicted proteins...
INFO ****** Step 3/3, current time: 06/30/2022 09:14:21 ******
INFO Running HMMER to confirm orthology of predicted proteins:
INFO 06/30/2022 09:14:21 => 0% of predictions performed (0 to be done)
INFO 06/30/2022 09:14:21 => 100% of predictions performed
INFO Results:
INFO C:28.9%[S:28.4%,D:0.5%],F:0.8%,M:70.3%,n:1312
INFO 380 Complete BUSCOs (C)
INFO 373 Complete and single-copy BUSCOs (S)
INFO 7 Complete and duplicated BUSCOs (D)
INFO 10 Fragmented BUSCOs (F)
INFO 922 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched
INFO BUSCO analysis done with WARNING(s). Total running time: 711.759425163269 seconds
INFO Results written in /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco/run_saccharomyces/
INFO ****************** Start a BUSCO 2.0 analysis, current time: 06/30/2022 09:14:41 ******************
INFO The lineage dataset is: dikarya_odb9 (eukaryota)
INFO Mode is: proteins
INFO To reproduce this run: python /venv/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco_augustus.proteins.fasta -o saccharomyces -l /opt/databases/dikarya/ -m proteins -c 4 -sp anidulans
INFO Check dependencies...
INFO Check input file...
INFO Temp directory is ./tmp/
INFO Running HMMER on the proteins:
INFO 06/30/2022 09:14:42 => 0% of predictions performed (1312 to be done)
INFO 06/30/2022 09:14:44 => 10% of predictions performed (133/1312 candidate proteins)
INFO 06/30/2022 09:14:47 => 20% of predictions performed (264/1312 candidate proteins)
INFO 06/30/2022 09:14:49 => 30% of predictions performed (396/1312 candidate proteins)
INFO 06/30/2022 09:14:53 => 40% of predictions performed (525/1312 candidate proteins)
INFO 06/30/2022 09:14:57 => 50% of predictions performed (658/1312 candidate proteins)
INFO 06/30/2022 09:15:01 => 60% of predictions performed (788/1312 candidate proteins)
INFO 06/30/2022 09:15:06 => 70% of predictions performed (921/1312 candidate proteins)
INFO 06/30/2022 09:15:11 => 80% of predictions performed (1050/1312 candidate proteins)
INFO 06/30/2022 09:15:17 => 90% of predictions performed (1182/1312 candidate proteins)
INFO 06/30/2022 09:15:22 => 100% of predictions performed
INFO Results:
INFO C:28.3%[S:28.2%,D:0.1%],F:0.0%,M:71.7%,n:1312
INFO 371 Complete BUSCOs (C)
INFO 370 Complete and single-copy BUSCOs (S)
INFO 1 Complete and duplicated BUSCOs (D)
INFO 0 Fragmented BUSCOs (F)
INFO 941 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched
INFO BUSCO analysis done. Total running time: 42.64242601394653 seconds
INFO Results written in /Users/jon/test-predict_6004e1e9-1eda-4469-a5b0-c06c104a1135/annotate/predict_misc/busco_proteins/run_saccharomyces/
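For comparing runs like the ones in this thread, the per-category counts can be pulled out of a BUSCO log with a few lines of Python. This is only a sketch: it assumes one log entry per line, as in the output above, and the names `parse_busco_counts`, `enough_for_training` and `MIN_TRAINING_MODELS` are made up for illustration. The 200 threshold is taken from the "200 required" message reported in this issue, and whether funannotate treats exactly 200 as sufficient is an assumption here.

```python
import re

# Matches count lines such as "INFO 373 Complete and single-copy BUSCOs (S)".
COUNT_LINE = re.compile(r"(\d+)\s+.*\((C|S|D|F|M)\)\s*$")

# Assumption: taken from the "(200 required)" message quoted in this thread.
MIN_TRAINING_MODELS = 200


def parse_busco_counts(log_text):
    """Map each BUSCO category letter to its count (the last summary in the log wins)."""
    counts = {}
    for line in log_text.splitlines():
        match = COUNT_LINE.search(line)
        if match:
            counts[match.group(2)] = int(match.group(1))
    return counts


def enough_for_training(counts, minimum=MIN_TRAINING_MODELS):
    """True when the single-copy count reaches the assumed training threshold."""
    return counts.get("S", 0) >= minimum
```

With the counts from the run above (373 single-copy) the check passes; with the 175 reported in the failing run it does not.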
If it helps, I just pushed an update where the dependencies, versions, and full paths are printed to the logfile, ie:
[06/30/22 10:18:12]: /venv/bin/funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 4 --species Awesome testicus
[06/30/22 10:18:12]: OS: Debian GNU/Linux 10, 4 cores, ~ 8 GB RAM. Python: 3.8.13
[06/30/22 10:18:12]: Running funannotate v1.8.12
[06/30/22 10:18:12]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[06/30/22 10:18:12]: exonerate version=exonerate 2.4.0 path=/venv/bin/exonerate
[06/30/22 10:18:12]: diamond version=2.0.15 path=/venv/bin/diamond
[06/30/22 10:18:12]: tbl2asn version=no way to determine, likely 25.X path=/venv/bin/tbl2asn
[06/30/22 10:18:12]: bedtools version=bedtools v2.30.0 path=/venv/bin/bedtools
[06/30/22 10:18:12]: augustus version=3.3.2 path=/usr/bin/augustus
[06/30/22 10:18:12]: etraining version=NA path=/usr/bin/etraining
[06/30/22 10:18:12]: tRNAscan-SE version=2.0.9 (July 2021) path=/venv/bin/tRNAscan-SE
[06/30/22 10:18:12]: bam2hints version=NA path=/usr/bin/bam2hints
[06/30/22 10:18:12]: minimap2 version=2.24-r1122 path=/venv/bin/minimap2
[06/30/22 10:18:12]: $AUGUSTUS_CONFIG_PATH=/usr/share/augustus/config
[06/30/22 10:18:13]: {'augustus': 1, 'hiq': 2, 'genemark': 0, 'pasa': 6, 'codingquarry': 0, 'snap': 1, 'glimmerhmm': 1, 'proteins': 1, 'transcripts': 1}
[06/30/22 10:18:13]: Skipping CodingQuarry as no --rna_bam passed
You can upgrade with pip, ie:
python -m pip install git+https://github.com/nextgenusfs/funannotate.git
Ultimately I think this is a problem with Augustus, though I'm not entirely sure what the issue is. It is outputting data as if it were Augustus 3.4, which is incompatible with the internal BUSCO script in funannotate. I've not seen any version < 3.4 fail like this before.
I tried upgrading as you suggested and got a new error:
#########################################################
Running funannotate clean
unit testing: minimap2 mediated assembly duplications
Downloading: https://osf.io/8pjbe/download?version=1 Bytes: 252076
8192 [3.25%]16384 [6.50%]24576 [9.75%]32768 [13.00%]40960 [16.25%]49152 [19.50%]57344 [22.75%]65536 [26.00%]73728 [29.25%]81920 [32.50%]90112 [35.75%]98304 [39.00%]106496 [42.25%]114688 [45.50%]122880 [48.75%]131072 [52.00%]139264 [55.25%]147456 [58.50%]155648 [61.75%]163840 [65.00%]172032 [68.25%]180224 [71.50%]188416 [74.75%]196608 [78.00%]204800 [81.25%]212992 [84.50%]221184 [87.74%]229376 [90.99%]237568 [94.24%]245760 [97.49%]252076 [100.00%]
Traceback (most recent call last):
File "/hpc/home/idm7/miniconda3/envs/funannotate/bin/funannotate", line 8, in
Oh, that's my fault, I'll fix.
Can you just run funannotate test -t predict --cpus N
so it just runs the predict test.
latest commit should fix this error, thanks for reporting.
funannotate test -t predict --cpus 10
The test completes, but still only finds 175 BUSCO loci
#########################################################
Running funannotate predict
unit testing
Downloading: https://osf.io/te2pf/download?version=1 Bytes: 1489808
[Jul 02 02:46 PM]: OS: CentOS Stream 8, 46 cores, ~ 230 GB RAM. Python: 3.8.12
[Jul 02 02:46 PM]: Running funannotate v1.8.12
[Jul 02 02:46 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Jul 02 02:46 PM]: Skipping CodingQuarry as no --rna_bam passed
[Jul 02 02:46 PM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus pretrained
glimmerhmm busco
snap busco
[Jul 02 02:46 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Jul 02 02:46 PM]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
[Jul 02 02:46 PM]: Mapping 1,065 proteins to genome using diamond and exonerate
[Jul 02 02:46 PM]: Found 1,505 preliminary alignments with diamond in 0:00:01 --> generated FASTA files for exonerate in 0:00:00
[Jul 02 02:46 PM]: Exonerate finished in 0:00:21: found 1,270 alignments
[Jul 02 02:46 PM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Jul 02 02:51 PM]: 175 valid BUSCO predictions found, validating protein sequences
[Jul 02 02:52 PM]: 175 BUSCO predictions validated
[Jul 02 02:52 PM]: Running Augustus gene prediction using saccharomyces parameters
[Jul 02 02:53 PM]: 1,485 predictions from Augustus
[Jul 02 02:53 PM]: Pulling out high quality Augustus predictions
[Jul 02 02:53 PM]: Found 371 high quality predictions from Augustus (>90% exon evidence)
[Jul 02 02:53 PM]: Running SNAP gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[Jul 02 02:54 PM]: 1,519 predictions from SNAP
[Jul 02 02:54 PM]: Running GlimmerHMM gene prediction, using training data: annotate/predict_misc/busco.final.gff3
[Jul 02 02:55 PM]: 1,586 predictions from GlimmerHMM
[Jul 02 02:55 PM]: Summary of gene models passed to EVM (weights):
[Jul 02 02:55 PM]: EVM: partitioning input to ~ 35 genes per partition using min 1500 bp interval
[Jul 02 02:57 PM]: Converting to GFF3 and collecting all EVM results
Source Weight Count
Augustus 1 1325
Augustus HiQ 2 372
GlimmerHMM 1 1586
snap 1 1519
Total - 4802
[Jul 02 02:57 PM]: 1,683 total gene models from EVM
[Jul 02 02:57 PM]: Generating protein fasta files from 1,683 EVM models
[Jul 02 02:57 PM]: now filtering out bad gene models (< 50 aa in length, transposable elements, etc).
[Jul 02 02:57 PM]: Found 131 gene models to remove: 0 too short; 0 span gaps; 131 transposable elements
[Jul 02 02:57 PM]: 1,552 gene models remaining
[Jul 02 02:57 PM]: Predicting tRNAs
[Jul 02 02:57 PM]: 112 tRNAscan models are valid (non-overlapping)
[Jul 02 02:57 PM]: Generating GenBank tbl annotation file
[Jul 02 02:57 PM]: Collecting final annotation files for 1,664 total gene models
[Jul 02 02:57 PM]: Converting to final Genbank format
[Jul 02 02:57 PM]: Funannotate predict is finished, output files are in the annotate/predict_results folder
[Jul 02 02:57 PM]: Your next step might be functional annotation, suggested commands:
Run InterProScan (manual install): funannotate iprscan -i annotate -c 10
Run antiSMASH (optional): funannotate remote -i annotate -m antismash -e youremail@server.edu
Annotate Genome: funannotate annotate -i annotate --cpus 10 --sbt yourSBTfile.txt
[Jul 02 02:57 PM]: Training parameters file saved: annotate/predict_results/saccharomyces.parameters.json
[Jul 02 02:57 PM]: Add species parameters to database:
funannotate species -s saccharomyces -a annotate/predict_results/saccharomyces.parameters.json
CMD: funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --augustus_species saccharomyces --cpus 10 --species Awesome testicus
#########################################################
#########################################################
SUCCESS: funannotate predict test complete.
#########################################################
The error you commented on earlier,
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092603EH.faa.1): INFO [hmmersearch] Line 2: illegal character %
seems to be happening because Augustus is appending extra text to some of the protein sequences. See for example
">g1[CP022974.1:280548-281798] MVSSLPKESQAELQLFQNEINAANPSDFLQFSANYFNKRLEQQRAFLKAREPEFKAKNIVLFPEPEESFSRPQSAQSQSRSRSSVMFKSPFVNEDPHSNVFKSGFNLDPHEQDTHQQAQEEQQHTREKTSTPPLPMHFNAQRRTSVSGETLQPNNFDDWTPDHYKEKSEQQLQRLEKSIRNNFLFNKLDSDSKRLVINCLEEKSVPKGATIIKQGDQGDYFYVVEKGTVDFYVNDNKVNSSGPGSSFGELALMYNSPRAATVVATSDCLLWALDRLTFRKILLGSSFKKRLMYDDLLKSMPVLKSLTTYDRAKLADALDTKIYQPGETIIREGDQGENFYLIEYGAVDVSKKGQGVINKLKDHDYFGEVALLNDLPRQATVTATKRTKVATLGKSGFQRLLGPAVDVLKLNDPTRHEvidence%CDSCDS5'UTR3'UTRhintincompatibleRM:"
And in the logfile it lists Augustus v3.3.3?
So try a different version of Augustus. It needs to be less than v3.4. I thought 3.3.3 was fine but apparently not on your system. Most if not all of the versions on bioconda lately are not compiled properly and won't work with BUSCO. I don't know what changed in bioconda that it stopped working.
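As a rough illustration of the "less than v3.4" rule, the version string that funannotate prints in its log (for example `augustus version=3.3.2`) can be compared numerically. This is a sketch only; `parse_version` and `is_pre_3_4` are hypothetical helper names, and, as this thread shows with the 3.3.3 build, passing the check does not guarantee that a given build produces output BUSCO can parse.

```python
import re


def parse_version(text):
    """Turn a string such as '3.3.2' into a tuple of ints, e.g. (3, 3, 2)."""
    parts = []
    for piece in text.strip().split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break  # stop at the first non-numeric component
        parts.append(int(match.group(0)))
    return tuple(parts)


def is_pre_3_4(version_text):
    """True when the version sorts before 3.4 (the cutoff mentioned above)."""
    return parse_version(version_text) < (3, 4)
```

Comparing tuples of integers avoids the usual pitfall of comparing version strings lexically.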
I am working to set up an alternative version of Augustus now.
The error I am seeing with my existing Augustus 3.3.3 (compiled without bioconda) seems to be in how the /predicted_genes files (e.g., EOG09260A98.out.1) are being converted to /extracted_proteins files (e.g., EOG09260A98.faa.1). Do you happen to know if this is performed by Augustus or one of its associated scripts? If so, I will post an issue in the Augustus GitHub repository.
EDIT: Ok, looks like it is BUSCO that extracts the sequences.
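To see which extracted protein records carry stray characters like the `%` that hmmsearch complains about, a small scan over the FASTA text can help. The sketch below is an assumption-laden illustration: `find_illegal_characters` is a made-up name, and the `ALLOWED` set (standard amino acid letters plus common ambiguity codes and `*`) is a guess at what a protein parser will accept, not HMMER's actual rule.

```python
# Assumption: amino acid letters, ambiguity codes and the stop symbol are acceptable.
ALLOWED = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ*")


def find_illegal_characters(fasta_text):
    """Return (header, character, position) for each unexpected sequence character."""
    problems = []
    header = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            continue
        for position, char in enumerate(line):
            if char.upper() not in ALLOWED:
                problems.append((header, char, position))
    return problems
```

Note that the words appended in the example above are mostly made of valid amino acid letters, so only characters such as `%`, digits and apostrophes would be flagged by this kind of check.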
Yeah, BUSCO extracts the sequences, but the Augustus output format changed in 3.4, and I haven't had a chance to update that script. I want to remove the internal BUSCO and use the most recent conda package, however there are too many dependency conflicts and I cannot get a conda environment to build properly. So I haven't changed it yet. The Augustus proteinprofile issue, though, will continue to be an issue.
I still have no idea what the problem is with my Augustus 3.3.3, but I have solved the immediate issue by modifying funannotate-BUSCO2.py. I inserted two lines into the function _extract to account for the format of the predicted_genes .out files:
1606 elif line.startswith('# Evidence'):
1607 check = 0
Attaching the full edited function as well: _extract.py.txt
Thanks for your help thinking through this!
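The idea behind the two-line change can be sketched in isolation. The function below is not the `_extract` function from funannotate-BUSCO2.py; it is a simplified, hypothetical reader that assumes the protein is written as `# `-prefixed comment lines starting at `# protein sequence = [` and followed by a `# Evidence` block, which is consistent with the appended words seen in the garbled sequence above. The sample input is invented for illustration.

```python
def extract_protein(augustus_lines):
    """Collect a protein sequence from Augustus-style comment lines (simplified sketch)."""
    sequence = []
    collecting = False
    for line in augustus_lines:
        if line.startswith("# protein sequence"):
            collecting = True
            # Keep whatever follows the opening bracket on this line.
            sequence.append(line.partition("[")[2].strip().rstrip("]"))
        elif line.startswith("# Evidence"):
            # Mirrors the fix described above: stop before the evidence block.
            collecting = False
        elif collecting and line.startswith("# "):
            sequence.append(line[2:].strip().rstrip("]"))
    return "".join(sequence)


# Invented sample input, not copied from a real Augustus run.
sample = [
    "# protein sequence = [MVSSLPKESQ",
    "# AELQLFQNEI]",
    "# Evidence for and against this transcript:",
    "# % of transcript supported by hints (any source): 100",
    "# CDS exons: 1/1",
]
print(extract_protein(sample))
```

Without the `# Evidence` branch, the trailing comment lines would be appended to the sequence, which matches the failure mode reported in this issue.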
Great thanks. I will try to find some time to test if this works with other versions of Augustus.
Thanks @IanDMedeiros -- I'll incorporate your change above, as I think it will work and allow support of augustus v3.4 in the current codebase. I actually ended up re-writing busco as a simplified version of what funannotate uses, as we need a module that can be installed/solved with conda; BUSCOv5 won't work and has many new dependencies. It will be easier to maintain as a repo outside of funannotate. There are a few things left to do, but the repo is here: https://github.com/nextgenusfs/buscolite. If you have a chance to test with your augustus 3.3.3 version that would be helpful; you should be able to simply install it into the funannotate environment with pip.
FYI - augustus=3.5.0 is now available in bioconda https://github.com/bioconda/bioconda-recipes/pull/37364. Can we test that this is working too?
[Oct 10 05:55 PM]: OS: Ubuntu 20.04, 24 cores, ~ 231 GB RAM. Python: 3.8.19
[Oct 10 05:55 PM]: Running funannotate v1.8.17
[Oct 10 05:55 PM]: GeneMark not found and $GENEMARK_PATH environmental variable missing. Will skip GeneMark ab-initio prediction.
[Oct 10 05:55 PM]: Skipping CodingQuarry as no --rna_bam passed
[Oct 10 05:55 PM]: Parsed training data, run ab-initio gene predictors as follows:
Program Training-Method
augustus pretrained
glimmerhmm busco
snap busco
[Oct 10 05:56 PM]: Loading genome assembly and parsing soft-masked repetitive sequences
[Oct 10 05:56 PM]: Genome loaded: 32 scaffolds; 36,363,344 bp; 7.67% repeats masked
/home/user/anaconda3/envs/funannotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-p2g.py:14: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
from pkg_resources import parse_version
[Oct 10 05:56 PM]: Mapping 558,971 proteins to genome using diamond and exonerate
[Oct 10 05:58 PM]: Found 293,238 preliminary alignments with diamond in 0:01:50 --> generated FASTA files for exonerate in 0:00:28
Progress: 293238 complete, 0 failed, 0 remaining
[Oct 10 06:17 PM]: Exonerate finished in 0:18:49: found 1,877 alignments
[Oct 10 06:17 PM]: Running BUSCO to find conserved gene models for training ab-initio predictors
[Oct 10 06:17 PM]: 0 valid BUSCO predictions found, validating protein sequences
Traceback (most recent call last):
File "/home/user/anaconda3/envs/funannotate/bin/funannotate", line 10, in
Busco is failing to run in my case while running the predict command.
Are you using the latest release?
funannotate v1.8.11.
Describe the bug
funannotate test is failing at the predict step with the error "Not enough gene models 175 to train Augustus (200 required), exiting". This appears to be identical to the error in #552. I am also receiving similar errors with real data. The end of the #552 discussion suggested that the error might be related to GeneMark, but I am having trouble setting up GeneMark-ES, so wouldn't the program just run without it?
What command did you issue?
funannotate test -t all --cpus 10
What probably isn't the problem, based on what I have tried so far
Bad Augustus installation. I was getting an Augustus error even earlier in funannotate test, so I replaced the Augustus that was installed by mamba with one (v. 3.3.3) already available on our system.
AUGUSTUS_CONFIG_PATH permissions. Ran chmod 777 $AUGUSTUS_CONFIG_PATH/species and error did not go away.
Multithreading. Tried with --cpus 1 and 10 ... same error.
Logfiles
funannotate-predict.log
[06/29/22 21:19:42]: /hpc/home/idm7/miniconda3/envs/annotate/bin/funannotate predict -i test.softmasked.fa --protein_evidence protein.evidence.fasta -o annotate --cpus 10 --species Awesome busco
[06/29/22 21:19:42]: OS: CentOS Stream 8, 46 cores, ~ 230 GB RAM. Python: 3.8.12
[06/29/22 21:19:42]: Running funannotate v1.8.11
[06/29/22 21:19:42]: GeneMark path: /hpc/group/bio1/ian/envs/funannotate/gmes_petap
[06/29/22 21:19:42]: Full path to gmes_petap.pl: /hpc/group/bio1/ian/envs/funannotate/gmes_petap/gmes_petap.pl
[06/29/22 21:19:42]: GeneMark appears to be functional? False
[06/29/22 21:19:43]: {'augustus': 1, 'hiq': 2, 'genemark': 0, 'pasa': 6, 'codingquarry': 0, 'snap': 1, 'glimmerhmm': 1, 'proteins': 1, 'transcripts': 1}
[06/29/22 21:19:43]: Skipping CodingQuarry as no --rna_bam passed
[06/29/22 21:19:43]: {'augustus': 'busco', 'snap': 'busco', 'glimmerhmm': 'busco'}
[06/29/22 21:19:43]: Parsed training data, run ab-initio gene predictors as follows:
[06/29/22 21:19:44]: {'augustus': 1, 'hiq': 2, 'genemark': 0, 'pasa': 6, 'codingquarry': 0, 'snap': 1, 'glimmerhmm': 1, 'proteins': 1, 'transcripts': 1}
[06/29/22 21:19:45]: Loading genome assembly and parsing soft-masked repetitive sequences
[06/29/22 21:19:45]: Genome loaded: 6 scaffolds; 3,776,588 bp; 19.75% repeats masked
06/29/22 21:20:12: Running BUSCO to find conserved gene models for training ab-initio predictors
06/29/22 21:20:12: /hpc/home/idm7/miniconda3/envs/annotate/bin/python /hpc/home/idm7/miniconda3/envs/annotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/genome.softmasked.fa -m genome --lineage /hpc/group/bio1/ian/envs/funannotate_db/dikarya -o awesome_busco -c 10 --species anidulans -f --local_augustus /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/ab_initio_parameters/augustus
[06/29/22 21:25:12]: 175 valid BUSCO predictions found, validating protein sequences
[06/29/22 21:26:04]: 175 BUSCO predictions validated
[06/29/22 21:26:04]: Not enough gene models 175 to train Augustus (200 required), exiting
busco.log
INFO ** Start a BUSCO 2.0 analysis, current time: 06/29/2022 21:20:12 **
INFO The lineage dataset is: dikarya_odb9 (eukaryota)
INFO Mode is: genome
INFO Maximum number of regions limited to: 3
INFO To reproduce this run: python /hpc/home/idm7/miniconda3/envs/annotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/genome.softmasked.fa -o awesome_busco -l /hpc/group/bio1/ian/envs/funannotate_db/dikarya/ -m genome -c 10 -sp anidulans
INFO Check dependencies...
INFO Check input file...
INFO Temp directory is ./tmp/
INFO ** Phase 1 of 2, initial predictions **
INFO ** Step 1/3, current time: 06/29/2022 21:20:12 **
INFO Create blast database...
INFO [makeblastdb] Building a new DB, current time: 06/29/2022 21:20:12
INFO [makeblastdb] New DB name: /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/tmp/awesome_busco_4188679581
INFO [makeblastdb] New DB title: /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/genome.softmasked.fa
INFO [makeblastdb] Sequence type: Nucleotide
INFO [makeblastdb] Keep Linkouts: T
INFO [makeblastdb] Keep MBits: T
INFO [makeblastdb] Maximum file size: 1000000000B
INFO [makeblastdb] Adding sequences from FASTA; added 6 sequences in 0.0434968 seconds.
INFO Running tblastn, writing output to /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/blast_output/tblastn_awesome_busco.tsv...
INFO ** Step 2/3, current time: 06/29/2022 21:20:21 **
INFO Getting coordinates for candidate regions...
INFO Pre-Augustus scaffold extraction...
INFO Running Augustus prediction using anidulans as species:
INFO [augustus] Please find all logs related to Augustus here: /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/augustus.log
INFO 06/29/2022 21:20:21 => 0% of predictions performed (743 to be done)
INFO 06/29/2022 21:20:56 => 10% of predictions performed (75/743 candidate regions)
INFO 06/29/2022 21:21:24 => 20% of predictions performed (149/743 candidate regions)
INFO 06/29/2022 21:22:04 => 30% of predictions performed (223/743 candidate regions)
INFO 06/29/2022 21:22:39 => 40% of predictions performed (298/743 candidate regions)
INFO 06/29/2022 21:23:01 => 50% of predictions performed (372/743 candidate regions)
INFO 06/29/2022 21:23:21 => 60% of predictions performed (446/743 candidate regions)
INFO 06/29/2022 21:23:38 => 70% of predictions performed (521/743 candidate regions)
INFO 06/29/2022 21:23:55 => 80% of predictions performed (596/743 candidate regions)
INFO 06/29/2022 21:24:08 => 90% of predictions performed (669/743 candidate regions)
INFO 06/29/2022 21:24:20 => 100% of predictions performed
INFO Extracting predicted proteins...
INFO ** Step 3/3, current time: 06/29/2022 21:24:49 **
INFO Running HMMER to confirm orthology of predicted proteins:
INFO 06/29/2022 21:24:49 => 0% of predictions performed (602 to be done)
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092600SD.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092603EH.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092600T4.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092600X0.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG0926009O.faa.1):
INFO [hmmersearch] Line 2: illegal character %
INFO [hmmersearch] Parse failed (sequence file /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/augustus_output/extracted_proteins/EOG092602I6.faa.3):
INFO [hmmersearch] Line 2: illegal character %
<This goes on for many lines, apparently through all the BUSCO loci. Omitting here for space.>
INFO 06/29/2022 21:24:58 => 100% of predictions performed
INFO Results:
INFO C:13.5%[S:13.3%,D:0.2%],F:0.1%,M:86.4%,n:1312
INFO 177 Complete BUSCOs (C)
INFO 175 Complete and single-copy BUSCOs (S)
INFO 2 Complete and duplicated BUSCOs (D)
INFO 1 Fragmented BUSCOs (F)
INFO 1134 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched
INFO ** Phase 2 of 2, predictions using species specific training ** INFO ** Step 1/3, current time: 06/29/2022 21:25:00 ** INFO Extracting missing and fragmented buscos from the ancestral_variants file... WARNING The busco id(s) ['EOG0926457R', 'EOG09264XJC', 'EOG0926129I', 'EOG09265F2Y', 'EOG09262N3C', 'EOG092602UY', 'EOG09264R4M', 'EOG09264P3R', 'EOG09264HX6', 'EOG09260OCI', 'EOG09261J0P', 'EOG092610TN', 'EOG09264BOA', 'EOG09261DRB', 'EOG092608L0', 'EOG09261FAX', 'EOG09264X8J', 'EOG09262R8O', 'EOG09262K67', 'EOG09260K5F', 'EOG09263LP3', 'EOG09264S3E', 'EOG09262FJB', 'EOG09260VEY', 'EOG09260TLW', 'EOG092608WI', 'EOG09261OXV', 'EOG09264PE5', 'EOG09261JWS', 'EOG09260NNR', 'EOG09264RJL', 'EOG09262KJA', 'EOG09260VYK', 'EOG092641UM', 'EOG092644N1', 'EOG09262V3O', 'EOG09260XMI', 'EOG09263BDA', 'EOG09264I6B', 'EOG092635ST', 'EOG0926071Q', 'EOG09264PK5', 'EOG09263D2P', 'EOG09260VHV', 'EOG09265RGS', 'EOG092603YJ', 'EOG092621ZV', 'EOG09261801', 'EOG09260ZWU', 'EOG09260PI1', 'EOG092607QZ', 'EOG09262W7C', 'EOG09264V3H', 'EOG09261UWT', 'EOG09264881', 'EOG09263E49', 'EOG09265KQ4', 'EOG09260AZM', 'EOG09264T3S', 'EOG09261TEQ', 'EOG09265BJ3', 'EOG0926522L', 'EOG09262CDO', 'EOG09262H34', 'EOG09264J8E', 'EOG09265FL1', 'EOG0926431P', 'EOG09263M8W', 'EOG09265FTY', 'EOG09262Z2S', 'EOG09264719', 'EOG092625AX', 'EOG09265HEP', 'EOG092618J9', 'EOG09260RVQ', 'EOG09263K45', 'EOG09264R1U', 'EOG09261ICI', 'EOG09263RVR', 'EOG09260WG2', 'EOG09263QUM', 'EOG09264ZWF', 'EOG092646WF', 'EOG09261OLD', 'EOG09263W48', 'EOG092632TF', 'EOG09265552', 'EOG09261D4D', 'EOG09264SET', 'EOG092627XA', 'EOG09262JRP', 'EOG09261P7G', 'EOG09262GNE', 'EOG092636T6', 'EOG092625P1', 'EOG092641M3', 'EOG09262POL', 'EOG09264Z3D', 'EOG09260K29', 'EOG092659DX', 'EOG09264G1I', 'EOG09260289', 'EOG09264C3N', 'EOG09262387', 'EOG09264HU0', 'EOG09264W7W', 'EOG09263WM5', 'EOG092629FB', 'EOG09260KM4', 'EOG092604A0', 'EOG09260FZZ', 'EOG09260GKG', 'EOG09262MJW', 'EOG09260XSR', 'EOG092621S9', 'EOG09261IEV', 'EOG09262TEV', 
'EOG092641A6', 'EOG09263DQH', 'EOG09263YBT', 'EOG09263KVG', 'EOG092650VI', 'EOG092653O3', 'EOG09264441', 'EOG0926369X', 'EOG092643IE', 'EOG09261XZ6', 'EOG09264XUV', 'EOG092645OU', 'EOG09261I8J', 'EOG09263WWI', 'EOG09260NAN', 'EOG09260S2Z', 'EOG09264XYD', 'EOG0926484N', 'EOG09263FGN', 'EOG09260ETR', 'EOG0926506U', 'EOG09262KVB', 'EOG092605ZA', 'EOG0926248P', 'EOG092635DF', 'EOG092641K1', 'EOG0926315C', 'EOG092658QH', 'EOG09261JVS', 'EOG0926307V', 'EOG0926587S', 'EOG092604KQ', 'EOG09260J97', 'EOG09262HP3', 'EOG09264OQ8', 'EOG09263L7Y', 'EOG09261I0F', 'EOG09264ZDZ', 'EOG09262CXO', 'EOG09261I1I', 'EOG09261727', 'EOG09262BVA', 'EOG09265QTV', 'EOG092605VL', 'EOG09260KDB', 'EOG092617S2', 'EOG09262YP5', 'EOG0926407T', 'EOG092629RT', 'EOG092605OK', 'EOG09260EPS', 'EOG09265JNA', 'EOG09260DBG', 'EOG09260NZ8', 'EOG092621F2', 'EOG09261IOS', 'EOG0926539T', 'EOG09264W1U', 'EOG09260KNR', 'EOG09263PWF', 'EOG092610VI', 'EOG09264KDO', 'EOG09261G1Y', 'EOG09262IY3', 'EOG09261VD2', 'EOG09263KDI', 'EOG092658SK', 'EOG09265A08', 'EOG09263K05', 'EOG09263QPR', 'EOG092644WX', 'EOG092631ML', 'EOG09260KUC', 'EOG09262M0W', 'EOG092658NW', 'EOG09263XN3', 'EOG0926506Z', 'EOG09263U71', 'EOG09262TUR', 'EOG09265040', 'EOG092655IF', 'EOG09262E7I', 'EOG092641G3', 'EOG09261XNU', 'EOG09260EE7', 'EOG092645QN', 'EOG0926092K', 'EOG09263MR4', 'EOG09264XVU', 'EOG092610KH', 'EOG09261WJ8', 'EOG09261HZD', 'EOG09261SS1', 'EOG09261CQG', 'EOG0926273Q', 'EOG092619L1', 'EOG09265CCT', 'EOG09260KIY', 'EOG09262N5O', 'EOG092604ZZ', 'EOG09260R9L', 'EOG092654KW', 'EOG092615Y4', 'EOG09261CWO', 'EOG09260NXC', 'EOG09265G5K', 'EOG092612XD', 'EOG092605T6', 'EOG09261ZFN', 'EOG092620FM', 'EOG092646C6', 'EOG09264VC6', 'EOG092649VG', 'EOG09260LVD', 'EOG09265PWR', 'EOG09262PPU', 'EOG09262F22', 'EOG092615CC', 'EOG092616YZ', 'EOG09264RQY', 'EOG092616QN', 'EOG0926400M', 'EOG092648O6', 'EOG09264KO7', 'EOG09264NDD', 'EOG09262GWQ', 'EOG0926458I', 'EOG0926115V', 'EOG09265M98', 'EOG09260TVA', 'EOG09261RWJ', 'EOG09264A2D', 'EOG09260UA2', 
'EOG092634MM', 'EOG09265IT6', 'EOG09263760', 'EOG092642UD', 'EOG092609O9', 'EOG09265FTN', 'EOG09265EKJ', 'EOG0926534P', 'EOG09263KZJ', 'EOG09261DG0', 'EOG09260NHN', 'EOG09262OX9', 'EOG09261T98', 'EOG09260WCZ', 'EOG09262HKC', 'EOG09263F11', 'EOG09261G92', 'EOG09262U7S', 'EOG09264VZ7', 'EOG092602I6', 'EOG09262E4T', 'EOG09262WQX', 'EOG09265HP0', 'EOG09264SSI', 'EOG09260FMW', 'EOG092612AK', 'EOG092600SD', 'EOG09261ACJ', 'EOG09260ZG2', 'EOG09263Y3L', 'EOG09261NLY', 'EOG092655SO', 'EOG092609RF', 'EOG09263CAC', 'EOG09261ABB', 'EOG09264272', 'EOG092651BA', 'EOG09265L8N', 'EOG09261OSU', 'EOG09262MEK', 'EOG09263UN3', 'EOG09260DP1', 'EOG09261AMX', 'EOG09262UAS', 'EOG09262SI7', 'EOG09263KRO', 'EOG09261TPN', 'EOG09260T4S', 'EOG092610QT', 'EOG09262X7T', 'EOG092629ZN', 'EOG092634B1', 'EOG092620EL', 'EOG0926009O', 'EOG09264G0H', 'EOG09262528', 'EOG09260QNB', 'EOG09261EM7', 'EOG092617RY', 'EOG092646CB', 'EOG09261O4Y', 'EOG09263G4R', 'EOG0926248W', 'EOG09260T28', 'EOG092624KK', 'EOG09263OD3', 'EOG09261Q18', 'EOG092658WY', 'EOG09265GSM', 'EOG09265B95', 'EOG092604I8', 'EOG09264FXB', 'EOG09264ZQC', 'EOG09264PI4', 'EOG09262VPD', 'EOG09262QS5', 'EOG09261ZPW', 'EOG09263ZBF', 'EOG09262YAU', 'EOG09262SMG', 'EOG092608ZS', 'EOG0926229Z', 'EOG09261YRA', 'EOG09263EQZ', 'EOG09260TWS', 'EOG09265OQH', 'EOG09263720', 'EOG092653NM', 'EOG09260AZK', 'EOG09261AH9', 'EOG09265B1X', 'EOG09263817', 'EOG0926112A', 'EOG092601KZ', 'EOG09264X31', 'EOG09264398', 'EOG09261N2L', 'EOG09262LI4', 'EOG0926074Y', 'EOG09260FPA', 'EOG09264MGU', 'EOG092626EQ', 'EOG09264U81', 'EOG09265FCK', 'EOG09260BFE', 'EOG09264CA0', 'EOG092603EH', 'EOG092653VU', 'EOG09262NB1', 'EOG092619MJ', 'EOG09260CKC', 'EOG09261DHR', 'EOG09262TO9', 'EOG092625U6', 'EOG09263MGE', 'EOG09264PDD', 'EOG09263IMF', 'EOG092648K5', 'EOG092602MO', 'EOG09263C55', 'EOG09260EZT', 'EOG09264NC7', 'EOG09262JAT', 'EOG09260E8K', 'EOG0926133I', 'EOG092612CC', 'EOG092600SK', 'EOG092648LP', 'EOG09260VTN', 'EOG092648VW', 'EOG09264O6J', 'EOG0926514P', 'EOG09263W7L', 
'EOG09262LYR', 'EOG09265PQX', 'EOG09263QH4', 'EOG09260DXP', 'EOG09260WU6', 'EOG09263NE7', 'EOG09265G9U', 'EOG0926388H', 'EOG0926425H', 'EOG09264HTG', 'EOG09260EAZ', 'EOG0926357F', 'EOG09262JWJ', 'EOG092608RH', 'EOG092629WA', 'EOG092657UN', 'EOG09265PUI', 'EOG0926419M', 'EOG09264JHE', 'EOG09263OQH', 'EOG092638CT', 'EOG09262CBI', 'EOG09262X01', 'EOG092640BS', 'EOG09264DY4', 'EOG09264Y0W', 'EOG092619VG', 'EOG092651FJ', 'EOG09261LPY', 'EOG09261OXD', 'EOG09262ESR', 'EOG0926251E', 'EOG0926310O', 'EOG09264T8I', 'EOG092602FH', 'EOG092607OQ', 'EOG09265NHW', 'EOG09264331', 'EOG09261666', 'EOG09260LRX', 'EOG09260A27', 'EOG09262N10', 'EOG09261B18', 'EOG09260SAH', 'EOG09260ERO', 'EOG09261Y04', 'EOG09261EU7', 'EOG09263EVJ', 'EOG09263MEM', 'EOG09260274', 'EOG09264OYZ', 'EOG09264DT4', 'EOG09263OZR', 'EOG09261W90', 'EOG0926347W', 'EOG09264NEF', 'EOG09264LC7', 'EOG09263FR7', 'EOG09260AQB', 'EOG0926306O', 'EOG09260QVP', 'EOG09261JUE', 'EOG09261I1G', 'EOG09264XOZ', 'EOG09260SIZ', 'EOG09264LBC', 'EOG09262V8E', 'EOG09262GXD', 'EOG09263C4C', 'EOG09260RRC', 'EOG092640WA', 'EOG09263A5D', 'EOG09265313', 'EOG092632WW', 'EOG09263U08', 'EOG09265SHM', 'EOG09260SL3', 'EOG092619GP', 'EOG09263690', 'EOG09263ULA', 'EOG09264RIE', 'EOG09262CMP', 'EOG0926073O', 'EOG09264NJ1', 'EOG09263OAE', 'EOG09263BE5', 'EOG09260RS7', 'EOG09260NY2', 'EOG09261O7R', 'EOG092653YS', 'EOG092657YR', 'EOG09260WUA', 'EOG09262JZW', 'EOG09263LNF', 'EOG09264THP', 'EOG09260Z3X', 'EOG0926115P', 'EOG09261WVT', 'EOG09262E4Q', 'EOG09265I60', 'EOG09262DUV', 'EOG09261C0G', 'EOG09261XNJ', 'EOG092658X5', 'EOG092658CI', 'EOG09263A3Y', 'EOG09263IQ5', 'EOG092654LJ', 'EOG09260KGS', 'EOG09262MXH', 'EOG092611HB', 'EOG09263J6Z', 'EOG09260BRA', 'EOG09264903', 'EOG09262GVX', 'EOG09263R4M', 'EOG09264IIZ', 'EOG09262NNS', 'EOG092606AD', 'EOG09263ZW6', 'EOG09263JTO', 'EOG092651K1', 'EOG09263Q8J', 'EOG09261X9E', 'EOG09262PAY', 'EOG09262CUO', 'EOG09261B3Q', 'EOG09263L9T', 'EOG09260W9L', 'EOG09263X1F', 'EOG09263YFX', 'EOG09260DUR', 'EOG09261DW8', 
'EOG092654VM', 'EOG09260NJW', 'EOG09260JTZ', 'EOG09263YFH', 'EOG09260JED', 'EOG092613QA', 'EOG09263KB4', 'EOG09262GLP', 'EOG09265GGX', 'EOG092625OH', 'EOG09265KSE', 'EOG09262FE3', 'EOG09264I14', 'EOG09264L0C', 'EOG09263E5F', 'EOG0926448Q', 'EOG09264KIV', 'EOG092645G0', 'EOG09261JCQ', 'EOG09265FI4', 'EOG09265KPR', 'EOG09260GIX', 'EOG09264904', 'EOG09260P2K', 'EOG09262WXK', 'EOG09264COX', 'EOG09260SRF', 'EOG09265IBC', 'EOG09264I9J', 'EOG092656JA', 'EOG0926213Z', 'EOG092635YY', 'EOG09264AWW', 'EOG09264V2U', 'EOG092645L9', 'EOG092624SJ', 'EOG09260075', 'EOG09260AZA', 'EOG092654XA', 'EOG092620CR', 'EOG09263X0V', 'EOG092655M5', 'EOG092648Q0', 'EOG09260R84', 'EOG092626HU', 'EOG09263XVS', 'EOG092600NM', 'EOG092659OC', 'EOG09263BG5', 'EOG09264T0J', 'EOG09263GSR', 'EOG092652YI', 'EOG092654QZ', 'EOG09264ZXA', 'EOG09263SFX', 'EOG09262XMN', 'EOG09262645', 'EOG09264CP4', 'EOG092600S9', 'EOG09264W46', 'EOG09262XGK', 'EOG09263E9V', 'EOG09262UTQ', 'EOG09264NNY', 'EOG09265KNL', 'EOG09265PJ3', 'EOG09260HS3', 'EOG092605KN', 'EOG092634B5', 'EOG09263HD8', 'EOG0926142Y', 'EOG09261YLQ', 'EOG09262WHU', 'EOG09265E6R', 'EOG09261660', 'EOG092619RJ', 'EOG09264XF3', 'EOG09263E87', 'EOG092649XV', 'EOG09264WF4', 'EOG09261HQU', 'EOG09261MPU', 'EOG09260V8Q', 'EOG09265G5C', 'EOG0926049A', 'EOG092643Y5', 'EOG0926079Q', 'EOG09262DPL', 'EOG092621GA', 'EOG09262XRU', 'EOG09263PXH', 'EOG092624X0', 'EOG092652TN', 'EOG09260OE9', 'EOG09261NW2', 'EOG092653KS', 'EOG09260KNI', 'EOG09265BE5', 'EOG09264G7L', 'EOG09261F73', 'EOG09264LJU', 'EOG092639H5', 'EOG09264RW6', 'EOG092620E4', 'EOG09263GIG', 'EOG09260OLB', 'EOG09263JW5', 'EOG092620U5', 'EOG09262QTY', 'EOG092606CY', 'EOG09264OBA', 'EOG092653SU', 'EOG092643VB', 'EOG09260N2T', 'EOG092608AE', 'EOG0926499W', 'EOG0926049S', 'EOG09262D4G', 'EOG09264YEG', 'EOG09265JVH', 'EOG09265BTC', 'EOG092644O2', 'EOG09263CUQ', 'EOG0926004Z', 'EOG09261127', 'EOG09262QJW', 'EOG09263RF8', 'EOG09264P4J', 'EOG09265DRM', 'EOG09260JT9', 'EOG09260A98', 'EOG09265DWT', 'EOG092615SM', 
'EOG09264873', 'EOG09263CGP', 'EOG09263L52', 'EOG0926195C', 'EOG09260OQ8', 'EOG092602OP', 'EOG09262A65', 'EOG09261OIA', 'EOG09260GR8', 'EOG092646EZ', 'EOG09260W52', 'EOG092600T4', 'EOG09260B3H', 'EOG09264XM2', 'EOG092644ZU', 'EOG09264R0D', 'EOG09261B14', 'EOG09260J53', 'EOG092647L7', 'EOG092632FC', 'EOG09261476', 'EOG09261S7X', 'EOG09262BCZ', 'EOG09264YIJ', 'EOG0926386D', 'EOG092620IA', 'EOG0926384F', 'EOG092640IZ', 'EOG0926436T', 'EOG09264V30', 'EOG09262D0D', 'EOG092624RX', 'EOG09264G4X', 'EOG09263SZM', 'EOG09260RRN', 'EOG09263AZP', 'EOG09261FAB', 'EOG092646VF', 'EOG09263HED', 'EOG09261W2O', 'EOG09264CND', 'EOG0926390Q', 'EOG09261K9J', 'EOG09264BIX', 'EOG092617AN', 'EOG09260JJW', 'EOG09262ZZ8', 'EOG09264IQ7', 'EOG092644Z6', 'EOG09261JR0', 'EOG092605FC', 'EOG09263CLY', 'EOG092643NE', 'EOG092652Y6', 'EOG0926213Q', 'EOG092610ZY', 'EOG09264IDN', 'EOG092643JW', 'EOG09260JNY', 'EOG0926477X', 'EOG09265GYD', 'EOG092605QM', 'EOG09262KXK', 'EOG09263J3H', 'EOG09264HOY', 'EOG092617RN', 'EOG09261CXQ', 'EOG09262E3Q', 'EOG092614UB', 'EOG092652NQ', 'EOG092627F1', 'EOG09263S2P', 'EOG092612MY', 'EOG09262SWJ', 'EOG09264W71', 'EOG09264PD5', 'EOG09261Q5L', 'EOG09260KWP', 'EOG09260RGH', 'EOG092634G9', 'EOG09263RDD', 'EOG09264B74', 'EOG09261H5E', 'EOG09262YQG', 'EOG09260S5R', 'EOG09261UMG', 'EOG09264TQ5', 'EOG092603JK', 'EOG09264F1U', 'EOG09260FL2', 'EOG09261S0S', 'EOG09263DFA', 'EOG0926077L', 'EOG09260LI6', 'EOG09263IT0', 'EOG09260W8D', 'EOG09264IG5', 'EOG09261EMF', 'EOG09262A8G', 'EOG09260Y2Q', 'EOG09265D7J', 'EOG09260Z5E', 'EOG09261031', 'EOG092621YU', 'EOG092630MJ', 'EOG092634M1', 'EOG09263FAK', 'EOG09261N64', 'EOG09260NE0', 'EOG09262X74', 'EOG09260BSW', 'EOG092609YT', 'EOG09263X4B', 'EOG09264CP8', 'EOG092638XA', 'EOG09264OXC', 'EOG092658WS', 'EOG09260S3L', 'EOG09262H50', 'EOG092608AV', 'EOG09263RY2', 'EOG092631QQ', 'EOG09260TDY', 'EOG09262L1P', 'EOG092656RK', 'EOG09263Z41', 'EOG092636Y6', 'EOG09264DMU', 'EOG09261NLR', 'EOG09262X8R', 'EOG09264SUZ', 'EOG092657H8', 'EOG09264OM7', 
'EOG0926591L', 'EOG09260FFP', 'EOG09263H5H', 'EOG09264B3O', 'EOG09264JW6', 'EOG09261YV6', 'EOG09262E7W', 'EOG09261A3K', 'EOG092646PE', 'EOG09260B3X', 'EOG09260JPV', 'EOG0926312D', 'EOG09260C2V', 'EOG09264VRO', 'EOG092628LW', 'EOG09264O2D', 'EOG09260LJ8', 'EOG092609JB', 'EOG09262HA6', 'EOG09264DOU', 'EOG09260U6R', 'EOG09261PJZ', 'EOG09260EPQ', 'EOG09261UPM', 'EOG092649QJ', 'EOG09261DJC', 'EOG09260JO5', 'EOG09263GUT', 'EOG09264GMT', 'EOG09264ENO', 'EOG09263WB5', 'EOG09260K4V', 'EOG09261VI3', 'EOG09260K24', 'EOG09260XOG', 'EOG09265BTA', 'EOG092628HC', 'EOG09264DYD', 'EOG09262H1X', 'EOG09264DJ8', 'EOG09264XKX', 'EOG092600W1', 'EOG09262VOC', 'EOG09261LV7', 'EOG092652KR', 'EOG09261VUC', 'EOG09262KZ3', 'EOG092631UM', 'EOG09262IZO', 'EOG09262UB3', 'EOG09262ILV', 'EOG09261MOX', 'EOG09263X22', 'EOG09263FKC', 'EOG09260WGT', 'EOG09264X41', 'EOG09263KEE', 'EOG09264JQ1', 'EOG09260EQD', 'EOG09261V9P', 'EOG092656IY', 'EOG09265CQO', 'EOG09264XTW', 'EOG09260H81', 'EOG09263ZSC', 'EOG092655L0', 'EOG092648XW', 'EOG09264ZDJ', 'EOG09264LH2', 'EOG09263FTE', 'EOG09265A4E', 'EOG092604ML', 'EOG092615IE', 'EOG09262914', 'EOG09263HYJ', 'EOG09262Q8D', 'EOG09263JFQ', 'EOG09264C3V', 'EOG092612LP', 'EOG09260NC6', 'EOG09264E6Z', 'EOG09264V51', 'EOG092643QM', 'EOG09262Z0M', 'EOG09265AT5', 'EOG09264MZ1', 'EOG092606AJ', 'EOG09264829', 'EOG09264SQJ', 'EOG0926131E', 'EOG09260MRU', 'EOG09263274', 'EOG0926423H', 'EOG09262B1U', 'EOG09260VCG', 'EOG09264D7Y', 'EOG09264L6D', 'EOG09264BJC', 'EOG09262MFL', 'EOG09263OWL', 'EOG092645U1', 'EOG09260NWN', 'EOG0926025H', 'EOG09262DC1', 'EOG09262K8C', 'EOG092611G7', 'EOG09260LBU', 'EOG09262JZK', 'EOG092648U2', 'EOG09264OBO', 'EOG09263EBB', 'EOG09262IZ6', 'EOG09264G1F', 'EOG09260XQV', 'EOG09262QRH', 'EOG09264A8D', 'EOG09261B6Y', 'EOG09265DDU', 'EOG09265GXF', 'EOG09260SJV', 'EOG09265E8A', 'EOG09265H9T', 'EOG09261M78', 'EOG09265JA7', 'EOG0926137U', 'EOG09261FM4', 'EOG09261QXC', 'EOG09264UJF', 'EOG09264XT5', 'EOG09261MMR', 'EOG09262PMC', 'EOG09262HQM', 'EOG09263JZO', 
'EOG09265ANI', 'EOG09261QR8', 'EOG09263XZN', 'EOG09260EOI', 'EOG09262N0U', 'EOG09262PIH', 'EOG092652ZZ', 'EOG09262A6N', 'EOG092642I5', 'EOG09260VTA', 'EOG09260E2O', 'EOG092600T9', 'EOG09265K60', 'EOG09263LR1', 'EOG092614E6', 'EOG09262M2J', 'EOG0926140Q', 'EOG092621CK', 'EOG092619EA', 'EOG09264LKR', 'EOG092621CP', 'EOG09262M7B', 'EOG09260779', 'EOG09262N47', 'EOG09265RGI', 'EOG09260XL0', 'EOG09261ZJR', 'EOG09261IEH', 'EOG09265C25', 'EOG09264RBX', 'EOG09263EDC', 'EOG092629U5', 'EOG09265822', 'EOG09265BBJ', 'EOG09261404', 'EOG09263RW3', 'EOG09260375', 'EOG09261LEU', 'EOG09262341', 'EOG092631IR', 'EOG092618M2', 'EOG09264U78', 'EOG09263CG4', 'EOG092645LS', 'EOG09262VVF', 'EOG09261XAF', 'EOG09261N20', 'EOG09260UXC', 'EOG09265HPJ', 'EOG09261V2P', 'EOG09262516', 'EOG09260LS9', 'EOG09260KCB', 'EOG09260V5Q', 'EOG09260FKU', 'EOG09264OML', 'EOG09262V9N', 'EOG09265CN0', 'EOG09260HPO', 'EOG092644DY', 'EOG09260XH5', 'EOG09262F7P', 'EOG09263A64', 'EOG0926505R', 'EOG09264RJ7', 'EOG09261QR5', 'EOG09263ZBJ', 'EOG09261PUF', 'EOG09262SR7', 'EOG09260RNZ', 'EOG09260WYQ', 'EOG09260BYL', 'EOG092644X6', 'EOG09263G3M', 'EOG0926510L', 'EOG092606WZ', 'EOG092638EN', 'EOG0926489S', 'EOG09264JK1', 'EOG092608T8', 'EOG09264RR2', 'EOG09264O9F', 'EOG092600X0', 'EOG09264EGS', 'EOG092634ZL', 'EOG09264ABT', 'EOG09264FYY', 'EOG09260M87', 'EOG09264B2P', 'EOG092630YS', 'EOG09264HN5', 'EOG092644TW', 'EOG09262G8Y', 'EOG09263MNN', 'EOG09263BGW', 'EOG09261ONU', 'EOG09264L06', 'EOG09261KHB', 'EOG092603D0', 'EOG09262J7K', 'EOG09261M4S', 'EOG09262WSH', 'EOG09263Z8I', 'EOG09265EOF', 'EOG092604MJ', 'EOG09261V03', 'EOG09260HSP', 'EOG092608WU', 'EOG09260NHB', 'EOG092638RC', 'EOG0926158Y', 'EOG09260GF5', 'EOG09263OCO', 'EOG09260AEC', 'EOG092610LQ', 'EOG092603SM', 'EOG09260DFH', 'EOG092624UF', 'EOG09265I7S', 'EOG092614DJ', 'EOG09263CG2', 'EOG09263I7I', 'EOG09260N53', 'EOG09264BWL', 'EOG09260J8F', 'EOG09264T1U', 'EOG09265KF9', 'EOG09264KTU', 'EOG092613R2', 'EOG09264TJN', 'EOG0926354S', 'EOG0926420U', 'EOG09261ZXZ', 
'EOG092654O3', 'EOG09265FGA', 'EOG09260MBW', 'EOG09264CST', 'EOG092619NK', 'EOG09263WZ2', 'EOG09264UD3', 'EOG092650I8', 'EOG09261FH7', 'EOG09261225', 'EOG09264KK7', 'EOG09264YKY', 'EOG09262P1W', 'EOG09262C5Z', 'EOG09262KUJ', 'EOG09264L3W', 'EOG09264G04', 'EOG09260WUS', 'EOG09264XPY', 'EOG09264FVQ', 'EOG09260DUW', 'EOG092653LT', 'EOG09265LEG', 'EOG092656CM', 'EOG09264Z1B', 'EOG09263TQ5', 'EOG092658ZO', 'EOG09260OM6', 'EOG092635SS', 'EOG09261KRX', 'EOG092647CM', 'EOG09265BG5', 'EOG09264IKZ', 'EOG09261UOJ', 'EOG09263UWJ', 'EOG09260HMA', 'EOG09262Q6S', 'EOG09260APA', 'EOG09264MN3', 'EOG09265591', 'EOG09265ER6', 'EOG09262I0R', 'EOG09260931', 'EOG092633QB', 'EOG09261RFF', 'EOG092603KJ', 'EOG09262BHE', 'EOG09262IP2', 'EOG09264DIM', 'EOG09262E98', 'EOG092649VA', 'EOG09264YHS', 'EOG09260PJ9', 'EOG092628SP', 'EOG09264K2W', 'EOG09264IV9', 'EOG09261W1K', 'EOG09260JDM', 'EOG09260H6E', 'EOG092613UB', 'EOG09264P74', 'EOG09261V87', 'EOG092624JL', 'EOG09262O0R', 'EOG09262PZ9', 'EOG09264GQZ', 'EOG09261HB6', 'EOG09264IOS', 'EOG09262MOO', 'EOG09261CM0', 'EOG09265K4K', 'EOG09265AL8', 'EOG09261EY9', 'EOG092651HW', 'EOG09260B65', 'EOG092629WJ', 'EOG092628FW', 'EOG09260OZU', 'EOG09261OKK', 'EOG092604S1', 'EOG092631MU', 'EOG09264USX', 'EOG09260TPT', 'EOG09261G4Z', 'EOG09261RWU', 'EOG09262LQT', 'EOG092605VU'] were not found in the ancestral_variants file INFO Running tblastn, writing output to /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/blast_output/tblastn_awesome_busco_missing_and_frag_rerun.tsv... INFO [tblastn] Warning: [tblastn] Query is Empty! INFO Getting coordinates for candidate regions... INFO ** Step 2/3, current time: 06/29/2022 21:25:01 ** INFO Training Augustus using Single-Copy Complete BUSCOs: INFO 06/29/2022 21:25:01 => Converting predicted genes to short genbank files... INFO 06/29/2022 21:25:07 => All files converted to short genbank files, now running the training scripts... 
INFO Pre-Augustus scaffold extraction...
INFO Re-running Augustus with the new metaparameters, number of target BUSCOs: 1135
INFO 06/29/2022 21:25:09 => 0% of predictions performed (0 to be done)
INFO 06/29/2022 21:25:09 => 100% of predictions performed
INFO Extracting predicted proteins...
INFO ** Step 3/3, current time: 06/29/2022 21:25:09 **
INFO Running HMMER to confirm orthology of predicted proteins:
INFO 06/29/2022 21:25:09 => 0% of predictions performed (0 to be done)
INFO 06/29/2022 21:25:09 => 100% of predictions performed
INFO Results:
INFO C:13.5%[S:13.3%,D:0.2%],F:0.1%,M:86.4%,n:1312
INFO 177 Complete BUSCOs (C)
INFO 175 Complete and single-copy BUSCOs (S)
INFO 2 Complete and duplicated BUSCOs (D)
INFO 1 Fragmented BUSCOs (F)
INFO 1134 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched
INFO BUSCO analysis done with WARNING(s). Total running time: 300.09778451919556 seconds
INFO Results written in /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco/run_awesome_busco/
INFO ** Start a BUSCO 2.0 analysis, current time: 06/29/2022 21:25:30 **
INFO The lineage dataset is: dikarya_odb9 (eukaryota)
INFO Mode is: proteins
INFO To reproduce this run: python /hpc/home/idm7/miniconda3/envs/annotate/lib/python3.8/site-packages/funannotate/aux_scripts/funannotate-BUSCO2.py -i /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco_augustus.proteins.fasta -o awesome_busco -l /hpc/group/bio1/ian/envs/funannotate_db/dikarya/ -m proteins -c 10 -sp anidulans
INFO Check dependencies...
INFO Check input file...
INFO Temp directory is ./tmp/
INFO Running HMMER on the proteins:
INFO 06/29/2022 21:25:30 => 0% of predictions performed (1312 to be done)
INFO 06/29/2022 21:25:32 => 10% of predictions performed (134/1312 candidate proteins)
INFO 06/29/2022 21:25:33 => 20% of predictions performed (263/1312 candidate proteins)
INFO 06/29/2022 21:25:35 => 30% of predictions performed (396/1312 candidate proteins)
INFO 06/29/2022 21:25:38 => 40% of predictions performed (525/1312 candidate proteins)
INFO 06/29/2022 21:25:41 => 50% of predictions performed (659/1312 candidate proteins)
INFO 06/29/2022 21:25:44 => 60% of predictions performed (791/1312 candidate proteins)
INFO 06/29/2022 21:25:47 => 70% of predictions performed (922/1312 candidate proteins)
INFO 06/29/2022 21:25:52 => 80% of predictions performed (1054/1312 candidate proteins)
INFO 06/29/2022 21:25:56 => 90% of predictions performed (1181/1312 candidate proteins)
INFO 06/29/2022 21:26:00 => 100% of predictions performed
INFO Results:
INFO C:13.3%[S:13.3%,D:0.0%],F:0.0%,M:86.7%,n:1312
INFO 175 Complete BUSCOs (C)
INFO 175 Complete and single-copy BUSCOs (S)
INFO 0 Complete and duplicated BUSCOs (D)
INFO 0 Fragmented BUSCOs (F)
INFO 1137 Missing BUSCOs (M)
INFO 1312 Total BUSCO groups searched
INFO BUSCO analysis done. Total running time: 33.35695195198059 seconds
INFO Results written in /hpc/group/bio1/ian/envs/annotate/test-busco_b548c447-5a92-4bca-b6c4-d116e6f7177e/annotate/predict_misc/busco_proteins/run_awesome_busco/
`
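For anyone comparing the two BUSCO runs in the log above (genome mode vs. proteins mode), here is a minimal sketch that pulls the numbers out of the one-line summary. The function name `parse_busco_summary` is hypothetical and not part of BUSCO or funannotate; it only assumes the summary keeps the `C:..%[S:..%,D:..%],F:..%,M:..%,n:..` layout shown in this log.

```python
import re

# Matches the one-line BUSCO summary as it appears in the log above, e.g.
# C:13.5%[S:13.3%,D:0.2%],F:0.1%,M:86.4%,n:1312
_SUMMARY = re.compile(
    r"C:(?P<C>[\d.]+)%\[S:(?P<S>[\d.]+)%,D:(?P<D>[\d.]+)%\],"
    r"F:(?P<F>[\d.]+)%,M:(?P<M>[\d.]+)%,n:(?P<n>\d+)"
)


def parse_busco_summary(text):
    """Return the first BUSCO summary found in `text` as a dict, or None.

    Hypothetical helper for illustration; percentages are returned as
    floats and the total number of BUSCO groups (n) as an int.
    """
    match = _SUMMARY.search(text)
    if match is None:
        return None
    result = {key: float(value) for key, value in match.groupdict().items() if key != "n"}
    result["n"] = int(match.group("n"))
    return result


if __name__ == "__main__":
    genome_run = parse_busco_summary("INFO C:13.5%[S:13.3%,D:0.2%],F:0.1%,M:86.4%,n:1312")
    protein_run = parse_busco_summary("INFO C:13.3%[S:13.3%,D:0.0%],F:0.0%,M:86.7%,n:1312")
    print(genome_run)
    print(protein_run)
```

Both runs report more than 86% of the 1312 groups as missing, so the low completeness shows up in the genome-mode run and in the proteins-mode run alike.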
OS/Install Information
Installed using mamba on a computing cluster running Red Hat Enterprise Linux 8.
`-------------------------------------------------------
Checking dependencies for 1.8.11
You are running Python v 3.8.12. Now checking python packages...
biopython: 1.77
goatools: 1.2.3
matplotlib: 3.4.3
natsort: 8.1.0
numpy: 1.23.0
pandas: 1.4.3
psutil: 5.9.1
requests: 2.28.1
scikit-learn: 1.1.1
scipy: 1.8.1
seaborn: 0.11.2
All 11 python packages installed
You are running Perl v b'5.026002'. Now checking perl modules...
Carp: 1.38
Clone: 0.42
DBD::SQLite: 1.64
DBD::mysql: 4.046
DBI: 1.642
DB_File: 1.855
Data::Dumper: 2.173
File::Basename: 2.85
File::Which: 1.23
Getopt::Long: 2.5
Hash::Merge: 0.300
JSON: 4.02
LWP::UserAgent: 6.39
Logger::Simple: 2.0
POSIX: 1.76
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.12
Tie::File: 1.02
URI::Escape: 3.31
YAML: 1.29
local::lib: 2.000024
threads: 2.15
threads::shared: 1.56
All 27 Perl modules installed
Checking Environmental Variables...
$FUNANNOTATE_DB=/hpc/group/bio1/ian/envs/funannotate_db
$PASAHOME=/hpc/home/idm7/miniconda3/envs/annotate/opt/pasa-2.5.2
$TRINITY_HOME=/hpc/home/idm7/miniconda3/envs/annotate/opt/trinity-2.8.5
$EVM_HOME=/hpc/home/idm7/miniconda3/envs/annotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/hpc/home/idm7/miniconda3/envs/annotate/config/
$GENEMARK_PATH=/hpc/group/bio1/ian/envs/funannotate/gmes_petap
All 6 environmental variables are set
Checking external dependencies...
PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.3.3
bamtools: bamtools 2.5.1
bedtools: bedtools v2.30.0
blat: BLAT v35
diamond: 2.0.15
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: no way to determine
glimmerhmm: 3.0.4
gmap: 2021-08-25
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.1-internal
kallisto: 0.46.1
mafft: v7.505 (2022/Apr/10)
makeblastdb: makeblastdb 2.2.31+
minimap2: 2.24-r1122
pigz: pigz 2.7
proteinortho: 6.1.0
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.12
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.9 (July 2021)
tantan: tantan 39
tbl2asn: no way to determine, likely 25.X
tblastn: tblastn 2.2.31+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: emapper.py not installed
ERROR: gmes_petap.pl not installed
ERROR: signalp not installed`
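To make the missing tools in the check output above easier to spot, here is a small sketch that lists everything reported as `ERROR: <tool> not installed`. The helper name `missing_tools` is hypothetical and not part of funannotate; it only assumes the error lines keep the wording shown in this output.

```python
import re


def missing_tools(check_output):
    """Return the tool names reported as 'ERROR: <tool> not installed'.

    Hypothetical helper for illustration; works whether the output is
    one entry per line or flattened into a single line.
    """
    return re.findall(r"ERROR: (\S+) not installed", check_output)


if __name__ == "__main__":
    sample = (
        "trimmomatic: 0.39\n"
        "ERROR: emapper.py not installed\n"
        "ERROR: gmes_petap.pl not installed\n"
        "ERROR: signalp not installed\n"
    )
    print(missing_tools(sample))
```

One thing this surfaces: `gmes_petap.pl` is reported as not installed even though `$GENEMARK_PATH` is set in the same output.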