nextgenusfs / funannotate

Eukaryotic Genome Annotation Pipeline
http://funannotate.readthedocs.io
BSD 2-Clause "Simplified" License

funannotate annotate gb2parts() #951

Open SolayMane opened 10 months ago

SolayMane commented 10 months ago

Are you using the latest release? Yes

Describe the bug: I ran funannotate annotate with the updated version on a funannotate gbk file.

What command did you issue?

funannotate annotate --genbank /sanhome2/data/FOA/Compare/gbk/Fusarium_oxysporum_f.sp._albedinis.gbk --cpus 56 -o test1

Logfiles

[Aug 22 01:20 PM]: OS: Ubuntu 14.04, 56 cores, ~ 396 GB RAM. Python: 3.8.15
[Aug 22 01:20 PM]: Running 1.8.15
[Aug 22 01:20 PM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Aug 22 01:20 PM]: Found existing output directory test1. Warning, will re-use any intermediate files found.
[Aug 22 01:20 PM]: Checking GenBank file for annotation
Traceback (most recent call last):
  File "/home/inra/miniconda2/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/home/inra/miniconda2/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/inra/miniconda2/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 724, in main
    GeneCounts = lib.gb2parts(
TypeError: gb2parts() missing 1 required positional argument: 'dna'

OS/Install Information

Checking dependencies for 1.8.15

You are running Python v 3.8.15. Now checking python packages...
biopython: 1.76
goatools: 1.2.3
matplotlib: 3.7.2
natsort: 8.3.1
numpy: 1.24.3
pandas: 1.5.3
psutil: 5.7.0
requests: 2.31.0
scikit-learn: 1.3.0
scipy: 1.10.1
seaborn: 0.12.2
All 11 python packages installed

You are running Perl v b'5.032001'. Now checking perl modules...
Carp: 1.50
Clone: 0.46
DBD::SQLite: 1.72
DBD::mysql: 4.046
DBI: 1.643
DB_File: 1.858
Data::Dumper: 2.183
File::Basename: 2.85
File::Which: 1.24
Getopt::Long: 2.54
Hash::Merge: 0.302
JSON: 4.10
LWP::UserAgent: 6.67
Logger::Simple: 2.0
POSIX: 1.94
Parallel::ForkManager: 2.02
Pod::Usage: 1.69
Scalar::Util::Numeric: 0.40
Storable: 3.15
Text::Soundex: 3.05
Thread::Queue: 3.14
Tie::File: 1.06
URI::Escape: 5.17
YAML: 1.30
local::lib: 2.000029
threads: 2.25
threads::shared: 1.61
All 27 Perl modules installed

Checking Environmental Variables...
$FUNANNOTATE_DB=/your/path
$PASAHOME=/home/inra/miniconda2/envs/funannotate/opt/pasa-2.5.2
$TRINITY_HOME=/home/inra/miniconda2/envs/funannotate/opt/trinity-2.8.5
$EVM_HOME=/home/inra/miniconda2/envs/funannotate/opt/evidencemodeler-1.1.1
$AUGUSTUS_CONFIG_PATH=/home/inra/miniconda2/envs/funannotate/config/
ERROR: GENEMARK_PATH not set. export GENEMARK_PATH=/path/to/dir

Checking external dependencies...
PASA: 2.5.2
CodingQuarry: 2.0
Trinity: 2.8.5
augustus: 3.5.0
bamtools: bamtools 2.5.1
bedtools: bedtools v2.31.0
blat: BLAT v37x1
diamond: 2.1.7
emapper.py: 2.1.1
ete3: 3.1.2
exonerate: exonerate 2.4.0
fasta: 36.3.8g
glimmerhmm: 3.0.4
gmap: 2023-04-28
hisat2: 2.2.1
hmmscan: HMMER 3.3.2 (Nov 2020)
hmmsearch: HMMER 3.3.2 (Nov 2020)
java: 11.0.13
kallisto: 0.46.1
mafft: v7.520 (2023/Mar/22)
makeblastdb: makeblastdb 2.14.0+
minimap2: 2.26-r1175
pigz: 2.6
proteinortho: 6.2.3
pslCDnaFilter: no way to determine
salmon: salmon 0.14.1
samtools: samtools 1.16.1
signalp: 5.0b
snap: 2006-07-28
stringtie: 2.2.1
tRNAscan-SE: 2.0.11 (Oct 2022)
tantan: tantan 40
tbl2asn: 25.8
tblastn: tblastn 2.14.0+
trimal: trimAl v1.4.rev15 build[2013-12-17]
trimmomatic: 0.39
ERROR: gmes_petap.pl not installed

nextgenusfs commented 10 months ago

Oh, @hyphaltip, adding the CDS feature here breaks this function. @SolayMane must be using the latest code from master and not the v1.8.15 release? We should have bumped the version number for the code in master.

https://github.com/nextgenusfs/funannotate/commit/667e55c88f2441ad29a70d3ed8b107f1c394da9c
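For anyone hitting the same traceback, the failure mode can be reproduced in isolation. This is only a sketch: `gb2parts_old`, `gb2parts_new`, the parameter names, and their order are assumptions made for illustration, not funannotate's actual `gb2parts()` signature. It shows how adding a required positional parameter breaks a call site that still passes the old number of arguments.

```python
# Hypothetical stand-ins: names and parameter order are assumptions,
# not the real funannotate.library.gb2parts() signature.
def gb2parts_old(genbank, proteins, transcripts, dna):
    return "ok"


def gb2parts_new(genbank, proteins, transcripts, cds, dna):
    # A new required parameter (cds) was inserted before dna.
    return "ok"


def call_like_old_caller(func):
    """Call func with the four arguments an un-updated call site would pass."""
    try:
        return func("in.gbk", "proteins.fa", "transcripts.fa", "genome.fa")
    except TypeError as err:
        return f"TypeError: {err}"


print(call_like_old_caller(gb2parts_old))
print(call_like_old_caller(gb2parts_new))
```

The second call fails because the four positional values fill the first four parameters, leaving `dna` unset, which matches the shape of the error reported above.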

nextgenusfs commented 10 months ago

@SolayMane, you should be able to roll back like this while we get master cleaned up and fixed:

python -m pip install git+https://github.com/nextgenusfs/funannotate.git@eac3691 --upgrade --force --no-deps

hyphaltip commented 10 months ago

I made a test fix on a branch, as commit bfcecf3.

I think this might solve the problem, but I still need to test it on the same input. @SolayMane, can you post a link to the GenBank file you are running the re-annotation on? The Fusarium you refer to seems to be available only un-annotated in GenBank.

I downloaded another Foxy genome and tried the commands below, which work, so I think my code fix is working. One side effect is that a genome.cds.fasta file is created in the annotate_misc folder. Is that okay, @nextgenusfs? Or should we provide an alternate way to run that function without generating a CDS file (e.g., default the CDS file name to an empty string and skip writing when it is empty)?

curl -O https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/013/085/055/GCA_013085055.1_ASM1308505v1/GCA_013085055.1_ASM1308505v1_genomic.gbff.gz
gunzip GCA_013085055.1_ASM1308505v1_genomic.gbff.gz
funannotate annotate --genbank GCA_013085055.1_ASM1308505v1_genomic.gbff --cpus 16 -o test1

nextgenusfs commented 10 months ago

@hyphaltip it's fine to generate the CDS file every time it runs, I think. It won't be used for anything downstream, but it's not really a problem.
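For reference, the alternative raised earlier in the thread (an empty-string default for the CDS file name, skipping the output when it is empty) could look roughly like this. It is a minimal sketch with a hypothetical `write_parts()` helper and a made-up record layout, not the code in funannotate or in the fix branch.

```python
import os
import tempfile


def write_parts(records, proteins_out, cds_out=""):
    """Hypothetical stand-in for a gb2parts-style writer.

    records is a list of dicts with 'id', 'protein' and 'cds' keys
    (an assumed layout). The CDS FASTA is written only when cds_out
    is a non-empty path.
    """
    cds_handle = open(cds_out, "w") if cds_out else None
    try:
        with open(proteins_out, "w") as prot_handle:
            for rec in records:
                prot_handle.write(f">{rec['id']}\n{rec['protein']}\n")
                if cds_handle is not None:
                    cds_handle.write(f">{rec['id']}\n{rec['cds']}\n")
    finally:
        if cds_handle is not None:
            cds_handle.close()
    return len(records)


# Usage: old-style callers omit cds_out and no CDS file is produced.
workdir = tempfile.mkdtemp()
example = [{"id": "gene1", "protein": "MK", "cds": "ATGAAA"}]
write_parts(example, os.path.join(workdir, "proteins.fa"))
write_parts(
    example,
    os.path.join(workdir, "proteins.fa"),
    cds_out=os.path.join(workdir, "genome.cds.fasta"),
)
```

Because the new parameter has a default, call sites written before the change keep working unmodified.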

nextgenusfs commented 10 months ago

@hyphaltip, can you also bump the version in your branch to 1.8.16 before we merge, so that when issues come in we know they are from the unstable master and not the previous release? Thanks.

hyphaltip commented 10 months ago

@SolayMane, you can try again with python -m pip install git+https://github.com/nextgenusfs/funannotate.git to get the latest version from the master branch. It should report 1.8.16.
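One way to confirm which build ended up in the active environment is to ask Python's package metadata. This is a generic sketch, assuming the distribution is installed under the name `funannotate`; `installed_version` is a helper invented here for illustration.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


# "funannotate" is assumed to be the installed distribution name.
print(installed_version("funannotate"))
```

The "Running ..." line at the top of the funannotate log reports the version as well, so the two should agree.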

SolayMane commented 10 months ago

Thanks @hyphaltip @nextgenusfs, it seems to work, but at the end I got this error:

funannotate annotate --genbank /sanhome2/data/FOA/Compare/gbk/Fusarium_oxysporum_f.sp._albedinis.gbk --cpus 56 -o test1

[Aug 23 11:13 AM]: OS: Ubuntu 14.04, 56 cores, ~ 396 GB RAM. Python: 3.8.15
[Aug 23 11:13 AM]: Running 1.8.16
[Aug 23 11:13 AM]: No NCBI SBT file given, will use default, however if you plan to submit to NCBI, create one and pass it here '--sbt'
[Aug 23 11:13 AM]: Found existing output directory test1. Warning, will re-use any intermediate files found.
[Aug 23 11:13 AM]: Checking GenBank file for annotation
[Aug 23 11:14 AM]: Adding Functional Annotation to Fusarium oxysporum f.sp. albedinis, NCBI accession: None
[Aug 23 11:14 AM]: Annotation consists of: 20,416 gene models
[Aug 23 11:14 AM]: 20,416 protein records loaded
[Aug 23 11:14 AM]: Running HMMer search of PFAM version 35.0
[Aug 23 11:17 AM]: 20,269 annotations added
[Aug 23 11:17 AM]: Running Diamond blastp search of UniProt DB version 2023_03
[Aug 23 11:17 AM]: 1,105 valid gene/product annotations from 1,582 total
[Aug 23 11:17 AM]: Running Eggnog-mapper
[Aug 23 11:38 AM]: Parsing EggNog Annotations
[Aug 23 11:38 AM]: EggNog version parsed as 2.1.1
Traceback (most recent call last):
  File "/home/inra/miniconda2/envs/funannotate/bin/funannotate", line 8, in <module>
    sys.exit(main())
  File "/home/inra/miniconda2/envs/funannotate/lib/python3.8/site-packages/funannotate/funannotate.py", line 717, in main
    mod.main(arguments)
  File "/home/inra/miniconda2/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 1064, in main
    EggNog = parseEggNoggMapper(eggnog_result, eggnog_out, GeneProducts)
  File "/home/inra/miniconda2/envs/funannotate/lib/python3.8/site-packages/funannotate/annotate.py", line 367, in parseEggNoggMapper
    OGs = cols[DBi].split(",")
TypeError: list indices must be integers or slices, not NoneType
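
The final TypeError means that `DBi` was still `None` when it was used as a list index, which suggests the column lookup for this EggNog output never found a match. The sketch below illustrates that pattern and one way to fail with a clearer message. It is not funannotate's `parseEggNoggMapper`: the function names, the tab-separated layout, and the `eggNOG_OGs` column name are assumptions for illustration.

```python
def find_column(header, candidates):
    """Return the index of the first candidate name found in header, else None."""
    for name in candidates:
        if name in header:
            return header.index(name)
    return None


def parse_ogs(lines, candidates=("eggNOG_OGs",)):
    """Map query ID to its list of orthologous groups.

    Assumes tab-separated rows with a '#query' header line; the
    column names in candidates are illustrative.
    """
    og_index = None
    results = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#query"):
            header = line.lstrip("#").split("\t")
            og_index = find_column(header, candidates)
            continue
        if not line or line.startswith("#"):
            continue  # skip comments and blank lines
        if og_index is None:
            # Without this check, cols[None] raises the TypeError seen above.
            raise ValueError(
                "no orthologous-group column found; expected one of "
                + ", ".join(candidates)
            )
        cols = line.split("\t")
        results[cols[0]] = cols[og_index].split(",")
    return results


rows = ["#query\tseed_ortholog\teggNOG_OGs", "gene1\tx\tOG1@1,OG2@2"]
print(parse_ogs(rows))
```

Checking the index before use turns an opaque TypeError into a message that names the missing column, which makes a header or version mismatch easier to spot.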