NBISweden / EMBLmyGFF3

An efficient way to convert gff3 annotation files into EMBL format ready to submit.
GNU General Public License v3.0

unexpected keyword argument 'strand' #82

Open webbchen opened 8 months ago

webbchen commented 8 months ago

Dear developers, running EMBLmyGFF3 fails with the following error:

Traceback (most recent call last):
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/bin/EMBLmyGFF3", line 10, in <module>
    sys.exit(main())
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/lib/python3.10/site-packages/EMBLmyGFF3/EMBLmyGFF3.py", line 1361, in main
    for record in GFF.parse(infile, base_dict=seq_dict):
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 793, in parse
    for rec in parser.parse_in_parts(gff_files, base_dict, limit_info,
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 337, in parse_in_parts
    cur_dict = self._results_to_features(cur_dict, results)
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 376, in _results_to_features
    base = self._add_parent_child_features(base, results.get('parent', []),
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 448, in _add_parent_child_features
    child_feature = self._get_feature(child_dict)
  File "/mnt/shared/scratch/awebb/apps/conda/envs/ENSEMBL_submission_env/lib/python3.10/site-packages/BCBio/GFF/GFFParser.py", line 591, in _get_feature
    new_feature = SeqFeature.SeqFeature(location, feature_dict['type'],
TypeError: SeqFeature.__init__() got an unexpected keyword argument 'strand'

General info:

The command:

EMBLmyGFF3 myfile.gff3 assembly_v2.fasta \
--expose_translations \
-d WGS \
--locus_tag ANON01 \
--molecule_type "genomic DNA" \
--project_id PRJEB01234 \
--transl_table 1 \
--topology linear \
--species 01234 \
--de 'foo' \
--rg 'bar' \
--ra 'AB, CD, EF, GH' \
--rl 'unpublished' \
--rt 'Annotated draft genome assembly of an anonymous organism.' \
--translate \
--isolation_source 'some plant' \
--isolate 'isolate1' \
--output /path/to/output/assembly1_v2.embl

The gff3 file was generated by EVidenceModeler and incomplete genes were removed by AGAT.

head -20 myfile.gff3
##gff-version 3
Segkk0  EVM     gene    32278   33867   .       +       .       ID=evm.TU.Segkk0.1;Name=EVM prediction Segkk0.1;gene_id=evm.TU.Segkk0.1
Segkk0  EVM     transcript      32278   33867   .       +       .       ID=evm.model.Segkk0.1;Parent=evm.TU.Segkk0.1;Name=EVM prediction Segkk0.1;transcript_id=evm.model.Segkk0.1
Segkk0  EVM     exon    32278   33079   .       +       .       ID=evm.model.Segkk0.1.exon1;Parent=evm.model.Segkk0.1;exon_id=evm.model.Segkk0.1.exon1
Segkk0  EVM     exon    33227   33867   .       +       .       ID=evm.model.Segkk0.1.exon2;Parent=evm.model.Segkk0.1;exon_id=evm.model.Segkk0.1.exon2
Segkk0  EVM     CDS     32278   33079   .       +       0       ID=cds.evm.model.Segkk0.1;Parent=evm.model.Segkk0.1
Segkk0  EVM     CDS     33227   33867   .       +       2       ID=cds.evm.model.Segkk0.1;Parent=evm.model.Segkk0.1
Segkk0  EVM     start_codon     32278   32280   .       +       0       ID=start_added-1;Parent=evm.model.Segkk0.1
Segkk0  EVM     stop_codon      33865   33867   .       +       0       ID=stop_added-1;Parent=evm.model.Segkk0.1
Segkk0  EVM_elm gene    50722   50817   .       -       .       ID=evm.TU.Segkk0.2;Name=EVM prediction Segkk0.2;gene_id=evm.TU.Segkk0.2
Segkk0  EVM_elm transcript      50722   50817   .       -       .       ID=evm.model.Segkk0.2;Parent=evm.TU.Segkk0.2;Name=EVM prediction Segkk0.2;transcript_id=evm.model.Segkk0.2
Segkk0  EVM_elm exon    50722   50817   .       -       .       ID=evm.model.Segkk0.2.exon1;Parent=evm.model.Segkk0.2;exon_id=evm.model.Segkk0.2.exon1
Segkk0  EVM_elm CDS     50722   50817   .       -       0       ID=cds.evm.model.Segkk0.2;Parent=evm.model.Segkk0.2
Segkk0  EVM_elm start_codon     50815   50817   .       -       0       ID=start_added-12;Parent=evm.model.Segkk0.2
Segkk0  EVM_elm stop_codon      50722   50724   .       -       0       ID=stop_added-12;Parent=evm.model.Segkk0.2
Segkk0  EVM     gene    51489   52223   .       -       .       ID=evm.TU.Segkk0.3;Name=EVM prediction Segkk0.3;gene_id=evm.TU.Segkk0.3
Segkk0  EVM     transcript      51489   52223   .       -       .       ID=evm.model.Segkk0.3;Parent=evm.TU.Segkk0.3;Name=EVM prediction Segkk0.3;transcript_id=evm.model.Segkk0.3
Segkk0  EVM     exon    51489   52223   .       -       .       ID=evm.model.Segkk0.3.exon1;Parent=evm.model.Segkk0.3;exon_id=evm.model.Segkk0.3.exon1
Segkk0  EVM     CDS     51489   52223   .       -       0       ID=cds.evm.model.Segkk0.3;Parent=evm.model.Segkk0.3
Segkk0  EVM     start_codon     52221   52223   .       -       0       ID=start_added-23;Parent=evm.model.Segkk0.3
(...)

I can't quite work out what in there triggers the error.

webbchen commented 7 months ago

Update: downgrading Biopython to version 1.78 made the error go away.

Juke34 commented 7 months ago

Thank you for your feedback. What was your previous biopython version?

webbchen commented 7 months ago

It was v1.81; I had installed it with conda on 10 Jan. 2024.
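
The traceback ends in BCBio's GFFParser.py passing strand= to SeqFeature, and the installed Biopython's SeqFeature.__init__() rejecting that keyword. So the trigger looks like a library version mismatch rather than anything in the GFF3 content. Below is a minimal sketch for checking an environment before running the conversion. The helper name accepts_keyword is hypothetical (it is not part of EMBLmyGFF3, bcbio-gff, or Biopython) and it only inspects the constructor signature; it does not run the parser.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with the keyword argument `name`.

    Hypothetical helper for illustration only; it inspects the signature
    and does not call the object.
    """
    params = inspect.signature(callable_obj).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


if __name__ == "__main__":
    try:
        import Bio
        from Bio.SeqFeature import SeqFeature
    except ImportError:
        print("Biopython is not installed in this environment")
    else:
        print("Biopython version:", Bio.__version__)
        # False here means a caller that passes strand= to SeqFeature
        # (as in the traceback above) will raise the same TypeError.
        print("SeqFeature accepts 'strand':", accepts_keyword(SeqFeature, "strand"))
```

If the check prints False, the options reported in this thread are to use a Biopython version that still accepts the keyword (1.78 worked here), or to wait for the parser side to stop passing it.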