CompSynBioLab-KoreaUniv / FunGAP

FunGAP: fungal Genome Annotation Pipeline

Biopython problems #80

Open Hoberti opened 2 years ago

Hoberti commented 2 years ago

Hi,

I get the following error when I run run_braker:

getAnnoFastaFromJoingenes.py -g maker_out/masked_assembly.fasta.adjusted -o braker_out/308/braker_308 -t 1 -3 braker_out/308/braker_308.gff3

anaconda3/envs/fungap/bin/getAnnoFastaFromJoingenes.py
Traceback (most recent call last):
  File "/anaconda3/envs/FUNGAP/bin/getAnnoFastaFromJoingenes.py", line 35, in <module>
    from Bio.Alphabet import generic_dna, generic_protein
  File "anaconda3/envs/FUNGAP/lib/python3.8/site-packages/Bio/Alphabet/__init__.py", line 20, in <module>
    raise ImportError(
ImportError: Bio.Alphabet has been removed from Biopython. In many cases, the alphabet can simply be ignored and removed from scripts. In a few cases, you may need to specify the molecule_type as an annotation on a SeqRecord for your script to work correctly. Please see https://biopython.org/wiki/Alphabet for more information.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "anaconda3/envs/FUNGAP/bin/getAnnoFastaFromJoingenes.py", line 39, in <module>
    raise ImportError(
ImportError: Failed to import biophython modules. Try installing with "pip3 install biopython"

It appears that biopython==1.79 no longer has the Bio.Alphabet module. This could easily be solved by installing an older Biopython with pip: pip install biopython==1.75

Is this correct?
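A quick way to tell whether an installed Biopython should still provide Bio.Alphabet is to compare its version string against the release that dropped the module. The sketch below is only illustrative: it assumes the removal happened in Biopython 1.78 (so 1.77 and earlier still have it), and the helper names parse_version and has_bio_alphabet are made up for this example.

```python
import re

# Assumption: Bio.Alphabet was removed in Biopython 1.78,
# so releases up to and including 1.77 still ship it.
FIRST_RELEASE_WITHOUT_ALPHABET = (1, 78)


def parse_version(version: str) -> tuple:
    """Turn '1.79' or '1.78.dev0' into a tuple of leading integers, e.g. (1, 79)."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component such as 'dev0'
        parts.append(int(match.group()))
    return tuple(parts)


def has_bio_alphabet(version: str) -> bool:
    """Return True if this Biopython version is expected to include Bio.Alphabet."""
    return parse_version(version) < FIRST_RELEASE_WITHOUT_ALPHABET


if __name__ == "__main__":
    try:
        import Bio
    except ImportError:
        print("Biopython is not installed in this environment")
    else:
        print(Bio.__version__, "has Bio.Alphabet:", has_bio_alphabet(Bio.__version__))
```

Run inside the fungap conda environment, this prints the installed version and whether the old import is expected to work there.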

Héctor

mbnmbn00 commented 2 years ago

Hello,

Which Braker and Augustus versions are you using?

braker.pl --version
braker.pl version 2.1.5
augustus --version
AUGUSTUS (3.4.0) is a gene prediction tool.
Sources and documentation at https://github.com/Gaius-Augustus/Augustus

In my /home/ubuntu/anaconda3/envs/fungap/bin/getAnnoFastaFromJoingenes.py script, there's no Bio.Alphabet.
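To compare two copies of the script, one option is to scan each file for import lines that mention Bio.Alphabet. This is a rough sketch, not part of FunGAP or Augustus: the function names find_alphabet_imports and scan_script are hypothetical, and the regular expression only catches the common import forms.

```python
import re
from pathlib import Path

# Matches 'from Bio.Alphabet ...', 'import Bio.Alphabet' and
# 'from Bio import ... Alphabet ...' at the start of a (possibly indented) line.
ALPHABET_IMPORT = re.compile(
    r"^\s*(?:from\s+Bio\.Alphabet\b"
    r"|import\s+Bio\.Alphabet\b"
    r"|from\s+Bio\s+import\s+.*\bAlphabet\b)"
)


def find_alphabet_imports(source: str) -> list:
    """Return (line_number, stripped_line) pairs for lines that import Bio.Alphabet."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        if ALPHABET_IMPORT.match(line):
            hits.append((number, line.strip()))
    return hits


def scan_script(path: str) -> list:
    """Read a script from disk and report its Bio.Alphabet imports."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return find_alphabet_imports(text)
```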

Its header looks like this:

#!/usr/bin/env python3

# Author: Katharina J. Hoff
# E-Mail: katharina.hoff@uni-greifswald.de
# Last modified on August 26th 2019
#
# This Python script extracts CDS features from a GTF file, excises
# corresponding sequence windows from a genome FASTA file, stitches the
# codingseq parts together, adds letters N at the ends if bases are
# annotated as missing by frame in the GTF file, makes reverse complement
# if required, and translates to protein sequence.
# Output files are:
#    * file with protein sequences in FASTA format,
#    * file with coding sequences in FASTA format
# The script automatically checks for in-frame stop codons and prints a
# warning to STDOUT if such genes are in the GTF-file. The IDs of bad genes
# are printed to a file bad_genes.lst. Option -s allows to exclude bad genes
# from the FASTA output file, automatically.
# Beware: the script assumes that the gtf input file is sorted by coordinates!