Open timflutre opened 8 years ago
Timothée, thanks for the report, and sorry to be slow in getting back to you. This script only handles format conversion: it takes what is in the GenBank file and converts it to GFF format. It doesn't try to massage attribute names so that they match between the two. The code simply uses the internal Biopython representation as a shared object for the conversion and doesn't have any special cases. I'd be happy to accept a pull request to do that, or it's something you could ask about supporting at gffutils if you'd like a more forward-looking approach. gffutils has some support for Biopython interoperability now, although it would also need work to handle these specific naming conversion cases.
Sorry to not have a ready solution for you but hope this helps.
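For anyone who wants to try the attribute-name massaging described above, here is a minimal sketch of one possible approach, not something the script does today. It assumes the qualifiers behave like a dict mapping names to lists of strings, as Biopython's SeqFeature.qualifiers does. The mapping table and the function name rename_qualifiers are hypothetical and cover only the two cases raised in this issue.

```python
# Hypothetical mapping from GenBank qualifier names to the GFF3 reserved
# attribute names mentioned in this issue. Extend as needed.
GENBANK_TO_GFF3 = {"db_xref": "Dbxref", "note": "Note"}


def rename_qualifiers(qualifiers):
    """Return a new dict with GenBank qualifier names mapped to GFF3 names.

    Assumes `qualifiers` maps each name to a list of string values.
    Names without an entry in GENBANK_TO_GFF3 are kept unchanged.
    """
    renamed = {}
    for key, values in qualifiers.items():
        new_key = GENBANK_TO_GFF3.get(key, key)
        # Merge values if two source names map to the same GFF3 name.
        renamed.setdefault(new_key, []).extend(values)
    return renamed


example = {"db_xref": ["GeneID:1"], "note": ["example note"], "gene": ["g1"]}
print(rename_qualifiers(example))
```

The renaming would need to be applied to each feature (and sub-feature) before the records are written out as GFF.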
Ok, thanks for getting back to me.
Hi Timothée, have you tried annotwriter?
Thanks @lcscs12345 , I'll have a look!
I am trying to convert a GenBank file to GFF3 following the latest version of the official specification. Here is an example of a GenBank file I need to convert: ftp://ftp.ncbi.nlm.nih.gov/genomes/Vitis_vinifera/ARCHIVE/BUILD.1.1/CHR_01/vvi_ref_chr1.gbs.gz
The script genbank_to_gff.py works but writes db_xref instead of Dbxref. Same for note instead of Note. I also have other issues, e.g. exon being encoded as "feature mRNA", etc.
I can see that you often advise people to look at gffutils. But it doesn't handle the GenBank format. So should I start looking at your BCBio code? Is there any chance to include it at some point in Biopython?
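As a stopgap, the attribute names could also be fixed after conversion by rewriting column 9 of the GFF output. The following is a rough sketch under stated assumptions: the file is tab-delimited with attributes written as key=value pairs separated by semicolons, and the helper name fix_attribute_names and its rename table are hypothetical. It only addresses the db_xref and note naming, not the exon/mRNA feature-type issue.

```python
# Hypothetical rename table for the two attribute names reported above.
ATTRIBUTE_RENAMES = {"db_xref": "Dbxref", "note": "Note"}


def fix_attribute_names(line):
    """Rename attribute keys in column 9 of a single GFF3 line."""
    # Leave directives, comments, and blank lines untouched.
    if line.startswith("#") or not line.strip():
        return line
    columns = line.rstrip("\n").split("\t")
    # A GFF3 feature line has exactly nine tab-separated columns.
    if len(columns) != 9:
        return line
    fixed = []
    for pair in columns[8].split(";"):
        key, sep, value = pair.partition("=")
        fixed.append(ATTRIBUTE_RENAMES.get(key, key) + sep + value)
    columns[8] = ";".join(fixed)
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(columns) + newline


line = "chr1\tGenBank\tgene\t1\t100\t.\t+\t.\tID=g1;db_xref=GeneID:1;note=test\n"
print(fix_attribute_names(line), end="")
```

Applying this to every line of the converted file would leave all other columns and attributes as they are.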