bg7 / BG7

bacterial genome annotation system
bg7.ohnosequences.com

Update src/com/era7/bioinfo/annotation/gb/ExportGenBankFiles.java #23

Closed trypsin closed 12 years ago

trypsin commented 12 years ago

The LOCUS line has one too few spaces between 'LOCUS' and the ID, so the ID does not line up with the DEFINITION/ACCESSION/etc. lines. This causes the files to not load properly into Biopython. Additionally, adding the linear/circular annotation after the molecule type (DNA) is not genbank-spec, and causes a warning.

pablopareja commented 12 years ago

Hi Jonathan,

Thanks so much for your contribution. Big or small, contributions are always more than welcome and appreciated ;) Sorry for taking so long to reply to your e-mail, but I have been incredibly busy these past few days. Regarding the pull request, I am replying to you through the GitHub pull requests section.

Cheers,

Pablo

pablopareja commented 12 years ago

hehehe I didn't realize that replying to the notification e-mail was the same thing as replying to the pull request here :) Anyways, I just merged your changes into master. About the warning, do you know what would be a good alternative for that so that it does not cause a warning anymore?

Pablo

trypsin commented 12 years ago

Sorry, I probably would have tried to fix that too, but I was time crunched actually trying to get annotations for a genome. The problem is that the LOCUS line is defined to have the following format:

LOCUS ACCESSION_NUMBER LENGTH bp MOLECULE_TYPE TLA DD-MMM-YYYY

In addition to the missing space after LOCUS, BG7 inserts 'LINEAR' or 'CIRCULAR' after MOLECULE_TYPE. Some parsers (at least BioPython's SeqIO) complain about this and are then unable to read the TLA for the data_file_division and the date. I'm not completely sure where the best place is to record whether the sequence is circular or linear; probably the COMMENT field, since it's not a specified part of the GenBank flat file format.
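
For illustration, here is a minimal Java sketch of a LOCUS line laid out as described above: the keyword is padded to 12 columns so the ID starts in column 13 (in line with DEFINITION, ACCESSION, etc.), and the topology goes into a COMMENT line instead of after the molecule type. The class and method names (`LocusLineSketch`, `formatLocusLine`, `formatCommentLine`) and the example values are hypothetical, not BG7's actual code. The single-space separators between the remaining fields are a simplification, so exact column positions should be checked against the GenBank flat file specification.

```java
class LocusLineSketch {
    // "LOCUS" (5 chars) plus 7 spaces: the ID then starts in column 13,
    // the same column where DEFINITION/ACCESSION text begins.
    private static final int KEYWORD_COLUMN_WIDTH = 12;

    // Left-justify a keyword and pad it with spaces to the keyword column width.
    private static String keyword(String name) {
        return String.format("%-" + KEYWORD_COLUMN_WIDTH + "s", name);
    }

    // Hypothetical helper: fields in the order given in the comment above,
    // with no LINEAR/CIRCULAR token after the molecule type.
    static String formatLocusLine(String id, int lengthBp, String moleculeType,
                                  String division, String date) {
        return keyword("LOCUS") + id + " " + lengthBp + " bp "
                + moleculeType + " " + division + " " + date;
    }

    // Hypothetical helper: one possible way to carry topology in COMMENT.
    static String formatCommentLine(String text) {
        return keyword("COMMENT") + text;
    }

    public static void main(String[] args) {
        // Example values are made up for demonstration only.
        System.out.println(formatLocusLine("EXAMPLE_ID", 4600000, "DNA", "BCT", "19-APR-2012"));
        System.out.println(formatCommentLine("Topology: circular"));
    }
}
```

Keeping the padding in one helper means every keyword line shares the same column alignment, which is the part the original patch fixed.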

Jonathan Goodson Lab Manager - Winkler Lab Cell Biology and Molecular Genetics University of Maryland, College Park 301-405-9937
