bergmanlab / drosophila-transposons

Drosophila transposable element canonical sequences
Creative Commons Zero v1.0 Universal

Last record of the embl file lacks an "ID" line #16

Closed by blaiseli 10 years ago

blaiseli commented 10 years ago

It seems that the last record is not formatted like the others: its first line starts with "LOCUS" instead of "ID".
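
A record like this can be located without an EMBL parser by splitting the file on the `//` record terminator and checking the first non-blank line of each record. The following is only a minimal sketch: the function name `records_missing_id` and the sample text are hypothetical and not taken from the repository, and it assumes every record ends with a line containing only `//`.

```python
def records_missing_id(text):
    """Return (index, first_line) for each record that does not start with an ID line.

    Assumes EMBL-style flat-file text in which each record is terminated by a
    line containing only "//". Record indices are zero-based.
    """
    problems = []
    record = []   # non-blank lines of the record currently being read
    index = 0     # position of that record in the file

    def check(lines, idx):
        # An EMBL record is expected to open with a line starting with "ID ".
        if lines and not lines[0].startswith("ID "):
            problems.append((idx, lines[0]))

    for line in text.splitlines():
        if line.strip() == "//":
            check(record, index)
            if record:
                index += 1
            record = []
        elif line.strip():
            record.append(line)

    # A trailing record with no terminator is still checked.
    check(record, index)
    return problems


# Hypothetical two-record sample: the second record opens with LOCUS.
sample = """ID   EXAMPLE1   standard; DNA; INV; 10 BP.
SQ   Sequence 10 BP;
     acgtacgtac        10
//
LOCUS       EXAMPLE2   10 bp DNA
SQ   Sequence 10 BP;
     acgtacgtac        10
//
"""

for idx, first_line in records_missing_id(sample):
    print(f"record {idx} starts with: {first_line}")
```

On a real file, the same function could be applied to the text returned by reading the file in full; the reported index makes it easy to see whether only the last record is affected.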

(As a side note, Biopython's SeqIO.parse() seems to ignore the "FT" lines. At least, the records I obtain using this parsing method have an empty "features" list.)
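
One way to tell whether an empty "features" list comes from the parser or from the file itself is to count the "FT" lines per record directly and compare the result with what the parser returns. The sketch below uses only the standard library and does not call Biopython; the function name `count_ft_lines` and the sample records are hypothetical.

```python
def count_ft_lines(text):
    """Return a list of (record_name, ft_line_count) pairs, one per record.

    Assumes EMBL-style flat-file text with records terminated by "//".
    The name is the second whitespace-separated field of the ID line, or
    None when the record has no ID line.
    """
    results = []
    name = None
    n_ft = 0
    has_content = False

    for line in text.splitlines():
        if line.strip() == "//":
            if has_content:
                results.append((name, n_ft))
            name, n_ft, has_content = None, 0, False
            continue
        if not line.strip():
            continue
        has_content = True
        if line.startswith("ID "):
            fields = line.split()
            if len(fields) > 1:
                name = fields[1].rstrip(";")
        elif line.startswith("FT "):
            n_ft += 1

    # Include a trailing record that lacks a terminator.
    if has_content:
        results.append((name, n_ft))
    return results


# Hypothetical sample: the first record has two FT lines, the second has none.
sample = """ID   EXAMPLE1   standard; DNA; INV; 10 BP.
FT   source          1..10
FT   repeat_region   1..10
SQ   Sequence 10 BP;
     acgtacgtac        10
//
ID   EXAMPLE2   standard; DNA; INV; 10 BP.
SQ   Sequence 10 BP;
     acgtacgtac        10
//
"""

for record_name, n in count_ft_lines(sample):
    print(record_name, n)
```

If a record shows a non-zero count here but an empty "features" list after parsing, the FT lines are present in the file and are being dropped or rejected during parsing.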

cbergman commented 10 years ago

Thanks for letting me know about this particular record. I am aware that the format of this file has some problems that break EMBL parsers. This is also true for RepBase's EMBL format. One of my goals is to move away from EMBL format and convert this file into GFF3.

cbergman commented 10 years ago

ID line for TAHRE fixed during this commit: https://github.com/cbergman/transposons/commit/f57dff07202adbe8c4eda38467f852f01476083a