ThomasDOtto / ratt

Rapid Annotation Transfer Tool
GNU General Public License v3.0

Embl2Fasta creates non-unique sequence names for embl files with multiple sequences #9

Open 0xaf1f opened 4 years ago

0xaf1f commented 4 years ago

When using transfer type Multiple, if one of the reference EMBL files contains two sequences (such as a chromosome and a plasmid), I get the following error:

ERROR: The reference file may contain sequences with non-unique
       header Ids, please check your input files and try again
ERROR: postnuc returned non-zero

Looking at the file being passed to nucmer, as well as the Embl2Fasta function in main.ratt.pl, it is clear that every sequence in the EMBL file is given the same FASTA header. I have not checked whether patching the function to make the names unique would cause problems in later stages of RATT, such as when the matches are used to retrieve the annotations.
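
To illustrate the idea of one unique FASTA header per EMBL record, here is a minimal Python sketch. It is not the Perl code from main.ratt.pl and is not a tested patch for RATT. The names `embl_to_fasta` and `unique_name` are hypothetical, and the sketch assumes each record has an `ID` line whose first field is the identifier, an `SQ` line before the sequence, and a closing `//`. Whether RATT's later stages would accept the renamed sequences is still the open question above.

```python
import re


def unique_name(base, used):
    """Return base, or base.2, base.3, ... if base is already taken."""
    name, n = base, 1
    while name in used:
        n += 1
        name = f"{base}.{n}"
    used.add(name)
    return name


def embl_to_fasta(embl_text, fallback="seq"):
    """Convert multi-record EMBL text to FASTA, one unique header per record.

    Assumption: records end with '//' and the identifier is the first
    field of the 'ID' line (a trailing ';' is stripped).
    """
    used = set()
    out = []
    name, in_seq, chunks = None, False, []
    for line in embl_text.splitlines():
        if line.startswith("ID "):
            fields = line[2:].split()
            name = fields[0].rstrip(";") if fields else None
        elif line.startswith("SQ "):
            in_seq = True  # sequence lines follow until '//'
        elif line.startswith("//"):
            if chunks:
                header = unique_name(name or fallback, used)
                out.append(f">{header}\n{''.join(chunks)}\n")
            name, in_seq, chunks = None, False, []
        elif in_seq:
            # Drop the spacing and the trailing base counts on sequence lines.
            chunks.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(out)
```

With two records that share the same `ID`, the second record gets a `.2` suffix instead of a duplicate header, which is the condition the error message above complains about.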

haessar commented 1 year ago

Presumably a switch to BioPerl would also fix this? (See #12.)