tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation

strange characters in intermediate files lead to failure in generating gbf output #603

Open Yasana1990 opened 2 years ago

Hello!

I used prokka for annotation of a genome. Here is my command:

prokka 1.fna --addgenes --addmrna --cpus 16 --genus Wolbachia --usegenus --species pipientis --strain wMelCS

It yields the well-known (#179) problem:

[16:06:23] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14.6 from https://github.com/tseemann/prokka' -Z PROKKA_12072021\/PROKKA_12072021\.err -i PROKKA_12072021\/PROKKA_12072021\.fsa 2> /dev/null
[16:06:23] Deleting unwanted file: PROKKA_12072021/errorsummary.val
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.dr
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.fixedproducts
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.ecn
[16:06:23] Deleting unwanted file: PROKKA_12072021/PROKKA_12072021.val
[16:06:23] Repairing broken .GBK output that tbl2asn produces...
[16:06:23] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < PROKKA_12072021\/PROKKA_12072021\.gbf > PROKKA_12072021\/PROKKA_12072021\.gbk
sh: 1: cannot open PROKKA_12072021/PROKKA_12072021.gbf: No such file
[16:06:23] Could not run command: sed 's/COORDINATES: profile/COORDINATES:profile/' < PROKKA_12072021\/PROKKA_12072021\.gbf > PROKKA_12072021\/PROKKA_12072021\.gbk

Digging into PROKKA_12072021/PROKKA_12072021.fsa reveals that it contains lines like this:

>JACSNK010000002.1 [gcode=11] [organism=Wolbachia pipientis] [strain=wMelCS^M]

Notice the '^M' character, which is a carriage return. There are no '^M's in 1.fna, so these carriage returns appear to have been introduced during the PROKKA run. I played around with the options and found that the output for just

prokka 1.fna --addgenes --addmrna --cpus 16 --genus Wolbachia --usegenus

seems to be fine: the annotation finished successfully. So the problem appears to be tied to specifying the strain and/or species, and I think it should be considered a bug. I hope it will be addressed in a future version of PROKKA. Thank you for a very useful piece of software, by the way!
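
For anyone trying to reproduce this, a small script along the following lines might help narrow down where the '^M' enters. It is only a sketch: the function names `find_carriage_returns` and `clean_value` are made up for illustration and are not part of PROKKA. The idea that the carriage return could arrive through the `--strain` value (for example from a script or sample sheet saved with Windows line endings) is an assumption that the report above does not confirm.

```python
from pathlib import Path


def find_carriage_returns(path):
    """Return (line_number, text) pairs for lines that contain a carriage return."""
    hits = []
    # Read as bytes so that Python's newline translation cannot hide the "\r".
    with open(path, "rb") as handle:
        for number, raw in enumerate(handle, start=1):
            body = raw.rstrip(b"\n")
            if b"\r" in body:
                hits.append((number, body.decode("utf-8", errors="replace")))
    return hits


def clean_value(value):
    """Strip carriage returns, line feeds and surrounding whitespace from an option value."""
    return value.replace("\r", "").replace("\n", "").strip()


if __name__ == "__main__":
    # File names are taken from the report; adjust them to your own run.
    for name in ("1.fna", "PROKKA_12072021/PROKKA_12072021.fsa"):
        if Path(name).exists():
            for number, text in find_carriage_returns(name):
                print(f"{name}:{number}: {text!r}")

    # Hypothetical check of an option value before it is passed to --strain.
    strain = "wMelCS\r"  # simulates a value read from a file with CRLF line endings
    if strain != clean_value(strain):
        print(f"strain value has stray whitespace: {strain!r} -> {clean_value(strain)!r}")
```

If the input FASTA comes back clean but the option value does not, that would point at the command line rather than at the sequence file. It would not by itself show whether PROKKA ought to sanitize such values.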