tseemann / prokka

:zap: :aquarius: Rapid prokaryotic genome annotation

prokka gbk output Locus line layout error Biopython. using --proteins #533

Closed CarmenSheppard closed 3 years ago

CarmenSheppard commented 3 years ago

Hi,

I would like to open the gbk files output by prokka in Biopython. However, without --compliant, reading the file with SeqIO fails on the LOCUS line layout, because the NODE name runs straight into the bp count (no space between them).

ValueError: Did not recognise the LOCUS line layout:
LOCUS       NODE_1_length_191762_cov_13.6631191762 bp   DNA linear

Using --compliant fixes this issue, but then it does not use my custom --proteins file to annotate from, and the genes I'm trying to find are not annotated correctly.

Is there a way to use --proteins and force the LOCUS line to comply so the gbk file can be opened in BioPython?
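One possible post-processing workaround, sketched here under stated assumptions rather than as a prokka or Biopython feature: if the contig names follow the SPAdes-style pattern seen in the error above (`..._length_<N>_...`), the true length is known from the name itself, so it can be peeled off the end of the fused token and a space reinserted. The function name `split_fused_locus` is hypothetical, and this sketch does not guarantee that every Biopython version will accept the repaired line, since the name may still be longer than the parser expects.

```python
import re

def split_fused_locus(line):
    """Reinsert the missing space between contig name and length on a LOCUS line.

    Assumption: the contig name contains '_length_<N>_' (SPAdes-style), so the
    sequence length N can be read from the name. Lines that do not fit this
    assumption are returned unchanged.
    """
    parts = line.split()
    # A fused line has 'bp' as its third token; a normal one has the length there.
    if len(parts) < 3 or parts[0] != "LOCUS" or parts[2] != "bp":
        return line
    fused = parts[1]
    match = re.search(r"_length_(\d+)_", fused)
    if match is None:
        return line
    length = match.group(1)
    # Only split when the fused token really ends with the expected length.
    if not fused.endswith(length) or len(fused) == len(length):
        return line
    name = fused[: -len(length)]
    # Note: the original column spacing after the name is collapsed to single spaces.
    return "LOCUS       " + " ".join([name, length] + parts[2:])
```

Applied to every line of the .gbk file before parsing, this only touches LOCUS lines whose name and length have run together; everything else passes through untouched.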

CarmenSheppard commented 3 years ago

Found a way around this by clipping the node names in the assembly to fewer characters, which allows me to run without --compliant and open the output in Biopython. Also, I'm now not convinced that --proteins was the issue with the annotation anyway, as I still have the same problem with one of the proteins!
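For anyone wanting to do the same, here is a minimal sketch of the renaming step in plain Python. It is not necessarily how the names were clipped above: the function name `shorten_fasta_headers` and the `contig_<n>` naming scheme are assumptions chosen for illustration. It replaces each FASTA header with a short sequential ID and keeps a mapping back to the original names.

```python
def shorten_fasta_headers(lines, prefix="contig"):
    """Replace each FASTA header with a short sequential ID.

    Returns (renamed_lines, mapping), where mapping goes from the new short ID
    to the original header text. The 'contig_<n>' scheme is an illustrative
    assumption, not a prokka requirement.
    """
    renamed = []
    mapping = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            new_id = f"{prefix}_{len(mapping) + 1}"
            mapping[new_id] = line[1:].strip()  # remember the original name
            renamed.append(">" + new_id)
        else:
            renamed.append(line)  # sequence lines are left untouched
    return renamed, mapping
```

Writing the renamed lines to a new FASTA file and annotating that file keeps the LOCUS name short enough that it no longer collides with the length field, and the mapping lets you recover the original NODE names afterwards.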