sanger-pathogens / Roary

Rapid large-scale prokaryote pan genome analysis
http://sanger-pathogens.github.io/Roary

MSG: Got a sequence without letters. Could not guess alphabet #568

Open dhiru16 opened 2 years ago

dhiru16 commented 2 years ago
[Image: Picture1]
dhiru16 commented 2 years ago

I saw a previous issue with the same heading, but unlike in that issue, I am fairly certain that my sequences are quite similar to each other.

callaband commented 2 years ago

I am also having this issue.

My data are bacterial isolate WGS (strains evolved from a reference), assembled with the Unicycler hybrid method and annotated with Prokka; I used the output .gff files as input. One strain's files work just fine, with no error. The other strain gives me this error no matter what I do. I have examined the .gff files, and as far as I can tell they look essentially the same for both strains.

The Roary run for one strain ends very quickly (<30 min), and I do get all the output files, but there are only shell/cloud genes (no core or soft core). The other strain has been running for 8 hrs now.

Attached are examples in .txt format (.gff format is not accepted): example_notworking_strain.txt, example_working_strain.txt

The gene annotations don't look like they are populating correctly for the failing strain. It must be some kind of formatting error from Prokka, but I'm not sure what kind or how to fix it.
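
One way to narrow down a formatting problem like this is to compare the working and failing .gff files programmatically rather than by eye. The Python sketch below is only a rough diagnostic, not part of Roary or Prokka: it assumes the files are GFF3 with an embedded `##FASTA` section (as Prokka normally writes), and the function names `summarize_gff`, `empty_contigs`, and `missing_contigs` are made up for illustration. It counts CDS features, reports FASTA records that contain no sequence letters, and lists contig IDs that features refer to but that never appear in the FASTA section. Whether any of these is the actual cause of the error here is an assumption.

```python
from typing import Dict, List, Set, Tuple


def summarize_gff(text: str) -> Tuple[int, Dict[str, int], Set[str]]:
    """Summarize a GFF3 file that has an embedded ##FASTA section.

    Returns (number of CDS features,
             {FASTA record id: sequence length},
             set of contig ids referenced by feature lines).
    """
    cds_count = 0
    seq_lengths: Dict[str, int] = {}
    feature_seqids: Set[str] = set()
    in_fasta = False
    current = None  # id of the FASTA record being read

    for raw in text.splitlines():
        line = raw.rstrip("\r\n")
        if not line.strip():
            continue
        if line.startswith("##FASTA"):
            in_fasta = True
            continue
        if in_fasta:
            if line.startswith(">"):
                parts = line[1:].split()
                current = parts[0] if parts else ""
                seq_lengths[current] = 0
            elif current is not None:
                seq_lengths[current] += len(line.strip())
        elif not line.startswith("#"):
            cols = line.split("\t")
            # A well-formed GFF3 feature line has exactly 9 tab-separated columns.
            if len(cols) == 9:
                feature_seqids.add(cols[0])
                if cols[2] == "CDS":
                    cds_count += 1
    return cds_count, seq_lengths, feature_seqids


def empty_contigs(seq_lengths: Dict[str, int]) -> List[str]:
    """FASTA records with a header but no sequence letters."""
    return [name for name, length in seq_lengths.items() if length == 0]


def missing_contigs(feature_seqids: Set[str], seq_lengths: Dict[str, int]) -> List[str]:
    """Contig ids used by features but absent from the FASTA section."""
    return sorted(feature_seqids - set(seq_lengths))
```

Running `summarize_gff(open(path).read())` on each of the two files (with `path` pointing at your own .gff) and comparing the results should show whether the failing strain has zero CDS features, empty contigs, or contig names that do not match between the annotation and sequence sections.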