lmdu / krait

An ultrafast tool for genome-wide survey of microsatellites and primer design
http://krait.biosv.com
GNU Affero General Public License v3.0

Error with erroneous sequence data #11

Status: Closed (Rhynchites closed this issue 4 years ago)

Rhynchites commented 4 years ago

When searching for SSRs in a large FASTA file (more than 250 000 sequences), I get the following error:

[image: screenshot of the error message]

This is apparently triggered by erroneous sequence data. So far I have traced it back to two sequences that cause the error; see the attached files.

NODE_12838_length_70_cov_23.79.txt NODE_42629_length_65_cov_2.692.txt

mong222 commented 4 years ago

I also have the same error. Could you tell me how to remove the erroneous data?

Rhynchites commented 4 years ago

Hi, unfortunately I haven't yet found an efficient way to remove the problematic sequence data. It took me more than 2 hours to find the two sequences that trigger the error.

It would be great to have an option to ignore or skip such sequences instead of having Krait crash.

lmdu commented 4 years ago

Thank you for reporting this bug. I will fix it as soon as possible!

lmdu commented 4 years ago

A new version has been released that fixes this bug. The error occurred during base counting and was caused by sequences that contain no G or C bases.
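For anyone who cannot upgrade right away, such sequences can be located with a short script rather than by hand. The following is a minimal sketch, not part of Krait: the helper names `read_fasta`, `lacks_gc`, and `split_by_gc` are hypothetical, and it assumes a plain-text FASTA file in which every record starts with a `>` header line.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header without '>', sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def lacks_gc(sequence: str) -> bool:
    """Return True if the sequence contains neither G nor C (case-insensitive)."""
    upper = sequence.upper()
    return "G" not in upper and "C" not in upper


def split_by_gc(lines: Iterable[str]) -> Tuple[List[Record], List[Record]]:
    """Split records into (kept, flagged); flagged records have no G or C."""
    kept: List[Record] = []
    flagged: List[Record] = []
    for header, seq in read_fasta(lines):
        (flagged if lacks_gc(seq) else kept).append((header, seq))
    return kept, flagged
```

An open file object can be passed directly, for example `kept, flagged = split_by_gc(open("contigs.fasta"))`, where `contigs.fasta` is a placeholder name. The headers in `flagged` identify the sequences to inspect, and `kept` can be written back out as a filtered FASTA file.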

mong222 commented 4 years ago

I tried the new version, but it crashed without showing any error message when searching for SSRs in a large FASTA file obtained from NCBI (more than 300 000 sequences).

Rhynchites commented 4 years ago

The new version of Krait ran my file without problems this time. Thanks for fixing this!