linsalrob / PhiSpy

Prediction of prophages from bacterial genomes
MIT License

Choice of the longest repeats is non-deterministic #26

Closed linsalrob closed 4 years ago

linsalrob commented 4 years ago

PhiSpy chooses the longest repeats flanking a prophage region and includes them in the output files. However, the repeats are first stored as structs, then as members of a set, and are finally iterated over to pick the longest.

The order of the repeats may vary from run to run, so if two repeats of the same length are both longer than every other repeat, the choice between them is non-deterministic (i.e., one may be chosen on the first run, while a different one may be chosen on a subsequent run).

As of version 3.4.7, PhiSpy prints a warning to STDERR if more than one longest repeat is found. A solution for this problem is neither obvious nor trivial to implement.
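
For illustration only, here is a minimal Python sketch of one way the choice could be made reproducible: collect every repeat that ties for the greatest length, warn if there is more than one, and break the tie by sorting on coordinates. This is not PhiSpy's code. The `Repeat` record, its field names, the functions `longest_repeats` and `choose_repeat`, and the warning text are all hypothetical. It also only makes the result stable between runs; it does not decide which of the tied repeats is the biologically correct one.

```python
import sys
from typing import Iterable, List, NamedTuple, Optional


class Repeat(NamedTuple):
    # Hypothetical record for one repeat pair; not PhiSpy's data structure.
    first_start: int
    first_end: int
    second_start: int
    second_end: int

    @property
    def length(self) -> int:
        # Inclusive length of the first copy of the repeat.
        return self.first_end - self.first_start + 1


def longest_repeats(repeats: Iterable[Repeat]) -> List[Repeat]:
    """Return all repeats tied for the greatest length, sorted by coordinates."""
    repeats = list(repeats)
    if not repeats:
        return []
    best = max(r.length for r in repeats)
    # Sorting the tuples gives an order that does not depend on set iteration.
    return sorted(r for r in repeats if r.length == best)


def choose_repeat(repeats: Iterable[Repeat]) -> Optional[Repeat]:
    """Pick one longest repeat deterministically, warning on ties."""
    tied = longest_repeats(repeats)
    if not tied:
        return None
    if len(tied) > 1:
        # Illustrative wording only; not the message PhiSpy prints.
        print(f"Warning: {len(tied)} repeats share the longest length "
              f"({tied[0].length}); using the one with the lowest coordinates.",
              file=sys.stderr)
    return tied[0]


candidates = {
    Repeat(500, 519, 9000, 9019),
    Repeat(100, 119, 8000, 8019),
    Repeat(300, 309, 7000, 7009),
}
chosen = choose_repeat(candidates)
```

Because the tied repeats are sorted before one is taken, `chosen` is the same however the set happens to be iterated. Choosing the lowest coordinates is an arbitrary rule picked for this sketch; any fixed ordering would give the same reproducibility.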