wwood / finishm

genome improvement and finishing without further sequencing effort
MIT License

Find cyclic paths using SingleCoherentPathsBetweenNodesFinder #13

Closed timbalam closed 9 years ago

timbalam commented 9 years ago

This commit modifies the SingleCoherentPathsBetweenNodesFinder algorithm to halt once the number of paths reaches max_num_paths. I have used a two-stack implementation: each possible path is analysed for cycle repeats before being pushed to a stack. If a cycle is repeated more than the allowed max_cycles, the path is pushed to a secondary stack; otherwise it is pushed to the primary stack. The secondary stack is used only once the primary stack is empty. The stack size plus the number of solutions found is compared to max_num_paths, and the parachute is triggered if that total exceeds the limit.
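To make the two-stack scheme described above concrete, here is a minimal sketch. It is not the code from this commit. The graph is assumed to be a plain Hash of node to neighbours, `max_repeats` is a crude stand-in for the real cycle analysis, and `find_paths_two_stack` and `max_path_length` are hypothetical names. The sketch also assumes "stack size" means both stacks together, and it adds a path length guard so that the loop always terminates.

```ruby
# Hypothetical sketch only: none of these names come from finishm.

# Crude stand-in for "cycle repeats": how many times the most-visited
# node in the path has been revisited.
def max_repeats(path)
  counts = Hash.new(0)
  path.each { |node| counts[node] += 1 }
  counts.values.max - 1
end

def find_paths_two_stack(graph, start, goal, max_cycles, max_num_paths, max_path_length = 20)
  primary = [[start]] # paths within the cycle allowance
  secondary = []      # paths over the allowance, explored last
  solutions = []
  parachuted = false

  until primary.empty? && secondary.empty?
    # Parachute: pending paths plus found solutions exceed the limit.
    if primary.size + secondary.size + solutions.size > max_num_paths
      parachuted = true
      break
    end

    # The secondary stack is used only once the primary stack is empty.
    path = primary.empty? ? secondary.pop : primary.pop

    if path.last == goal
      solutions << path
      next
    end

    # Assumption of this sketch: drop very long paths so the loop ends.
    next if path.size >= max_path_length

    graph.fetch(path.last, []).each do |neighbour|
      candidate = path + [neighbour]
      # Analyse the candidate for cycle repeats before pushing it.
      if max_repeats(candidate) > max_cycles
        secondary.push(candidate)
      else
        primary.push(candidate)
      end
    end
  end

  [solutions, parachuted]
end

# A self-loop on :a makes the search space unbounded, so the parachute fires.
looping_graph = { a: [:a, :b], b: [] }
solutions, parachuted = find_paths_two_stack(looping_graph, :a, :b, 1, 3)
puts solutions.inspect # [[:a, :b], [:a, :a, :b]]
puts parachuted        # true
```

Because paths within the cycle allowance are always popped first, the solutions with fewer cycle repeats are found before the limit is hit, which is roughly what a priority queue would provide, without the ordering machinery.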

I used this implementation instead of a priority queue as it is simpler while meeting the following goals:

I also created a new class, CycleCounter, that caches the max cycle count for each path as it computes it. The path finder algorithm makes a lot of redundant calls, so I hope the caching helps. The cache can get pretty big, though, and I don't know whether that could be a problem.
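As a rough illustration of the caching idea, and of one possible way to keep the cache from growing without bound, here is a small sketch. It does not reproduce the CycleCounter class from this commit, whose interface is not shown here. The class name `CachingCycleCounter`, the `max_cache_size` option, and the node-revisit count used as the "cycle count" are all assumptions made for the example.

```ruby
# Hypothetical sketch only: not the CycleCounter class from this commit.
class CachingCycleCounter
  attr_reader :computations

  # max_cache_size is an assumed option; nil means the cache is unbounded.
  def initialize(max_cache_size = nil)
    @cache = {}
    @max_cache_size = max_cache_size
    @computations = 0
  end

  # Returns how many times the most-visited node in the path is revisited,
  # reusing a cached answer when the same path has been seen before.
  def max_cycles(path)
    return @cache[path] if @cache.key?(path)

    @computations += 1
    counts = Hash.new(0)
    path.each { |node| counts[node] += 1 }
    result = counts.empty? ? 0 : counts.values.max - 1

    # Ruby hashes keep insertion order, so shift evicts the oldest entry.
    @cache.shift if @max_cache_size && @cache.size >= @max_cache_size
    # Store a copy so later mutation of the caller's array cannot corrupt the key.
    @cache[path.dup] = result
  end

  def cache_size
    @cache.size
  end
end

counter = CachingCycleCounter.new
counter.max_cycles([:a, :b, :a, :a]) # computed
counter.max_cycles([:a, :b, :a, :a]) # served from the cache
puts counter.computations            # 1
```

Evicting the oldest entry is the simplest bound to implement. Whether it suits the path finder's access pattern would need measuring, since an evicted path that is queried again has to be recomputed.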