splatlab / squeakr

Squeakr: An Exact and Approximate k -mer Counting System
BSD 3-Clause "New" or "Revised" License

endless loop in /src/count.cc #42

Open t-kranz opened 4 years ago

t-kranz commented 4 years ago

/src/count.cc gets stuck in a livelock when num_files exceeds the ip_files queue node limit (or its element count).

In detail:

The queue node limit for ip_files is hard-coded at l.60.

The queue and num_files get populated in the for loop at l.286, but the return value of ip_files.push() (l.296) isn't checked (the same goes for the push at l.211, by the way).

So if the node limit is exceeded (or the push fails for any other reason), the file pointers are dropped silently, but num_files still gets incremented (l.297).

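This turns the outer while loop at l.208 into an endless loop: it loops on num_files, which is only decremented in the inner while loop, and that inner loop iterates over the elements actually present in ip_files. Files that were dropped on push never reach the queue, so num_files can never be counted all the way down.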
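To make the accounting mismatch concrete, here is a minimal sketch. It does not use the real count.cc code or boost: `BoundedQueue`, `fill`, and `drain` are hypothetical stand-ins, assuming only that a full queue makes push() return false and that the counter is decremented once per popped element.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>

// Hypothetical stand-in for a fixed-capacity queue: push() returns false
// when the queue is full instead of growing.
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    bool push(int value) {
        if (items_.size() >= capacity_) return false;  // element is dropped
        items_.push(value);
        return true;
    }

    bool pop(int& out) {
        if (items_.empty()) return false;
        out = items_.front();
        items_.pop();
        return true;
    }

private:
    std::size_t capacity_;
    std::queue<int> items_;
};

struct FillResult {
    std::size_t counted;  // what the file counter ends up as
    std::size_t queued;   // what actually made it into the queue
};

// Try to enqueue n_files ids. With check_push == false the counter is
// incremented unconditionally (the behavior described in this issue);
// with check_push == true only successful pushes are counted.
FillResult fill(BoundedQueue& q, std::size_t n_files, bool check_push) {
    FillResult r{0, 0};
    for (std::size_t i = 0; i < n_files; ++i) {
        const bool ok = q.push(static_cast<int>(i));
        if (ok) ++r.queued;
        if (ok || !check_push) ++r.counted;
    }
    return r;
}

// Pop everything, decrementing the counter once per popped element.
// A nonzero return value means a loop waiting for the counter to reach
// zero would never terminate.
std::size_t drain(BoundedQueue& q, std::size_t counted) {
    int id = 0;
    while (q.pop(id)) {
        assert(counted > 0);
        --counted;
    }
    return counted;
}
```

With a capacity of 4 and 6 files, the unchecked variant counts 6 but queues 4, so 2 remain after draining; the checked variant counts 4 and drains to 0 (though it still drops two files, which is why an error is preferable to silently skipping them).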

Maybe I missed it, but neither the silent dropping of files and/or file parts (which, from my understanding, can also happen at l.299, just without triggering the livelock) nor the existing node limit seems to be documented.

Increasing the node limit to exceed my particular num_files count prevented the endless loop. I guess setting the node limit dynamically to opts.filenames.size() would be the most elegant fix, but it seems boost::lockfree::queue's lock-free behavior depends on dynamic memory allocation being disabled.

Sincerely, tkranz

t-kranz commented 4 years ago

I've attached a zip containing a git patch (fix#42.patch). With those changes, execution stops and an error message is printed if the queue limit is exceeded.
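The patch itself is not reproduced in this thread. As a rough illustration of the fail-fast idea only, a sketch could look like the following; `FixedSlots` and `enqueue_or_fail` are hypothetical names, not code from the patch or from count.cc.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical fixed-capacity container: push() reports failure
// instead of growing past its limit.
struct FixedSlots {
    std::size_t capacity;
    std::vector<std::string> items;

    bool push(const std::string& name) {
        if (items.size() >= capacity) return false;
        items.push_back(name);
        return true;
    }
};

// Enqueue every filename, or throw as soon as one push fails, so that
// no input is dropped silently. Returns the number of files enqueued.
std::size_t enqueue_or_fail(FixedSlots& slots,
                            const std::vector<std::string>& filenames) {
    std::size_t enqueued = 0;
    for (const std::string& name : filenames) {
        if (!slots.push(name)) {
            throw std::runtime_error(
                "input queue full after " + std::to_string(enqueued) +
                " of " + std::to_string(filenames.size()) +
                " files: " + name);
        }
        ++enqueued;  // count only what is really in the queue
    }
    return enqueued;
}
```

A caller would catch the exception (or let it terminate the program) and print the message, which names the first file that did not fit.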

fix#42.zip