[REQ] Add compiler flag/define to make threadpool use a stack instead of queue

MrGroover007 commented 7 months ago

Describe the new feature

Thanks for providing this easy to use thread pool. :) I have some tasks that generate a bunch of results, which are then processed further using the thread pool (1:n relation in each generation). Before some termination criteria hits (e.g. number of generations), exponentially more tasks are created than processed. This results in a large peak memory usage, as the tasks-queue grows quite large. So far, I modified your code to make the tasks-queue use a std::stack instead of std::queue. For me, this results in less memory peak usage since there are not that many pending tasks. Of course, this also changes execution order to LIFO instead of FIFO. I haven't bothered using the optional priority queue since a std::stack is sufficient for my application. This should also have the same performance as a std::queue, compared to the std::priority_queue, which is potentially slower on inserts.

Hence my suggestion is to add an optional define that makes the tasks use a std::stack instead of std::queue. This would require three #ifdefs. Obviously, two would be the include and declaration of the tasks member. The third one is required because std::stack doesn't have a .front() member function - it uses .top instead. Basically, that's it, since STL containers are nicely designed to be interchangeable with each other.

Please find a diff attached. BS_thread_pool.hpp.txt

bshoshany commented 7 months ago

Hi @MrGroover007 and thanks for your suggestion! This seems easy enough to add.

However, I am wondering how much a LIFO queue makes sense for a thread pool. Generally, I would expect the tasks to run in the order of submission, FIFO, with the priority queue option allowing for more fine-tuned control if some tasks are more urgent than others. But a LIFO queue would mean that the tasks I submitted first may never be executed at all if I keep submitting more and more tasks. Can you please elaborate a bit more on what your code does and how a LIFO task queue is suitable for it?

MrGroover007 commented 7 months ago

Hi. My code is kind of a ray tracer. At every interaction, it launches a number of new rays. Some implementations initially launch a huge number of rays and randomly decide at every interaction to trace either the reflected or transmitted (refracted) ray. There is now growth and a FIFO works perfectly fine. My implementation does trace both resulting rays and additionally randomly adds rays in new directions with some form of angular-dependent energy distribution (could e.g. be modelling surface roughness). From physics I "know" that I don't have to trace every interaction and I can have rays die out after a certain number of interactions or if their amplitude falls below a defined limit. So there is at least one criterion to prevent the software from producing new tasks indefinitely. In the end, it results in a finite number of rays to be processed, like in the other mentioned way of implementing things.

Using a FIFO, there are potentially N new rays per interaction and ray of the previous generation (worst case if every ray hits an object). Those have to be stored somewhere, resulting in a large memory consumption growth rate. Processing them in a LIFO reduces the total amount of pending or queued rays at an instance of time (simulation runtime). One could imagine this to be like a tree structure (certainly not binary). To reach every leaf, you can traverse the tree layer by layer or just follow one branch until all leaves have been processed and then return to follow the next branch. In total and depending on parameter selection, my simulations process billions of rays and a LIFO certainly helps to keep memory consumption at a maintainable limit.

bshoshany commented 7 months ago

I see, thanks for the explanation. I am not sure how much the option for a LIFO task queue will be useful to other users, since this seems to be a very specific use case. However, will consider adding this feature in the next release. I want to play around with it first to see how it works compared to a FIFO task queue.

bshoshany / thread-pool

[REQ] Add compiler flag/define to make threadpool use a stack instead of queue #144