Closed alt-graph closed 7 months ago
I'm not sure why we need to self-roll this. Why not simply
https://www.boost.org/doc/libs/1_84_0/doc/html/boost_asio/reference/thread_pool.html
We use (of course) boost::asio for the serial communication, for example. The Boost thread_pool is also based on asio. No need to develop and debug if there is a battle-tested alternative so readily available. Boost also has fibers, if those suit better than threads.
> I'm not sure why we need to self-roll this. Why not simply
> https://www.boost.org/doc/libs/1_84_0/doc/html/boost_asio/reference/thread_pool.html
You may be right, but so far we avoid boost dependencies in the DOOCS core libs, and I am not too enthusiastic about opening that can of worms – again, because we had that discussion already some years ago.
This ThreadPool also has some features that are not in Boost, like scheduling tasks for a delayed start. Also, the task names are something that I have not seen in any other implementation, but they are invaluable for debugging if you have a task with a deadlock. The implementation here went from being in one server, to being copied around to a handful, to being a part of HLC Util, to the current rework for GUL. What I am trying to say is: it has already seen quite a bit of usage and would immediately be useful as-is.
> You may be right, but so far we avoid boost dependencies in the DOOCS core libs, and I am not too enthusiastic about opening that can of worms – again, because we had that discussion already some years ago.
I would also avoid adding additional dependencies to the core libraries. Nonetheless, there are a lot of thread pool libraries out there.
> This ThreadPool also has some features that are not in Boost, like scheduling tasks for a delayed start. Also, the task names are something that I have not seen in any other implementation, but they are invaluable for debugging if you have a task with a deadlock. The implementation here went from being in one server, to being copied around to a handful, to being a part of HLC Util, to the current rework for GUL. What I am trying to say is: it has already seen quite a bit of usage and would immediately be useful as-is.
But this raises the question: do you plan to use the ThreadPool in the DOOCS core libraries and outside of the core libraries, so that we need it in gul?
I'm not per se against adding it, but I also don't want to see gul growing into a full-fledged framework.
I would suggest inlining, i.e. moving the implementation from the .cpp into the header, for all the one- or two-line methods (like count_threads(), count_pending() etc.). This would allow for better optimization without the need for LTO to be enabled.
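The inlining suggestion could look like this minimal sketch (class and member names are illustrative, not the actual GUL ThreadPool interface):

```cpp
#include <cstddef>
#include <deque>
#include <thread>
#include <vector>

class ThreadPoolSketch {
public:
    // One-liners defined in the class body are implicitly inline, so the
    // compiler can inline calls from other translation units without LTO.
    std::size_t count_threads() const noexcept { return threads_.size(); }
    std::size_t count_pending() const noexcept { return pending_.size(); }

    // Stand-in for add_task(): just enqueue a dummy entry.
    void enqueue_dummy() { pending_.push_back(0); }

private:
    std::vector<std::thread> threads_;
    std::deque<int> pending_;
};
```

Member functions defined inside the class body are implicitly `inline`, so no separate `inline` keyword or LTO is needed for cross-TU inlining.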
> But this raises the question: do you plan to use the ThreadPool in the DOOCS core libraries and outside of the core libraries, so that we need it in gul?
We already use it outside, and I am playing with the idea of adding a small thread pool to the DOOCS-over-ZeroMQ asynchronous receiver. Since our main model for subscriptions is user callbacks, it might make sense to distribute the callbacks over a thread pool for high load scenarios. ZeroMQ is blazingly fast in receiving messages, but user callbacks may not always be able to keep up. I am not sure if we will really need this, but the new archiver might be a use case.
> I'm not per se against adding it, but I also don't want to see gul growing into a full-fledged framework.
I agree, but for me a thread pool / task queue has a similar level of utility as the SlidingBuffer. I frankly don't want to live without either anymore.
I believe this is ready for review now. The documentation looks OK, and the tests seem to pass.
Hold on, I guess I'll move the std::invoke-related things into their own PR and squash some of the other commits. Otherwise this is just too much...
OK, now this looks more reviewable than before. I recommend reviewing the entire diff rather than individual commits, though. If anyone prefers I can move the Meson changes (for building unit tests with static libs) into a separate PR.
Force-pushed to resolve a merge conflict. This is still in need of a review; please let me know if you prefer a squashed version.
I will review today.
Suggested Patch 2, maybe include something similar
(This does not fix the examples' code)
From bdbc964d0664229f97bf32cd1aa5b62d85d21a90 Mon Sep 17 00:00:00 2001
From: Fini Jastrow <ulf.fini.jastrow@desy.de>
Date: Tue, 27 Feb 2024 02:58:10 +0100
Subject: [PATCH 2/2] CI: Try to compile examples
Signed-off-by: Fini Jastrow <ulf.fini.jastrow@desy.de>
---
.github/workflows/testposix.yml | 3 +++
examples/meson.build | 9 +++++++++
meson.build | 1 +
3 files changed, 13 insertions(+)
create mode 100644 examples/meson.build
diff --git a/.github/workflows/testposix.yml b/.github/workflows/testposix.yml
index 085a57f..308ed10 100644
--- a/.github/workflows/testposix.yml
+++ b/.github/workflows/testposix.yml
@@ -64,6 +64,9 @@ jobs:
- name: Build and run ASAN tests
run: ./meson.py test -C build.asan
+ - name: Build examples
+ run: ninja -C build.asan examples/thread_pool_example examples/trim_example
+
- name: Build and run release tests
run: ./meson.py test -C build.release
diff --git a/examples/meson.build b/examples/meson.build
new file mode 100644
index 0000000..b4f4f9c
--- /dev/null
+++ b/examples/meson.build
@@ -0,0 +1,9 @@
+executable('trim_example', 'trim.cc',
+ dependencies : [ libgul_static_dep ],
+ build_by_default : false,
+)
+
+executable('thread_pool_example', 'thread_pool.cc',
+ dependencies : [ libgul_static_dep ],
+ build_by_default : false,
+)
diff --git a/meson.build b/meson.build
index 4d923e5..f299659 100644
--- a/meson.build
+++ b/meson.build
@@ -25,6 +25,7 @@ subdir('tests')
message('Install prefix: ' + get_option('prefix'))
subdir('debian')
+subdir('examples')
message('LIBGUL_API_VERSION @0@'.format(libgul_api_version))
message('GIT FULL API VERSION @0@'.format(git_full_api_version))
--
2.40.1
Patch 1: How to make one object out of ThreadPool and ThreadPoolEngine
(Rough, just to show the idea)
The engine_ member must come last of all data members; maybe that needs a comment in the code. It does in the diff but is easily forgotten with time.
This looks really nice!
One thing that could be improved (later) is that it spawns all threads immediately. Maybe switching to lazy thread creation is worthwhile? Like: you want max 20 threads, then see how many connections come in, and create the 3 threads that we need on demand. If there are never more than 3 concurrent things to do, why have all the threads idling?
The only reason I can think of is to reduce latency when the number of threads increases. But that might be a special case, which could be solved by a min_threads parameter in addition to num_threads, which would then become max_threads.
One could even implement a shrink_to_fit() that ends all unused threads at the user's will.
/spent 4h30m
:woman_facepalming: My test was broken. You just get the exception, like on a regular future...
But is that a good solution? Maybe there is no other way to check that, technically.
Edit: Yes, I believe the way it is is the only way possible, so :+1:
I don't have a way to find out whether a Task that does not return anything finished successfully or finished because it threw an exception. Is there a way?
```cpp
auto task2 = pool.add_task([]() { sleep(1); throw std::runtime_error{"Wanna sleep"}; });
```
task2.is_complete() becomes true, and ... how do I find out if it finished OK or with an error?
There is the get_id() member, which I would remove as it just needlessly exposes an implementation detail, but there is no way to access the future, and so there is no way to check a void task for successful termination?
Maybe it should be possible to use task2.get_result(), which is currently not possible. Or does is_complete() need to be tri-state?
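For reference, a plain std::future<void> already distinguishes the two outcomes: get() either returns normally or rethrows the task's exception. A get_result() for void tasks could expose the same behavior. A small sketch (run_and_classify is a made-up helper, not part of the ThreadPool API):

```cpp
#include <future>
#include <stdexcept>
#include <string>

// Made-up helper: classifies how a finished void task ended.
// future<void>::get() either returns normally or rethrows the exception
// that the task terminated with.
std::string run_and_classify(std::future<void> fut)
{
    try {
        fut.get();  // rethrows if the task ended with an exception
        return "ok";
    }
    catch (const std::exception& e) {
        return std::string{ "error: " } + e.what();
    }
}
```

So exposing the future (or a get_result() that rethrows) would make is_complete() sufficient as a two-state flag; the error information lives in the stored exception.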
> Suggested Patch 2, maybe include something similar
Great idea, applied. Oh – CI errors. Will investigate.
> Great idea, applied. Oh – CI errors. Will investigate.
That immediately found another bug in the trim() example. Fixed.
> Patch 1: How to make one object out of ThreadPool and ThreadPoolEngine
> (Rough, just to show the idea)
> The engine_ member must come last of all data members; maybe that needs a comment in the code. It does in the diff but is easily forgotten with time.
I am not entirely sure if I got the gist of your patch, but for me it looks like it boils down to this:
```cpp
engine_.reset(this, [](ThreadPool*){});
```
This stores the address of the existing ThreadPool into a shared_ptr, which is fine. But TaskHandle still has no safe way of accessing the ThreadPool object. It can call weak_ptr::expired() to find out if the pool is already gone, but it cannot keep it alive via lock() to actually do anything with it. Nothing keeps the original ThreadPool alive unless it was actually allocated inside a shared_ptr, hence the ThreadPoolEngine.
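A minimal, self-contained demonstration of this lifetime problem (Pool is a stand-in type; nothing here is the actual GUL code):

```cpp
#include <memory>
#include <utility>

struct Pool { int value = 42; };

// Simulates the lifetime behavior of the reset(this, no-op-deleter) trick:
// while the member shared_ptr exists, a weak_ptr derived from it is
// lockable, but destroying the member invalidates all handles immediately.
// The no-op deleter means the control block never truly owned the pool.
std::pair<bool, bool> lockable_before_and_after()
{
    Pool pool;                                            // the real object
    std::shared_ptr<Pool> member{ &pool, [](Pool*) {} };  // engine_.reset(this, no-op)
    std::weak_ptr<Pool> handle_view{ member };            // what a TaskHandle would hold

    bool before = (handle_view.lock() != nullptr);  // true: member still alive
    member.reset();                                 // pool shuts down, no-op deleter runs
    bool after = (handle_view.lock() != nullptr);   // false: nothing kept it alive
    return { before, after };
}
```

In other words, lock() only extends the lifetime of the control block's ownership, and with a no-op deleter that ownership is fictional: the pool's actual lifetime is still governed by its enclosing scope.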
> One thing that could be improved (later) is that it spawns all threads immediately. Maybe switching to lazy thread creation is worthwhile? Like: you want max 20 threads, then see how many connections come in, and create the 3 threads that we need on demand. If there are never more than 3 concurrent things to do, why have all the threads idling?
> The only reason I can think of is to reduce latency when the number of threads increases. But that might be a special case, which could be solved by a min_threads parameter in addition to num_threads, which would then become max_threads.
> One could even implement a shrink_to_fit() that ends all unused threads at the user's will.
I thought about something like that, too. Adding threads on-the-fly is very easy, but shrinking the pool requires joining the superfluous threads, which can only be done after they finished their work. It gets very messy very soon. I wouldn't discard the idea, but it requires some careful design.
Also, the idling threads do not do any harm except for consuming a few megabytes of memory (and cluttering the view in a debugger). The system scheduler should make sure that they do not consume any CPU cycles because they are blocked on the CV.
BAH, I seem to be unable to reply to comments in a thread if they are direct comments.
I overlooked that :grimacing: (i.e. that it has no safe way of accessing the pool). I still think it feels clunky.
What do you think about having a factory function instead? Like (after unifying the Engine back into the Pool)
```cpp
auto my_pool = gul14::get_thread_pool();
// with
auto get_thread_pool() -> std::shared_ptr<ThreadPool>;
```
At the moment ThreadPool is just a facade to a shared_ptr<ThreadPoolEngine> that forwards all calls.
If I have not overlooked something :grimacing: that would be a comparable solution with the same lock-ability, but directly, without an intermediary and without two 'confusing' types.
See also https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#cp61-use-async-to-spawn-concurrent-tasks Edit: The link is not really related, just that it also uses a factory.
Well, you seem to think about having 10 threads or something. But the actual limit max_threads is 10'000. I believe that many threads are not a good idea, and one should use fibers or similar, less heavyweight mechanisms for such use cases. I would not play that down in the arguments ('idling threads do no harm').
> Adding threads on-the-fly is very easy
Then, maybe, we should add that right away? Lazy thread creation does not need to be tied to thread removal; those are two distinct features (in the same direction: trying to reduce unneeded resource use). This does not have to be within this PR. But why not add it, if it is easy?
> but shrinking the pool requires joining the superfluous threads, which can only be done after they finished their work
That is why I proposed a shrink_to_fit() that just joins all the threads that are idle at the moment. That should also be rather easy to implement: set some 'you need to die' flag(s) and wake the thread(s) up, then join them away.
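A rough sketch of that 'die flag' mechanism, under the assumption of a pool whose workers only idle on a condition variable (all names hypothetical, no task queue, not the GUL implementation):

```cpp
#include <condition_variable>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Sketch of shrink_to_fit(): raise a "please exit" counter, wake the idle
// workers, and join exactly the threads that volunteered to end.
class ShrinkablePool {
public:
    explicit ShrinkablePool(std::size_t num_threads)
    {
        std::lock_guard<std::mutex> lock{ mutex_ };
        for (std::size_t i = 0; i != num_threads; ++i) {
            std::thread t{ [this] { worker_loop(); } };
            auto id = t.get_id();
            workers_.emplace(id, std::move(t));
        }
    }

    ~ShrinkablePool() { shrink_by(count_threads()); }

    std::size_t count_threads() const
    {
        std::lock_guard<std::mutex> lock{ mutex_ };
        return workers_.size();
    }

    // End n currently idle threads.
    void shrink_by(std::size_t n)
    {
        std::unique_lock<std::mutex> lock{ mutex_ };
        const std::size_t target = finished_.size() + n;
        quit_requests_ += n;
        cv_.notify_all();  // wake the idle workers so they can see the flag
        cv_.wait(lock, [&] { return finished_.size() >= target; });
        for (auto id : finished_) {  // join only the threads that exited
            auto it = workers_.find(id);
            it->second.join();
            workers_.erase(it);
        }
        finished_.clear();
    }

private:
    void worker_loop()
    {
        std::unique_lock<std::mutex> lock{ mutex_ };
        // A real pool would also pop and run tasks here; this sketch only
        // waits until it is asked to die.
        cv_.wait(lock, [this] { return quit_requests_ > 0; });
        --quit_requests_;
        finished_.push_back(std::this_thread::get_id());
        cv_.notify_all();  // tell shrink_by() that we are done
    }

    mutable std::mutex mutex_;
    std::condition_variable cv_;
    std::size_t quit_requests_ = 0;
    std::vector<std::thread::id> finished_;
    std::map<std::thread::id, std::thread> workers_;
};
```

The messy part a real implementation has to add is deciding what "idle" means while tasks are in flight; here every worker is idle by construction, which is why the sketch stays short.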
Edit: To be explicit: I believe we should continue without the Lazy stuff and get the PR merged, and possibly continue with the Lazy side of life in a new PR afterwards.
> What do you think about having a factory function instead? Like (after unifying the Engine back into the Pool)
>
> ```cpp
> auto my_pool = gul14::get_thread_pool();
> // with
> auto get_thread_pool() -> std::shared_ptr<ThreadPool>;
> ```
>
> At the moment ThreadPool is just a facade to a shared_ptr<ThreadPoolEngine> that forwards all calls. If I have not overlooked something 😬 that would be a comparable solution with the same lock-ability, but directly, without an intermediary and without two 'confusing' types.
I also find the two types very confusing, and they duplicate too much code, so :+1: for the factory function idea. I'll try to implement that. Warning: I will probably make the ThreadPool constructor private again. :-D
> I believe we should continue without the Lazy stuff and get the PR merged, and possibly continue with the Lazy side of life in a new PR afterwards.
I have opened an issue (#80) for that.
> What do you think about having a factory function instead?
It is now implemented like that. ThreadPoolEngine is gone; we need to call make_thread_pool() instead.
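For illustration, the adopted factory pattern could be sketched like this (MiniPool and make_mini_pool are made-up names, not the real gul14 API):

```cpp
#include <cstddef>
#include <memory>

// A private constructor plus a factory returning shared_ptr guarantees that
// every pool lives inside a shared_ptr, so handles can observe its lifetime
// via weak_ptr without a separate "engine" type.
class MiniPool {
public:
    static std::shared_ptr<MiniPool> make(std::size_t num_threads)
    {
        // make_shared cannot reach the private constructor, so use new here.
        return std::shared_ptr<MiniPool>{ new MiniPool{ num_threads } };
    }

    std::size_t capacity() const noexcept { return num_threads_; }

private:
    explicit MiniPool(std::size_t num_threads) : num_threads_{ num_threads } {}

    std::size_t num_threads_;
};

// Free factory function in the spirit of the make_thread_pool() mentioned above.
inline std::shared_ptr<MiniPool> make_mini_pool(std::size_t n)
{
    return MiniPool::make(n);
}
```

With the constructor private, a stack-allocated pool is impossible, so the weak_ptr held by a task handle is always backed by a real owning control block.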
Thanks for the review, @Finii! I think that both architecture and usability are much better than in the original version.
> What I do not understand is why we have
> * `gul14::TaskState` but
> * `gul14::ThreadPool::TaskHandle`
The TaskHandle needs to access a private member of the ThreadPool, and keeping it separate leads to a mess of circular dependencies between the two classes. All of that can be solved, but it is very messy. The nested class was by far the best solution I found.
TaskState could go anywhere, but since it is a scoped enum, people have to write the full type name a lot:
```cpp
if (handle.get_state() == ThreadPool::TaskState::complete) ...;
if (handle.get_state() == TaskState::complete) ...;
```
I just prefer the second one.
This is an implementation of a ThreadPool class. We have some similar code in use in our HLC libs across multiple servers. Some notes:
Closes #28