maikel / senders-io

An adaption of Senders/Receivers for async networking and I/O
Apache License 2.0

Improve time measurement and initialization of read_batched example #57

Closed mfbalin closed 1 year ago

mfbalin commented 1 year ago
  1. Move thread_state construction into each worker thread and use a barrier, so that only the read time is measured and initialization runs in parallel.

  2. Eliminate contention in the counters struct.

These two modifications together seem to make the multithreaded time measurement much more stable and performant.

mfbalin commented 1 year ago

I think the current example has another drawback. Each thread is copying the same number of bytes but if the OS schedules them unfairly somehow, there will be a load imbalance, which might affect timing measurements and make it seem like more threads are worse for performance.
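One simple way to reduce such an imbalance, short of real work stealing, is to let threads claim chunks of blocks from a shared atomic cursor instead of assigning each thread a fixed share up front. The sketch below is an illustration under that assumption, not something this PR implements. `process_dynamically` and its parameters are hypothetical names.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Returns how many blocks each thread ended up processing.
std::vector<std::size_t> process_dynamically(std::size_t n_threads,
                                             std::size_t n_blocks,
                                             std::size_t chunk) {
  assert(chunk > 0);
  std::atomic<std::size_t> next{0};  // shared cursor into the block range
  std::vector<std::size_t> done(n_threads, 0);
  std::vector<std::thread> threads;
  for (std::size_t i = 0; i < n_threads; ++i) {
    threads.emplace_back([&, i] {
      std::size_t local = 0;
      for (;;) {
        // Claim the next chunk; a thread that gets scheduled less often
        // simply claims fewer chunks.
        const std::size_t begin =
            next.fetch_add(chunk, std::memory_order_relaxed);
        if (begin >= n_blocks) break;
        const std::size_t end = std::min(begin + chunk, n_blocks);
        // Placeholder: the blocks [begin, end) would be read here.
        local += end - begin;
      }
      done[i] = local;  // written once per thread
    });
  }
  for (auto& t : threads) t.join();
  return done;
}
```

The per-thread counts can differ from run to run, but they always sum to `n_blocks`. A larger `chunk` lowers traffic on the shared cursor at the cost of coarser balancing.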

maikel commented 1 year ago

> I think the current example has another drawback. Each thread is copying the same number of bytes but if the OS schedules them unfairly somehow, there will be a load imbalance, which might affect timing measurements and make it seem like more threads are worse for performance.

This enters the realm of real multithreaded schedulers, where you have some form of work stealing, as in static_thread_pool. Would you like to contact me via Discord? My user name is maikel.nadolski, and there is an executor channel on the #include server for discussing P2300-related topics and asking for help with the stdexec framework.

maikel commented 1 year ago

I've added the memory pool to constrain the number of submitted read operations to each context. That dramatically improves the initial performance since there is no O(N) allocation (and iteration) anymore.
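The idea can be sketched roughly as follows. This is a simplified illustration, not the actual implementation in this repository. `read_op`, `op_pool`, and `run_reads` are hypothetical names, and completion is simulated synchronously. The pool allocates a fixed number of operation slots once, so memory and setup cost depend on the pool capacity rather than on the total number of reads N.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical operation slot; next_free links unused slots together.
struct read_op {
  std::size_t block_index = 0;
  read_op* next_free = nullptr;
};

// Fixed-capacity pool, intended for use by a single context (not thread-safe).
class op_pool {
 public:
  explicit op_pool(std::size_t capacity) : slots_(capacity) {
    assert(capacity > 0);
    for (auto& s : slots_) { s.next_free = free_; free_ = &s; }
  }
  op_pool(const op_pool&) = delete;
  op_pool& operator=(const op_pool&) = delete;

  // Returns nullptr when every slot is in use.
  read_op* try_acquire() noexcept {
    if (!free_) return nullptr;
    read_op* op = free_;
    free_ = op->next_free;
    op->next_free = nullptr;
    ++in_flight_;
    return op;
  }
  void release(read_op* op) noexcept {
    op->next_free = free_;
    free_ = op;
    --in_flight_;
  }
  std::size_t in_flight() const noexcept { return in_flight_; }

 private:
  std::vector<read_op> slots_;
  read_op* free_ = nullptr;
  std::size_t in_flight_ = 0;
};

// Issues n_blocks reads, never exceeding the pool capacity in flight.
// Returns the maximum number of operations observed in flight.
std::size_t run_reads(op_pool& pool, std::size_t n_blocks) {
  std::vector<read_op*> pending;
  std::size_t next = 0, completed = 0, max_in_flight = 0;
  while (completed < n_blocks) {
    // Submit until the pool is exhausted or no blocks remain.
    while (next < n_blocks) {
      read_op* op = pool.try_acquire();
      if (!op) break;
      op->block_index = next++;
      pending.push_back(op);
    }
    max_in_flight = std::max(max_in_flight, pool.in_flight());
    // Placeholder for a completion event: retire one pending operation.
    pool.release(pending.back());
    pending.pop_back();
    ++completed;
  }
  return max_in_flight;
}
```

With a pool of 4 slots and 100 blocks, at most 4 reads are outstanding at any time, and a slot is reused as soon as its read completes.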