aestream / aestream

Efficient streaming of sparse event data supporting files, network I/O, GPU peripherals (via Torch/Jax/Numpy) and neuromorphic protocols
https://aestream.github.io/aestream
MIT License

Implement async threadpool #47

Open Jegp opened 1 year ago

Jegp commented 1 year ago

For now, the coroutine primitive runs concurrently but not in parallel. It would be ideal to implement a threadpool that schedules event processing in parallel. That would allow us to run much more complex workflows efficiently, and probably get better performance.

See e.g. https://github.com/lewissbaker/cppcoro#async_generatort
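To make the idea concrete, here is a rough sketch of what scheduling event processing on a pool could look like. It uses plain `std::thread` and `std::future` rather than coroutines, so it is not the cppcoro approach linked above. `Event`, `ThreadPool`, and `count_on_events` are hypothetical names for illustration and not part of AEStream.

```cpp
#include <algorithm>
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical event type; AEStream's real event struct may differ.
struct Event {
  long timestamp;
  int x;
  int y;
  bool polarity;
};

// Minimal fixed-size thread pool (no dynamic resizing, no coroutines).
class ThreadPool {
public:
  explicit ThreadPool(std::size_t n_threads) {
    assert(n_threads > 0);
    for (std::size_t i = 0; i < n_threads; ++i) {
      workers_.emplace_back([this] { run(); });
    }
  }

  ~ThreadPool() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stopping_ = true;
    }
    cv_.notify_all();
    for (auto &w : workers_) w.join();  // pending tasks are drained first
  }

  ThreadPool(const ThreadPool &) = delete;
  ThreadPool &operator=(const ThreadPool &) = delete;

  // Queue a callable and return a future for its result.
  template <typename F>
  auto submit(F &&f) -> std::future<std::invoke_result_t<std::decay_t<F>>> {
    using R = std::invoke_result_t<std::decay_t<F>>;
    auto task = std::make_shared<std::packaged_task<R()>>(std::forward<F>(f));
    std::future<R> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mutex_);
      tasks_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result;
  }

private:
  void run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return stopping_ || !tasks_.empty(); });
        if (tasks_.empty()) return;  // only reached when stopping
        job = std::move(tasks_.front());
        tasks_.pop();
      }
      job();  // run outside the lock so other workers can dequeue
    }
  }

  std::vector<std::thread> workers_;
  std::queue<std::function<void()>> tasks_;
  std::mutex mutex_;
  std::condition_variable cv_;
  bool stopping_ = false;
};

// Count ON events by splitting the buffer into chunks, one task per chunk.
std::size_t count_on_events(const std::vector<Event> &events, ThreadPool &pool,
                            std::size_t n_chunks) {
  assert(n_chunks > 0);
  const std::size_t chunk = (events.size() + n_chunks - 1) / n_chunks;
  std::vector<std::future<std::size_t>> parts;
  for (std::size_t begin = 0; begin < events.size(); begin += chunk) {
    const std::size_t end = std::min(begin + chunk, events.size());
    parts.push_back(pool.submit([&events, begin, end] {
      std::size_t n = 0;
      for (std::size_t i = begin; i < end; ++i) n += events[i].polarity ? 1 : 0;
      return n;
    }));
  }
  std::size_t total = 0;
  for (auto &p : parts) total += p.get();
  return total;
}
```

With this, `ThreadPool pool(4);` followed by `count_on_events(events, pool, 8)` spreads the counting over four worker threads. A coroutine-based version would presumably replace the blocking `get()` calls with `co_await`, but that part is left open here.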

cantordust commented 1 year ago

Very interesting! A million years ago I put together a simple threadpool that had some nice features (such as dynamic resizing). I have been thinking about refactoring it to take advantage of coroutines. One feature that might be beneficial here is that it is a header-only library (and I would like to keep it that way if possible). I'd be happy to work on it if you think it might be useful!

Jegp commented 1 year ago

That would be tremendously helpful. First off, I'd be curious to see what the actual effect is. How much performance could this give us? If the answer is >1x, then we have a perfect case for a super clean coroutine-based async interface. That would be excellent for just piping events (like we do in AEStream), but perhaps even for more intricate processing, because we avoid stuff like NumPy arrays etc.: we'd simply process events one by one ❤️

Can I help somehow here? Would it make sense to schedule a call to chat about the architecture?

I don't have strong opinions regarding header-only, so fine by me to go that direction.

cantordust commented 1 year ago

All very good points. I am not very well versed in the art of measuring performance in a multi-threaded environment, but I am keen to learn. At any rate, I think that a generic coroutine-based threadpool would be immensely useful anyway for all sorts of scenarios, so maybe we could keep them as separate projects and include the threadpool as a git submodule (should be trivial, especially if it's a header-only library).
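As a starting point for the measurement question, a first comparison could be as simple as timing the same workload sequentially and split across threads. The sketch below is only that: `process` is a made-up placeholder workload, the function names are hypothetical, and it uses `std::async` instead of a real threadpool. A serious benchmark would also need warm-up runs and repeated measurements.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <future>
#include <vector>

// Placeholder per-event workload (assumption); a real benchmark would
// call AEStream's actual processing step instead.
inline std::uint64_t process(std::uint64_t v) {
  for (int i = 0; i < 64; ++i) {
    v = v * 6364136223846793005ULL + 1442695040888963407ULL;
  }
  return v;
}

// Baseline: process every event on the calling thread.
std::uint64_t run_sequential(const std::vector<std::uint64_t> &events) {
  std::uint64_t acc = 0;
  for (auto e : events) acc ^= process(e);
  return acc;
}

// Split the events into contiguous chunks and process each on its own thread.
std::uint64_t run_parallel(const std::vector<std::uint64_t> &events,
                           std::size_t n_threads) {
  assert(n_threads > 0);
  const std::size_t chunk = (events.size() + n_threads - 1) / n_threads;
  std::vector<std::future<std::uint64_t>> parts;
  for (std::size_t begin = 0; begin < events.size(); begin += chunk) {
    const std::size_t end = std::min(begin + chunk, events.size());
    parts.push_back(std::async(std::launch::async, [&events, begin, end] {
      std::uint64_t acc = 0;
      for (std::size_t i = begin; i < end; ++i) acc ^= process(events[i]);
      return acc;
    }));
  }
  std::uint64_t acc = 0;
  for (auto &p : parts) acc ^= p.get();  // XOR is order-independent
  return acc;
}

// Wall-clock time of one call, in milliseconds.
template <typename F>
double time_ms(F &&f) {
  const auto start = std::chrono::steady_clock::now();
  f();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Speedup = sequential time / parallel time; a value above 1 means the
// parallel run was faster on this machine for this workload.
double measure_speedup(const std::vector<std::uint64_t> &events,
                       std::size_t n_threads) {
  volatile std::uint64_t sink = 0;  // keeps the results from being optimized away
  const double t_seq = time_ms([&] { sink = run_sequential(events); });
  const double t_par = time_ms([&] { sink = run_parallel(events, n_threads); });
  return t_seq / t_par;
}
```

Calling `measure_speedup(events, 4)` on a large buffer gives a single, noisy number. Whether it lands above 1x depends on how heavy the per-event work is compared with the cost of starting and synchronizing threads, which is exactly what we would want to find out.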

Let's schedule a dedicated call about this!

Jegp commented 1 year ago

> I think that a generic coroutine-based threadpool would be immensely useful anyway for all sorts of scenarios, so maybe we could keep them as separate projects and include the threadpool as a git submodule (should be trivial, especially if it's a header-only library).

I think that's a great idea. The current implementation also uses a type parameter, so it should generalize well. Unfortunately, I have to be honest and say that I won't have much time to invest in developing a generic library. But I would love to provide input, feedback, and integrate it into AEStream.

cantordust commented 1 year ago

No worries, that would be a good incentive for me to dust off the old code. It would probably still make sense to chat briefly about what features you expect from the threadpool so I can make sure to implement them.