GorNishanov / coroutines-ts


Challenge of shared data between different coroutines and proposal to add coroutine specific shared data infrastructure #20

Open JCYang opened 7 years ago

JCYang commented 7 years ago

Problem: When a user decides to truly multiplex a single thread with coroutines and asio, different coroutines sometimes have to cooperate, and so they need data and objects to cooperate on.

Consider this use case: a regular coroutine that downloads a file from a remote server, say void DownloadFileFromServer(const string file_url, const string saved_path); TCP does not automatically detect a half-open socket for the user, so if we want to detect that kind of problem we have to send a liveness probe ourselves. The natural solution is to call another coroutine, void CheckSocketHealth(...), which internally uses a timer to run the liveness check at regular intervals. I omit the argument types of CheckSocketHealth() for the sake of the following discussion.

Now comes the object lifetime problem. Who (DownloadFileFromServer or CheckSocketHealth) should own the socket under test and the command that directs CheckSocketHealth's execution (when to start checking and when to stop)? We don't have a very elegant solution here.

First, a user who is careful and willing to design the solution correctly and efficiently can use unique_ptr: move the shared data into CheckSocketHealth() at the call site so that CheckSocketHealth() owns it, keep only a reference in DownloadFileFromServer, and code the logic carefully to avoid a dangling reference. This incurs only one heap allocation per object; that's not perfect, but fine.

Second, a user who is not willing to be that careful can use shared_ptr, which works as expected. Performance-wise, though, it costs too much. As I said, the user is multiplexing a single thread. Reference counting is still necessary for this sort of casual usage, but we don't need the even more expensive thread synchronization that comes with it!
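To make the first (unique_ptr) option concrete, here is a minimal sketch of the ownership pattern only. It uses plain classes rather than real coroutines or asio, and the names HealthCommand, HealthChecker and download_then_stop_check are hypothetical stand-ins invented for illustration, not code from the linked example.

```cpp
#include <memory>
#include <utility>

// Hypothetical shared data: the command that directs the health check.
struct HealthCommand {
    bool keep_checking = true;
};

// Stand-in for the state owned by CheckSocketHealth (not a real coroutine).
class HealthChecker {
    std::unique_ptr<HealthCommand> cmd_;
public:
    explicit HealthChecker(std::unique_ptr<HealthCommand> cmd)
        : cmd_(std::move(cmd)) {}
    bool should_continue() const { return cmd_->keep_checking; }
};

// Stand-in for the call site inside DownloadFileFromServer.
// Returns whether the checker still wants to run after being told to stop.
inline bool download_then_stop_check() {
    auto cmd = std::make_unique<HealthCommand>();  // the one heap allocation
    HealthCommand& cmd_ref = *cmd;                 // non-owning reference kept by the caller
    HealthChecker checker(std::move(cmd));         // ownership moves to the checker

    // ... the download work would happen here ...

    cmd_ref.keep_checking = false;  // valid only while `checker` is still alive
    return checker.should_continue();
}
```

The fragile part is exactly the one described above: cmd_ref dangles as soon as the checker is destroyed, so the caller has to guarantee that the checker outlives every use of the reference.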

So I propose that we consider coroutine-specific infrastructure for shared data, to enable more efficient single-thread multiplexing with coroutines. The ideal solution should make reference counting optional (maybe we need two different structures), should allocate all shared data in the coroutines' state rather than requiring another heap allocation, and should make synchronization an optional feature (for single-thread multiplexing it is pure overhead).
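As a rough illustration of the "reference counting without synchronization" part of this idea, a single-threaded shared handle could look like the sketch below. The name local_rc and its interface are assumptions made up for this example, not an existing or proposed API. The sketch still performs one heap allocation per object, so it does not address placing the data in the coroutine state.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical single-threaded shared handle. The count is a plain integer,
// so copying costs no atomic operations. NOT safe to share across threads.
template <class T>
class local_rc {
    struct block {
        T value;
        std::size_t refs;
    };
    block* b_ = nullptr;

    void release() noexcept {
        if (b_ && --b_->refs == 0) delete b_;  // last owner frees the block
        b_ = nullptr;
    }

public:
    local_rc() = default;

    // Allocates the value and its counter together in one block.
    template <class... Args>
    static local_rc make(Args&&... args) {
        local_rc r;
        r.b_ = new block{T(std::forward<Args>(args)...), 1};
        return r;
    }

    local_rc(const local_rc& o) noexcept : b_(o.b_) { if (b_) ++b_->refs; }
    local_rc(local_rc&& o) noexcept : b_(std::exchange(o.b_, nullptr)) {}
    local_rc& operator=(local_rc o) noexcept { std::swap(b_, o.b_); return *this; }
    ~local_rc() { release(); }

    T& operator*() const noexcept { return b_->value; }
    T* operator->() const noexcept { return &b_->value; }
    std::size_t use_count() const noexcept { return b_ ? b_->refs : 0; }
    explicit operator bool() const noexcept { return b_ != nullptr; }
};
```

In the download scenario, DownloadFileFromServer and CheckSocketHealth would each hold a copy of such a handle; whichever finishes last releases the shared object, and no atomic operation is ever executed because both run on the same thread.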

Do you think this is a valid proposal? Or can we do as well, or better, within the current standard?

GorNishanov commented 6 years ago

Coroutines are great if your state machine can be expressed as imperative control flow (essentially, at every state of the state machine you can either go forward or cancel the state machine).

If you can sketch out what your state machine would look like with just asio, without coroutines, we can then examine how best to express it as a coroutine.

If you want, add a link to a working example using boost::asio (without coroutines) of what you want to achieve. I can help translate it to coroutine style.

JCYang commented 6 years ago

I agree that without detailed code the problem sometimes cannot be described correctly. So here is the skeleton; unimportant details are omitted, and shared_ptr is used throughout to simplify the code logic:

https://wandbox.org/permlink/CQdPoiEjkuzwHtSC

Some missing but important code: I need only one I/O thread to call g_io_service.run(), and I post the download task to this I/O thread. So all the code we're discussing runs in the same thread.

JCYang commented 6 years ago

I have no idea whether this is a corner case; maybe we rarely need this sort of cooperation between different coroutines. But it does challenge the C++ principle of paying only for what you use.

JCYang commented 6 years ago

I can post the skeleton of my own coroutine-based implementation if you still think the problem is not clear enough.