esp-rs / esp-idf-svc

Type-Safe Rust Wrappers for various ESP-IDF services (WiFi, Network, Httpd, Logging, etc.)
https://docs.esp-rs.org/esp-idf-svc/
Apache License 2.0

Allow TLS negotiation on a multi-threaded executor. #432

Closed jothan closed 1 month ago

jothan commented 1 month ago

I had to take the RefCell out, because wait() would await while holding a &RefCell, which requires the RefCell to be Sync for the futures to be Send.

It was replaced by a Mutex, since the socket read/write splitting design is not clear yet.
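
For illustration, a minimal sketch of the compile-time problem, with a hypothetical Session type standing in for the real TLS state (not the actual esp-idf-svc API):

```rust
use core::cell::RefCell;

// Hypothetical stand-in for the TLS session state; not the real
// esp-idf-svc type.
struct Session {
    state: RefCell<u32>,
}

impl Session {
    // The future returned by wait() captures `&self` - and therefore a
    // `&RefCell<u32>` - across the await. `&RefCell<T>` is `Send` only
    // if `RefCell<T>` is `Sync`, which it never is, so the future ends
    // up `!Send` and a work-stealing executor rejects it.
    async fn wait(&self) {
        readiness().await;
        *self.state.borrow_mut() += 1;
    }
}

async fn readiness() {}

fn assert_send<T: Send>(_t: T) {}

fn main() {
    let s = Session {
        state: RefCell::new(0),
    };

    // Uncommenting the next line fails to compile, because the future
    // returned by wait() is not Send:
    // assert_send(s.wait());
    drop(s);
}
```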

ivmarkov commented 1 month ago

I had to take the RefCell out, because wait() would await while holding a &RefCell, which requires the RefCell to be Sync for the futures to be Send.

It was replaced by a Mutex, since the socket read/write splitting design is not clear yet.

First: a diversion regarding borrowing/locking across await points:

I don't think we are actually keeping a borrow of the RefCell across await points. Clippy would've complained if that were the case, and really, where do you see that? Also of note: putting in a blocking Mutex (by the way, if you need a blocking mutex and want to stay no_std compatible, you can use the one in esp_idf_svc::private::mutex) does not solve this problem (the need to keep a lock during awaits) but of course makes it worse, as now you might get deadlocks instead of panics in data-race cases.

Even an async Mutex won't solve the "issue" in our particular case, precisely because of socket splitting: putting an async mutex around EspTls would prevent the reader and writer from working concurrently, as the first one to enter the async mutex (say, the reader) would prevent the other one (say, the writer) from polling for "I can write" until the reader exits the async mutex.

Really, the only proper way to do locks in the presence of socket splitting is to use RefCell borrows (single-threaded) or blocking mutex locks (multi-threaded) and ensure you don't keep the lock/borrow across await points.

Regardless, the above issue is, I hope, moot, simply because I don't believe we are holding a lock (a borrow, actually) across await points in the first place! We only do it for a brief moment when reading/writing data from/to the socket, once data is already reported to be available, as it should be.
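
A sketch of that borrowing discipline, using hypothetical session types and a hypothetical readiness future rather than the real EspTls internals: the lock is taken only for the short non-blocking read and is released before the next await.

```rust
use std::sync::Mutex;

// Hypothetical stand-ins; not the real EspTls / esp-idf-svc API.
struct RawSession;

impl RawSession {
    // Non-blocking read on the underlying TLS session; returns 0 when
    // no data happens to be available.
    fn read(&mut self, _buf: &mut [u8]) -> Result<usize, ()> {
        Ok(0)
    }
}

struct AsyncSession {
    raw: Mutex<RawSession>,
}

impl AsyncSession {
    async fn read(&self, buf: &mut [u8]) -> Result<usize, ()> {
        loop {
            // Await readiness WITHOUT holding the lock...
            wait_readable().await;

            // ...then take the lock only for the short, non-blocking
            // read; the guard is a temporary and is dropped at the end
            // of this statement, before the next await.
            let n = self.raw.lock().unwrap().read(buf)?;

            if n > 0 {
                return Ok(n);
            }
        }
    }
}

// Hypothetical readiness future (in reality driven by polling the
// underlying socket).
async fn wait_readable() {}

fn main() {
    // Compile-time demo only; driving the future needs an executor.
    let _session = AsyncSession {
        raw: Mutex::new(RawSession),
    };
}
```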

Second: why I think UnsafeCell is not a solution and might blow up (please read carefully, I might be wrong! :) ):

A multi-threaded executor wants futures to be Send, so that it can do the first poll of the future on one thread, then the second poll of the future on potentially a different thread and so on.

Looking at the signature of - say - async fn read(&self, buf: &mut [u8]) -> Result<..> - this is just sugar for fn read<'t>(&'t self, buf: &mut [u8]) -> impl Future<Output = Result<..>> + 't (where 't is very important).
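
Spelled out as code, the desugaring looks roughly like this (EspTlsLike and the concrete Result types here are placeholders, not the real esp-idf-svc signatures):

```rust
use core::future::Future;

// Hypothetical stand-in for EspTls.
struct EspTlsLike;

impl EspTlsLike {
    // What `async fn read(&self, buf: &mut [u8]) -> Result<..>`
    // expands to: the returned future borrows `self` (and `buf`),
    // hence the `+ 't` bound tying the future to those borrows.
    fn read<'t>(
        &'t self,
        buf: &'t mut [u8],
    ) -> impl Future<Output = Result<usize, ()>> + 't {
        async move {
            // The future keeps hold of `self` and `buf` until it
            // completes.
            let _ = self;
            let _ = &buf;
            Ok(0)
        }
    }
}
```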

So if the future needs to be Send, then &'t self needs to be Send too, as it is contained inside the future (that's why the impl Future<...> has + 't). But then if &'t self needs to be Send, then self itself needs to be Sync! Hence you actually DO need a blocking mutex around EspTls.

A concrete example of why EspTls needs to be mutexed (and thus Sync) would be a "read" future running in one thread, and a "write" future running in another. In that case, they both need concurrent access to EspTls, hence access to EspTls needs to be synchronized.
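
A sketch of that situation, modelling the two executor threads with plain OS threads and a hypothetical Session type (not the real EspTls): both threads hold a &Session at the same time, so Session must be Sync, which is exactly what the blocking Mutex provides.

```rust
use std::sync::Mutex;
use std::thread;

// Hypothetical stand-in for the TLS session. The Mutex is what makes it
// Sync; with a RefCell (or a bare UnsafeCell) inside, sharing &Session
// between the two threads below would not compile.
struct Session {
    state: Mutex<u32>,
}

impl Session {
    fn read(&self, _buf: &mut [u8]) -> usize {
        *self.state.lock().unwrap() += 1;
        0
    }

    fn write(&self, _buf: &[u8]) -> usize {
        *self.state.lock().unwrap() += 1;
        0
    }
}

fn main() {
    let session = Session {
        state: Mutex::new(0),
    };

    // Same shape as a work-stealing executor polling the read future on
    // one thread and the write future on another: both threads hold a
    // &Session concurrently, so Session must be Sync.
    thread::scope(|s| {
        s.spawn(|| {
            let mut buf = [0u8; 16];
            session.read(&mut buf);
        });
        s.spawn(|| {
            let buf = [0u8; 16];
            session.write(&buf);
        });
    });
}
```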

Please let me know if you see gaps in my thinking!

ivmarkov commented 1 month ago

By the way... why are you running a multi-threaded executor in embedded context in the first place? I kind of miss the point? And there's tokio's LocalSet and async-executor's LocalExecutor anyway...

jothan commented 1 month ago

@ivmarkov Thanks for the detailed feedback, you made a very good point about Sync. I was in bed when I realized I had some unsoundness in what I submitted and put this PR in draft once more. I reached for UnsafeCell somewhat in desperation for the no_std case. Having RefCell inside of EspAsyncTls instead of around it annoyed me to no end.

For the multi-threaded executor, I'd simply like my web server to be able to take advantage of the two cores for serving files over separate connections. I'm using smol's Executor and LocalExecutor: one thread has a local executor, and both threads execute from the global work-stealing Executor. In a lot of regards, I am still in the exploratory phase of my project and I am interested in what can be done.
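
For context, the setup looks roughly like this (a sketch assuming smol's Executor/LocalExecutor APIs, with the actual server tasks omitted):

```rust
use std::{sync::Arc, thread};

use smol::{block_on, future, Executor, LocalExecutor};

fn main() {
    // Work-stealing executor shared between the two cores.
    let ex = Arc::new(Executor::new());

    // Second thread (second core): only drives the global executor.
    {
        let ex = ex.clone();
        thread::spawn(move || block_on(ex.run(future::pending::<()>())));
    }

    // Main thread: a LocalExecutor for !Send tasks, nested inside the
    // global executor so both get polled here.
    let local = LocalExecutor::new();
    block_on(local.run(ex.run(async {
        // Accept connections here and spawn a handler task per
        // connection onto `ex` (or `local` for !Send ones).
        future::pending::<()>().await
    })));
}
```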

ivmarkov commented 1 month ago

@ivmarkov Thanks for the detailed feedback, you made a very good point about Sync. I was in bed when I realized I had some unsoundness in what I submitted and put this PR in draft once more. I reached for UnsafeCell somewhat in desperation for the no_std case. Having RefCell inside of EspAsyncTls instead of around it annoyed me to no end.

OK, I'll merge your change shortly. What's a bit annoying (but only slightly) is that we'll now take the mutex hit even when we don't plan to use EspAsyncTls in a multi-threaded context. But that should be a negligible overhead, and I guess I can make a generic version of the private mutex that the user can instruct not to be Sync, in which case it would use a no-op mutex (but that's for the future and not so important).

For the multi-threaded executor, I'd simply like my web server to be able to take advantage of the two cores for serving files over separate connections. I'm using smol's Executor and LocalExecutor: one thread has a local executor, and both threads execute from the global work-stealing Executor. In a lot of regards, I am still in the exploratory phase of my project and I am interested in what can be done.

I hope you are aware that the total number of sockets that can be opened via ESP-IDF is ~10. With some persistence and convincing you can raise this number to 16, which is then a hard limit. Just mentioning it, in case you were planning to support a much higher number of connections, and thus a heavier workload that actually needs two cores...
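
For reference, if I'm not mistaken the limit is the lwIP socket table size, configured via sdkconfig and hard-capped at 16 by ESP-IDF:

```
# sdkconfig.defaults - raise the lwIP socket limit (hard-capped at 16)
CONFIG_LWIP_MAX_SOCKETS=16
```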

ivmarkov commented 1 month ago

... and I haven't checked, but I do believe sockets that you accept from a server-side socket count toward that number...

jothan commented 1 month ago

Thanks for the merge and the heads up about the connection limits. I'm not planning on using too many concurrent connections.