JuliaComputing / gRPCClient.jl

A Julia gRPC Client
Other
48 stars 3 forks source link

WIP: add curl share locks to make things threadsafe #18

Open tanmaykm opened 3 years ago

tanmaykm commented 3 years ago

ref: https://github.com/JuliaLang/Downloads.jl/issues/110

tanmaykm commented 3 years ago

Just adding the curl locks don't seem to make things thread safe. The CI environment is actually single threaded, so best to run the tests locally to see what's failing.

Multiple tasks run fine though. And in the threaded tests each thread had its own gRPCChannel with it's own Multi instance. That makes it a bit more puzzling.

c42f commented 3 years ago

I had a quick look through here - it does all seem to make sense in terms of curl_share_setopt usage.

Do we call curl_global_init explicitly somewhere? https://curl.se/libcurl/c/threadsafe.html points out that this is necessary.

Each thread had its own gRPCChannel with it's own Multi instance. That makes it a bit more puzzling.

Yes this does seem puzzling. LibCurl's own documentation suggests that this should be fine.

tanmaykm commented 3 years ago

I see curl_global_init being done in Downloads.jl here.

c42f commented 3 years ago

Looks like https://github.com/JuliaLang/Downloads.jl/issues/151 should (?) have fixed some of the immediate upstream issues.

tanmaykm commented 3 years ago

Awesome! Seems to work fine with https://github.com/JuliaLang/Downloads.jl/pull/151 :tada: :+1: Will clean this branch up and test a bit more.

tanmaykm commented 3 years ago

Was able to test gRPCClient on Julia 1.5 and Downloads master quite a bit. Tests pass when I use a per thread Downloader, which having a separate client instance per thread achieves. But segfaults at different locations with shared downloader.

FrankUrbach commented 3 years ago

I think that's the nature of curl. In their documentation they have written that a connection can't used by different threads. One downloader at one task at a time IIRC. I will try to puzzle the things together for the TypeDBClient and will report how it works.

c42f commented 3 years ago

It could be the libcurl bug mentioned here: https://github.com/JuliaLang/Downloads.jl/issues/110#issuecomment-902576602

Or perhaps something else to do with our the way Downloads.jl integrates with curl. Downloads doesn't use the same easy handles from different threads so that should be ok. But IIUC there's other data which is implicitly shared by linking all the easy handles to the same multi handle (roughly to the same Downloader instance)

tanmaykm commented 3 years ago

From what is mentioned for CURL_LOCK_DATA_CONNECT, unfortunately it looks like connection cache is always shared when using Multi and there's no way to opt out of it. It also mentions that HTTP/2 streams can not be on different threads. From libcurl api doc:

CURL_LOCK_DATA_CONNECT

Put the connection cache in the share object and make all easy handles using this share object
share the connection cache.

Note that due to a known bug, it is not safe to share connections this way between
multiple concurrent threads.

Connections that are used for HTTP/1.1 Pipelining or HTTP/2 multiplexing only get additional
transfers added to them if the existing connection is held by the same multi or easy handle.
libcurl does not support doing HTTP/2 streams in different threads using a shared connection.
...
Note that when you use the multi interface, all easy handles added to the same multi handle will
share connection cache by default without using this option.

Even though we do not enable connection cache sharing, we would not be able to use the same downloader or even the same HTTP/2 connection across threads. Using a per thread gRPC client should abide by these restrictions, and unless there are more caveats things should work fine.