Open ioquatix opened 4 days ago
Hey, thanks for taking a look and sharing your concerns. The goal here is to retain compatibility with code that uses Thread.current.
I just started using Fiber[...] for library internals in https://github.com/rmosolgo/graphql-ruby/pull/5034, and I'm open to migrating GraphQL-Ruby's usage to Fiber storage instead.
I'm also open to making this Thread.current behavior opt-in somehow (instead of default). But could you help me understand the risk you see with it now?
Risks:
Context Misalignment: Thread-local variables are tied to a thread's specific operations, and copying them can lead to incorrect behavior in the new thread's context.
Shared Mutable State Risks: Copying mutable thread-local variables can introduce race conditions and data corruption.
Resource Mismanagement: Thread-local variables managing resources like database connections or file handles may be improperly shared or closed.
Framework/Library Assumptions: Frameworks relying on Thread.current for logs, tracing, or error propagation may break or produce incorrect results.
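To make the shared mutable state risk concrete, here is a minimal sketch. It assumes values are copied by reference into the child fiber; the `:tags` key and the copying step are made up for illustration and are not GraphQL-Ruby's actual code.

```ruby
# Parent fiber stores a mutable object in a fiber-local slot.
Thread.current[:tags] = ["parent"]
parent_tags = Thread.current[:tags]

child = Fiber.new do
  # Assumed copying strategy: assign the same object to the child's slot.
  Thread.current[:tags] = parent_tags
  # The child believes it is changing its own state...
  Thread.current[:tags] << "child"
end
child.resume

# ...but the parent sees the change, because both slots hold one Array.
leaked = Thread.current[:tags].include?("child")
```

Here `leaked` ends up `true`. Copying a `dup` of each value would avoid this particular case, but only for shallow structures.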
Oh, I see, thanks for laying those out. In practice, copying the entries from Thread.current.keys has fixed context-related issues (#3366, #3449, #3461, #4993), mostly because other libraries are already using Thread.current[...] for Fiber-scoped variables.
Maybe that's the catch here: Thread.current[...] is actually Fiber-scoped, right? So GraphQL-Ruby spins up new Fibers based on the parent fiber and runs everything on the same Thread. Those new Fibers are managed by GraphQL-Ruby as "children," so logically, passing along context makes sense (at least, it has so far).
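A small sketch of the behavior described above: a new Fiber does not see the parent's Thread.current[...] values unless they are copied in. The `spawn_with_copied_locals` helper and the `:request_id` key are hypothetical names for illustration, not GraphQL-Ruby's API.

```ruby
# Thread#[] and Thread#[]= are fiber-local, so a fresh Fiber starts empty.
Thread.current[:request_id] = "abc123"

seen_without_copy = Fiber.new { Thread.current[:request_id] }.resume

# Hypothetical helper: snapshot the parent's fiber-locals, then re-assign
# them inside the child fiber before running the given block.
def spawn_with_copied_locals(&block)
  parent = Thread.current
  snapshot = parent.keys.to_h { |key| [key, parent[key]] }

  Fiber.new do
    snapshot.each { |key, value| Thread.current[key] = value }
    block.call
  end
end

seen_with_copy = spawn_with_copied_locals { Thread.current[:request_id] }.resume
```

Without the copy, the child reads `nil`; with it, the child reads the parent's value. Note that the snapshot copies every key on the fiber, including ones set by unrelated libraries.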
Database connections (etc.) are an interesting case. Rails is the elephant in the room, and so far, the best approach has been to manually implement context sharing: https://graphql-ruby.org/dataloader/async_dataloader#activerecord-connections
Have you run into real-world issues with copying context like this, in GraphQL-Ruby or elsewhere?
> Maybe that's the catch here: Thread.current[...] is actually Fiber-scoped, right?
Yes, and the problems are the same.
> Have you run into real-world issues with copying context like this, in GraphQL-Ruby or elsewhere?
It's hard for me to pinpoint exact failures since they are often soft and/or transient (some requests may fail or behave incorrectly).
While not directly related, here is an example of how context sharing can lead to incorrect execution: https://github.blog/security/vulnerability-research/how-we-found-and-fixed-a-rare-race-condition-in-our-session-handling/
Since it's entirely possible for things like RequestStore to use Thread.current and so on, there are situations where it can become problematic or behave unexpectedly.
There are probably situations where what you are doing is useful (as you've said for compatibility). If you are sure you control all fibers within a given thread, it might be safe.
However, this code makes me extremely uncomfortable, so I strongly advise you to encourage Fiber storage for inheritable state.
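For comparison, a minimal sketch of Fiber storage, assuming Ruby 3.2 or later (where `Fiber[]` and `Fiber[]=` are available). The `:locale` key is only an illustrative name.

```ruby
# Fiber storage is inherited by fibers created from the current one.
Fiber[:locale] = "en"

inherited = Fiber.new { Fiber[:locale] }.resume

# The child receives its own copy of the storage, so re-assigning a key
# in the child does not change what the parent sees.
Fiber.new { Fiber[:locale] = "fr" }.resume
parent_after_child = Fiber[:locale]
```

The child reads the inherited value without any manual copying, and only state explicitly placed in Fiber storage is passed along. The copy is shallow, so mutable values stored there are still shared objects.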
Moving GraphQL-Ruby's own Thread.current usage to Fiber[...] was easy enough, and seems to work out-of-the-box: https://github.com/rmosolgo/graphql-ruby/pull/5176
I'm open to removing this default behavior from GraphQL-Ruby, so I'll keep this issue open until I get to try it out.
I believe this code is extremely risky.
https://github.com/rmosolgo/graphql-ruby/blob/8a21eb17d58902b20867f18b4c25937b75baa830/lib/graphql/dataloader.rb#L80