strawberry-graphql / strawberry

A GraphQL library for Python that leverages type annotations 🍓
https://strawberry.rocks
MIT License
3.98k stars 531 forks source link

Instantiated extensions are racy #2067

Open silas opened 2 years ago

silas commented 2 years ago

It looks like instantiated extensions passed to Schema are racy.

Example of racy usage:

https://github.com/strawberry-graphql/strawberry/blob/d6137f02a85c8659aedebac9c9e813ff72f267fe/tests/schema/extensions/test_opentelemetry.py#L200-L202

For OpenTelemetryExtension a single instance of _span_holder is going to be created and reused.

https://github.com/strawberry-graphql/strawberry/blob/d6137f02a85c8659aedebac9c9e813ff72f267fe/strawberry/extensions/tracing/opentelemetry.py#L31

Beyond just OpenTelemetryExtension, the setting of execution_context in ExtensionsRunner is racy:

https://github.com/strawberry-graphql/strawberry/blob/d6137f02a85c8659aedebac9c9e813ff72f267fe/strawberry/extensions/runner.py#L33-L40

You can trivially show this using the following example:

  1. Create app.py

    import asyncio
    from typing import Optional
    
    import strawberry
    from strawberry.extensions import Extension
    from strawberry.types.execution import ExecutionContext
    
    class TestExtension(Extension):
       def __init__(
           self,
           *,
           execution_context: Optional[ExecutionContext] = None,
       ):
           if execution_context:
               self.execution_context = execution_context
    
       def on_request_start(self):
           print(f"on_request_start: self={id(self)} execution_context={id(self.execution_context)}")
    
       def on_request_end(self):
           print(f"on_request_end: self={id(self)} execution_context={id(self.execution_context)}")
    
    @strawberry.type
    class Query:
       @strawberry.field
       async def test(self, sleep: int) -> int:
           await asyncio.sleep(sleep)
           return sleep
    
    schema = strawberry.Schema(query=Query, extensions=[TestExtension()])
  2. Then run the following:

    strawberry server app --port 3030
  3. In a new window make the following request:

    curl 127.0.0.1:3030/graphql -H 'content-type: application/json' --data '{"query": "{ test(sleep: 30) }"}'
  4. And while that request is running, make the concurrent request:

    curl 127.0.0.1:3030/graphql -H 'content-type: application/json' --data '{"query": "{ test(sleep: 1) }"}'
  5. If you look at the output of app.py you'll see the following:

    Running strawberry on http://0.0.0.0:3030/graphql 🍓
    [2022-08-03 13:04:24]: No operation name
    { test(sleep: 30) }
    
    on_request_start: self=4478445024 execution_context=4478640672
    [2022-08-03 13:04:28]: No operation name
    { test(sleep: 1) }
    
    on_request_start: self=4478445024 execution_context=4478932016
    on_request_end: self=4478445024 execution_context=4478932016
    on_request_end: self=4478445024 execution_context=4478932016

    You'll notice there is no on_request_end for 4478640672 and a double execution of 4478932016.

Upvote & Fund

Fund with Polar

jthorniley commented 1 year ago

FWIW this also affects the ValidationCache extension. It must run the on_validation_start hook after parsing is complete so that the execution_context.graphql_document is available - but the execution_context may have been replaced with the one for a subsequent request (so it may be None, or perhaps worse, one for a different request).

My current workaround is to replace the in built validation cache with one that is not directly instantiated: https://gist.github.com/jthorniley/f73f4866b229fd0579ac7fea37051806

DGolubets commented 9 months ago

I've just noticed this behavior. I expected on_request_start to have exactly one corresponding on_request_end