strawberry-graphql / strawberry

A GraphQL library for Python that leverages type annotations 🍓
https://strawberry.rocks
MIT License
3.87k stars 511 forks source link

Errors when closing a subscription after authentication failure #3400

Open wlaub opened 5 months ago

wlaub commented 5 months ago

Describe the Bug

When a permissions class's has_permission method returns False for a subscription, strawberry subsequently throws some errors while closing the connection.

System Information

Additional Context

When a subscription fails due to an authentication failure, we see log outputs that look like this

< TEXT '{"type":"connection_init","payload":{"Authoriza...bGt1hupQ9QVf1VYivKHw"}}' [844 bytes]
> TEXT '{"type": "connection_ack"}' [26 bytes]
< TEXT '{"id":"5","type":"start","payload":{"variables"...ypename\\n  }\\n}\\n"}}' [3034 bytes]
> TEXT '{"type": "error", "id": "5", "payload": {"messa...buildingDataChanges"]}}' [144 bytes]

Not Authorized
GraphQL request:2:3
...
    raise PermissionError(message)
PermissionError: Not Authorized

< TEXT '{"id":"5","type":"stop"}' [24 bytes]

Exception in ASGI application
Traceback (most recent call last):
...
  File ".../strawberry/subscriptions/protocols/graphql_ws/handlers.py", line 193, in cleanup_operation
    await self.subscriptions[operation_id].aclose()
KeyError: '5'

= connection is CLOSING
> CLOSE 1000 (OK) [2 bytes]
= connection is CLOSED
! failing connection with code 1006

closing handshake failed
Traceback (most recent call last):
...
  File ".../websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedError: sent 1000 (OK); no close frame received

connection closed

This seems to indicate that the permissions failure prevents the subscription from being created, which causes cleanup to fail since it assumes the subscription exists. If I modify https://github.com/strawberry-graphql/strawberry/blob/808d898a9041caffe74e4364314b585413e4e5e2/strawberry/subscriptions/protocols/graphql_ws/handlers.py#L192 to check for operation_id in self.subscriptions and self.tasks before accessing them, then both the KeyError and closing handshake failed errors go away.

Since an expired token can cause rapid subscription retries and failures, this can produce quite a lot of log spam.

Upvote & Fund

Fund with Polar

DoctorJohn commented 4 months ago

Hi @wlaub, thanks for reporting!

This one is tricky. You're using the deprecated legacy graphql-ws websocket subprotocol for subscriptions which has a pretty bad protocol specification. This specification does not clearly state whether a client is allowed to send a stop message after receiving a error message from the server.

I would recommend migrating your clients to the graphql-transport-ws websocket subprotocol which has a much clearer documentation. The graphql-transport-ws protocol specification states that an error message "terminates the operation and no further messages will be sent.". Our interpretation and thus the implementation of this assumes that sending a stop message to the server after receiving a error message from the server, would be invalid according to the protocol.

Since the old protocol does not clarify how to handle this scenario, the implementation was oriented on the newer protocol.