graphql-python / gql

A GraphQL client in Python
https://gql.readthedocs.io
MIT License
1.56k stars 180 forks source link

New transport AIOHTTPWebsocketsTransport #478

Closed tlowery-scwx closed 4 months ago

tlowery-scwx commented 6 months ago

This represents ongoing work addressing issues #418 and #451, adding AIOHTTP Websockets support with a new AIOHTTPWebsocketsTransport. We already have a working base model, and are working on full parity and test coverage, and wanted to make this work available for review or comment.

This transport allows to run queries including subscriptions on websockets endpoints without using the websockets dependency (only the aiohttp dependency).

Note: a lot of code had to be duplicated from the WebsocketsTransport class. A refactor will be done to reduce the duplicated code in a subsequent PR.

leszekhanusz commented 6 months ago

Thanks for your PR. To fix the lint issue, you can use make check (See CONTRIBUTING.md)

tlowery-scwx commented 4 months ago

Hi @leszekhanusz, I'm wondering if you could give us some guidance on the final stage of this PR. We've been pursuing 100% test coverage, and we're down to a single edge case where a TransportClosed exception would be raised here and bubble up here. In our case, in order to implement a Transport using AIOHTTP rather than Websockets, we had to duplicate, rather than inherit from, WebsocketsTransportBase. We've been having trouble replicating the circumstances under which the mentioned edge cases would occur, and on further investigation, we found that the corresponding lines in WebsocketsTransportBase are only covered by the tests for the PheonixChannelWebsocketsTransport -- and that their test which covers this functionality hangs indefinitely when a WebsocketsTransport is substituted for a PheonixChannelWebsocketsTransport. Since PRs require 100% test coverage, we're at something of an impasse -- can you recommend a course of action for addressing this edge case which is not tested by the base class?

leszekhanusz commented 4 months ago

I'll take a look soon.

leszekhanusz commented 4 months ago

Hi @tlowery-scwx. Thanks for your work so far.

I took the liberty to take over your branch to try to fix the remaining problems, you can check the commits above.

I had to copy yet again some code from the previous websockets_base.py file (ListenerQueue...) so that it would be possible to run the new transport without needing the websockets dependency. I'll proceed in two steps, first put everything in the aiohttp_websockets.py file and make it work in this PR, then when this PR has been merged, I'll refactor everything to follow DRY in another PR.

The aiohttp_websocket pytest mark has been removed. There was a bit of confusion here, the goal of those pytest marks are there to select tests based on the installed dependencies, so only aiohttp and websockets marks make sense. Something annoying now is that we still need the websockets dependency to test the new transport as we still use websockets for the server part in the tests, so both marks are set for the tests of the new transport. I'll try to implement a new websocket server using the aiohttp dependency so that we could be sure that all tests are passing without the previous websockets dependency.

One of the main problem was that the aiohttp session was not closed properly. For that it is a bit tricky because of still remaining aiohttp issues which should be fixed in aiohttp 4.0 I suppose. Fortunately we are already doing that for our aiohttp transport so I copied the code from there.

During the testing, it appeared that the aiohttp session was still not closed properly for some tests, and it revealed that the problem was already present for the previous websockets transport for some conditions. I fixed that in the PRs #486 and #488, which have been merged in this branch and made the required changes in this branch.

Regarding the edge case with the TransportClosed exception in the _receive method, to fix the code coverage, at first I simply replaced that check with an assertion, as there is not point to have code for which we don't know how to trigger it. I see now that you have made a test which is skipped where you set the transport.websocket to None and discovered other problems here (hanging indefinitely). I'm now thinking reverting that change and reinforcing the _close_coro method so it is more resilient. And instead of checking if self.websocket is None in the _fail method (to see if the transport has already been closed), we can instead check the status of the self._wait_closed event instead. I'll implement that in a following commit. We should also add a configurable timeout for waiting for the close in case of unforeseen problems there.

If you don't mind I'll continue to work on this branch for a few days before asking for your review so please don't add new commits for the time being.

leszekhanusz commented 4 months ago

@tlowery-scwx It should be ok now. What do you think?

tlowery-scwx commented 4 months ago

@leszekhanusz, thank you so much for taking this on, putting so much work into it, and getting it over the finish line! May we have a day to test this transport with our use case and get back to you by EOD tomorrow (USA Pacific time) before you merge and close this out?

tlowery-scwx commented 4 months ago

Hi @leszekhanusz -- sorry for taking a bit longer than planned, but we've done quite a bit of testing, and everything looks great from our end. Thank you again for all of your work and help getting this done!