python-websockets / websockets

Library for building WebSocket servers and clients in Python
https://websockets.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
5.06k stars 505 forks source link

recv() is slow #1462

Closed gborrageiro closed 1 month ago

gborrageiro commented 1 month ago

Awaiting websocket.recv() appears to be quite slow. I am scheduling the below consumer() into a task group, nevertheless if we look at the latency statistics:

    async def consumer(self, websocket: WebSocketClientProtocol) -> None:
        s0 = time.time()
        msg = json.loads(await websocket.recv())
        s1 = time.time()
        self._ws_stats.update(s1 - s0)

we see a distribution of waiting times as follows:

ws latency stats: n=5.97e+03, mean=0.3, std=1.83, min=6.44e-06, p25=1.26e-05, p50=1.55e-05, p75=0.00815, p95=0.277, max=35

What would account for the latency here? Presumably if there is no message to receive, control should be returned to the event loop. If there is a message to process, surely the message has been sent already from the websockets server to my client and the waiting times should be better?

OfferLifted commented 1 month ago

Your timing includes the json.loads() call so your latency measurements include the overhead of those calls. Are you sure you want to measure both or do you just want to measure the await websocket.recv()? Consider timing the two separately like this:

    async def consumer(self, websocket: WebSocketClientProtocol) -> None:
        s0 = time.monotonic()
        msg = await websocket.recv()
        s1 = time.monotonic()
        self._ws_stats.update(s1 - s0)

        s0 = time.monotonic()
        msg = json.loads(msg)
        s1 = time.monotonic()
        self._json_stats.update(s1 - s0)

You can try switching to orjson instead of json if you need more speed.

gborrageiro commented 1 month ago

I have implemented what you have suggested and It's still pretty bad unfortunately.

orjson latency stats: n=1.16e+03, mean=7.43e-06, std=3.12e-06, min=2.37e-06, p25=5.48e-06, p50=7.05e-06, p75=8.84e-06, p95=1.21e-05, max=3.49e-05
    ws latency stats: n=1.16e+03, mean=2.19, std=9.51, min=5.79e-07, p25=9.7e-07, p50=0.00403, p75=0.0278, p95=12.8, max=136

Surprisingly these websockets receive latency stats are worse than Rest messages I send to the same destination geographically speaking, different server of course.

OfferLifted commented 1 month ago

What do you mean by receive latency? You have to make sure to measure the correct thing if you want your results to be helpful. Maybe look into timing asyncio operations. Without knowing anything else about your test it's hard to point to where the issue might be.

For latency this got added and you can refer this: https://github.com/python-websockets/websockets/issues/1195.

Also look at recv function https://github.com/python-websockets/websockets/blob/12.0/src/websockets/legacy/protocol.py#L502-L578

gborrageiro commented 1 month ago

recv() latency. And yes, I have taken a look at it.

aaugustin commented 1 month ago

Presumably if there is no message to receive, control should be returned to the event loop.

Yes. However, your stopwatch (time.time) is still ticking while the coroutine is at await websocket.recv(), waiting for the event loop to yield back control, so you're effectively measuring time spent waiting for the next message to arrive.