ubuddemeier opened 1 month ago
Hello, thanks for pointing this out.
How did you observe the increasing memory load? Are you using any profiler?
@ckeshava, I'm noticing the same; it's easiest to see when you make larger requests, such as requesting BookOffers. If you don't iterate over the client, the queue grows continuously and therefore so does the memory usage.
Would it make sense to also have a limit on the number of messages in the queue for the case when a client can't keep up?
```python
import asyncio

from xrpl.asyncio.clients import AsyncWebsocketClient
from xrpl.models.currencies import XRP, IssuedCurrency
from xrpl.models.requests import BookOffers


async def main():
    async with AsyncWebsocketClient("wss://s1.ripple.com") as client:
        while True:
            orderbook_asks_info = await client.request(
                BookOffers(
                    ledger_index="current",
                    taker_gets=XRP(),
                    taker_pays=IssuedCurrency(
                        currency="USD",
                        issuer="rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq",
                    ),
                    limit=500,
                )
            )
            orderbook_bids_info = await client.request(
                BookOffers(
                    ledger_index="current",
                    taker_gets=IssuedCurrency(
                        currency="USD",
                        issuer="rhub8VRN55s94qWKDv6jmDy1pUykJzF3wq",
                    ),
                    taker_pays=XRP(),
                    limit=500,
                )
            )
            print(f"Queued items: {client._messages.qsize()}")
            await asyncio.sleep(2)


if __name__ == "__main__":
    asyncio.run(main())
```
thanks, this is helpful @Jbekker
@Jbekker I have incorporated your test case at the tip of this branch: https://github.com/XRPLF/xrpl-py/pull/713 -- many thanks, this is insightful.
Regarding placing limits on the queue size: a client might legitimately want to send a large number of requests to a rippled node. For instance, we don't want to rate-limit access to a personal rippled process. (Hopefully, with this bug resolved, there won't be any unforeseen spikes in compute/memory usage.) Users can always cap this Python process's resources through operating-system tools.
That said, I understand this approach is at odds with the bounded message queues inside the OS network buffers. I don't have a strong opinion; I'm open to ideas.
> Regarding placing limits on the queue size: A client might want to send in a large number of requests to a rippled node. For instance - We don't want to rate-limit access to a personal rippled process.

There could be some sort of config option added.
Yeah, that's true. To determine a good default value, we might have to gather some statistics (for an upper bound on the queue size); do you have any ideas for that? I don't think the Data team currently tracks the client libraries.
Adding a config option would increase the complexity of the client libraries, but I think it's an acceptable compromise.
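As a rough sketch of what such a config option could look like (the `max_queue_size` parameter and the wrapper class below are hypothetical, not xrpl-py's actual API): an `asyncio.Queue` created with `maxsize` raises `QueueFull` from `put_nowait` once the limit is reached, so the handler could evict old messages instead of growing without bound.

```python
import asyncio


class BoundedMessageQueue:
    """Hypothetical wrapper: cap the client's message queue and drop the
    oldest message when a slow consumer lets it fill up."""

    def __init__(self, max_queue_size: int = 0) -> None:
        # maxsize=0 keeps today's unbounded behavior as the default.
        self._messages: asyncio.Queue = asyncio.Queue(maxsize=max_queue_size)
        self.dropped = 0

    def enqueue(self, message: dict) -> None:
        try:
            self._messages.put_nowait(message)
        except asyncio.QueueFull:
            # Evict the oldest message to make room for the newest one.
            self._messages.get_nowait()
            self._messages.put_nowait(message)
            self.dropped += 1


q = BoundedMessageQueue(max_queue_size=2)
for i in range(5):
    q.enqueue({"seq": i})
print(q._messages.qsize(), q.dropped)  # 2 3
```

Whether to drop the oldest message, reject the newest, or apply backpressure to the reader is exactly the kind of policy question a config option would have to answer.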
Question: does the response need to go into the message queue if it corresponds to a pending request and is handled by fulfilling the future? I did a quick fix on my end to work around the problem in this way:
```python
if "id" in response_dict and response_dict["id"] in self._open_requests:
    self._open_requests[response_dict["id"]].set_result(response_dict)
else:  # enqueue the response for the message queue
    cast(_MESSAGES_TYPE, self._messages).put_nowait(response_dict)
```
Am I missing something here?
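To illustrate the routing the snippet above proposes (the standalone handler below is a simplified sketch, not xrpl-py's actual implementation; the names mirror the snippet): a response carrying the `id` of a pending request resolves that request's future, while only unsolicited messages, such as subscription stream updates, land in the queue.

```python
import asyncio


async def demo() -> tuple[dict, int]:
    open_requests: dict[int, asyncio.Future] = {}
    messages: asyncio.Queue = asyncio.Queue()

    def handle(response_dict: dict) -> None:
        # Responses matched to a pending request fulfill its future;
        # everything else goes into the message queue.
        if "id" in response_dict and response_dict["id"] in open_requests:
            open_requests[response_dict["id"]].set_result(response_dict)
        else:
            messages.put_nowait(response_dict)

    fut: asyncio.Future = asyncio.get_running_loop().create_future()
    open_requests[7] = fut

    handle({"id": 7, "result": "ok"})  # matches a pending request
    handle({"type": "ledgerClosed"})   # unsolicited stream message

    return await fut, messages.qsize()


result, queued = asyncio.run(demo())
print(result, queued)  # {'id': 7, 'result': 'ok'} 1
```

With this routing, request/response traffic never touches the queue, so memory stays flat even when nothing iterates over the client.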
Another problem I noticed earlier in that same file: _inject_request_id might cause a collision when generating a random ID while many requests are outstanding. It should check whether the randomly generated ID is already in use.
@ubuddemeier You are correct, the fix is similar: https://github.com/XRPLF/xrpl-py/pull/713
Regarding the request-IDs, please refer to this discussion: https://github.com/XRPLF/xrpl-py/issues/718#issuecomment-2195457626
For a collision to occur, the random-number library would need to produce the same output twice within a given span of time. I think standard math libraries give stronger guarantees than that as far as pseudo-random number generation is concerned. Requests are fulfilled very quickly (except under certain heavy workloads), so I don't think we'll see collisions within such a short window.
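For a rough sense of the risk (assuming, hypothetically, that IDs are drawn uniformly from the signed 32-bit range; the actual range used by _inject_request_id may differ), the birthday bound gives the probability of at least one duplicate among n concurrently outstanding random IDs:

```python
import math


def collision_probability(n: int, id_space: int = 2**31) -> float:
    """Birthday bound: P(at least one collision) for n IDs drawn
    uniformly (with replacement) from a space of id_space values."""
    # P(no collision) = prod_{k=0}^{n-1} (1 - k / id_space);
    # sum the logs to avoid floating-point underflow for large n.
    log_p_unique = sum(math.log1p(-k / id_space) for k in range(n))
    return 1.0 - math.exp(log_p_unique)


# Even with 10,000 simultaneously outstanding requests the probability of
# a duplicate is small, but it is not zero.
print(f"{collision_probability(10_000):.4f}")
```

The probability grows roughly with n²/(2·id_space), so it stays negligible for realistic numbers of in-flight requests; the uniqueness check is still cheap insurance.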
I think the problem is that received messages are enqueued (in WebsocketBase._handler) regardless of whether the message corresponds to a request or not:
https://github.com/XRPLF/xrpl-py/blob/main/xrpl/asyncio/clients/websocket_base.py#L139
This enqueueing should probably move into an else clause of the preceding if clause, so that matched responses are not queued a second time.