legumeinfo / microservices

A collection of microservices developed and maintained by the Legume Information System
https://legumeinfo.org/
Apache License 2.0

micro_synteny_search exceeding redis maxclients limit #610

Open adf-ncgr opened 9 months ago

adf-ncgr commented 9 months ago

This is not a common problem, but it happened to come up for a region of interest to a collaborator's PhD student and a paper he's trying to write up. Full stack trace at the end of this post, but my reading of the situation, based on the code around this part of the stack trace:

micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/micro_synteny_search/request_handler.py", line 69, in _queryToChromosomeGeneMatchIndexes
micro_synteny_search_1           |     result = await gene_index.search(query)

is that a gene in the region belongs to a large gene family, and since the data source throwing the error (medicago) contains many highly fragmented genomes, the initial query for "chromosomes" with any matching gene families triggers the resource limit (I don't entirely understand why so many "clients" would be spawned, but I'm taking the error message at its word on that).

The most obvious thing to do is to try increasing the maxclients setting in redis, but although I made an attempt at this, the error message I'm getting

micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 861, in read_response
micro_synteny_search_1           |     raise response from None
micro_synteny_search_1           | redis.exceptions.ResponseError: LIMIT exceeds maximum of 10000

still indicates that I'm exceeding the default limit of 10000, though this result

sudo docker exec -ti medicago_redis_1 redis-cli config get maxclients
1) "maxclients"
2) "50000"

suggests that my attempt to hack the increased --maxclients setting into the compose.yml file did in fact change it as intended. I am probably getting confused about something here; let me know if you have any thoughts about it when you're back @alancleary.
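For reference, the hack amounted to something like this in compose.yml (a rough sketch from memory; the service name and base command here are assumptions, not the actual file):

services:
  redis:
    command: redis-server --maxclients 50000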

If that server-side tweak ultimately works, it may be adequate as a solution (I'm not suggesting we change the default, but it could be made configurable via .env and documented for other service admins in the future); but we could also consider whether the parameters we added to exclude contigs below a certain size from the macrosynteny search might also be applicable to the microsynteny search (so poor PhD students don't have to rely on crusty old service admins to get their work done...).

Full stack trace (from services running under gcv-docker-compose-2.6.0-c0):

micro_synteny_search_1           | 17:54:40,759 ERROR: Error in callback for method [/legumeinfo.microservices.microsyntenysearch_service.v1.MicroSyntenySearch/Search]
micro_synteny_search_1           | Traceback (most recent call last):
micro_synteny_search_1           |   File "src/python/grpcio/grpc/_cython/_cygrpc/aio/server.pyx.pxi", line 735, in grpc._cython.cygrpc._add_callback_handler.handle_callbacks
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/micro_synteny_search/grpc_server.py", line 25, in exceptionCallback
micro_synteny_search_1           |     raise exception
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/micro_synteny_search/grpc_server.py", line 55, in Search
micro_synteny_search_1           |     return await self._search(request, context)
micro_synteny_search_1           |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/micro_synteny_search/grpc_server.py", line 38, in _search
micro_synteny_search_1           |     tracks = await self.handler.process(query, matched, intermediate)
micro_synteny_search_1           |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/micro_synteny_search/request_handler.py", line 105, in process
micro_synteny_search_1           |     chromosome_match_indexes = await self._queryToChromosomeGeneMatchIndexes(query_track, gene_index)
micro_synteny_search_1           |                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/micro_synteny_search/request_handler.py", line 69, in _queryToChromosomeGeneMatchIndexes
micro_synteny_search_1           |     result = await gene_index.search(query)
micro_synteny_search_1           |              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/commands/search/commands.py", line 889, in search
micro_synteny_search_1           |     res = await self.execute_command(SEARCH_CMD, *args)
micro_synteny_search_1           |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 505, in execute_command
micro_synteny_search_1           |     return await conn.retry.call_with_retry(
micro_synteny_search_1           |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/retry.py", line 59, in call_with_retry
micro_synteny_search_1           |     return await do()
micro_synteny_search_1           |            ^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 481, in _send_command_parse_response
micro_synteny_search_1           |     return await self.parse_response(conn, command_name, **options)
micro_synteny_search_1           |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/client.py", line 524, in parse_response
micro_synteny_search_1           |     response = await connection.read_response()
micro_synteny_search_1           |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
micro_synteny_search_1           |   File "/usr/local/lib/python3.11/site-packages/redis/asyncio/connection.py", line 861, in read_response
micro_synteny_search_1           |     raise response from None
micro_synteny_search_1           | redis.exceptions.ResponseError: LIMIT exceeds maximum of 10000
adf-ncgr commented 9 months ago

quick follow-up; this may be a total coincidence, but I happened to notice this startup info in the logs for the medicago_redis_1 container:

<search> concurrent writes: OFF, gc: ON, prefix min length: 2, prefix max expansions: 200, query timeout (ms): 500, timeout policy: return, cursor read size: 1000, cursor max idle (ms): 300000, max doctable size: 1000000, max number of search results: 10000, search pool size: 20, index pool size: 8

Note the "max number of search results: 10000"; I'm now wondering whether "LIMIT exceeds maximum of 10000" is not referring to redis maxclients but instead to redisearch's query result limit. Seems likely, but I need to figure out how to configure this to test the hypothesis.
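One way to test it would be to read the module setting back with the same CLI, e.g.

sudo docker exec -ti medicago_redis_1 redis-cli FT.CONFIG GET MAXSEARCHRESULTS

(FT.CONFIG GET/SET are RediSearch module commands, separate from the plain config get maxclients used above.)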

adf-ncgr commented 9 months ago

OK, yes, that was indeed it; after running

sudo docker exec -ti medicago_redis_1 redis-cli FT.CONFIG SET MAXSEARCHRESULTS 100000

the previously broken search started working just fine. But I will leave this issue open until we decide whether it makes sense to make this setting configurable via .env or whether something else is a better long-term solution.
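If we do go the .env route, something like the following might work (a hypothetical sketch: it assumes the redis service runs a redis-stack-server image, which forwards REDISEARCH_ARGS to the search module; the service and variable names are made up and not verified against our actual compose file):

services:
  redis:
    environment:
      REDISEARCH_ARGS: "MAXSEARCHRESULTS ${MAX_SEARCH_RESULTS:-100000}"

with MAX_SEARCH_RESULTS=100000 set in .env.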

alancleary commented 9 months ago

Just to be clear, at first you thought the limit in the error message was the number of concurrent Redis connections (like in gcv-docker-compose issue #14) but it now seems to be the max number of search results returned by a Redisearch query? If it is the Redisearch max number of results then I think it's worth pursuing a solution that doesn't require tuning by site admins.

One solution that immediately comes to mind is paginating the Redisearch results. We know the number of genes we're asking for, so it would be pretty straightforward to update the code to cobble together our set of genes by querying all the necessary pages so we don't exceed the query result limit. I'm wondering if the limit is something we can fetch from the Redis server when the microservice starts up so we don't have to hard-code it; that would help us avoid issues if Redisearch's default value for the limit ever changes. A sketch of the idea follows.
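Roughly what I have in mind, as a minimal sketch rather than the actual handler code (gene_index and query_string stand in for the objects used in request_handler.py, and issuing FT.CONFIG GET through the generic execute_command is an assumption about the client API):

from redis.commands.search.query import Query

async def search_all_pages(gene_index, query_string, redis_conn, page_size=10_000):
    # Read the server's result-size cap once so it isn't hard-coded;
    # the reply is a list of [name, value] pairs.
    try:
        reply = await redis_conn.execute_command("FT.CONFIG", "GET", "MAXSEARCHRESULTS")
        page_size = min(page_size, int(reply[0][1]))
    except Exception:
        pass  # fall back to the default page size
    docs, offset = [], 0
    while True:
        # Fetch one page of matches at a time to stay under the limit.
        page = await gene_index.search(Query(query_string).paging(offset, page_size))
        docs.extend(page.docs)
        offset += page_size
        if offset >= page.total:  # total = full match count on the server
            break
    return docs

One caveat: if RediSearch enforces the cap on offset + num rather than per page, paging alone won't reach past it, and a cursor-based FT.AGGREGATE might be the better tool. Worth checking before committing to this approach.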

adf-ncgr commented 9 months ago

yes, it's the redisearch max search results limit that seems to be the problem here. I was also wondering whether paging might address this in a way that effectively makes us unlimited (without special config). I would think that redis-cli FT.CONFIG GET MAXSEARCHRESULTS would give you the needed value, but having just tried it, I get a value of 1000000 back, which seems at odds with the behavior that led me to tweak it in the first place. Anyway, in the master triage list of GCV-related issues, I'd say this one is not too urgent, but certainly nice to have fixed if it's easy enough to knock out. Thanks for following up.

alancleary commented 9 months ago

It looks like that command is also available in the Redisearch Python library. I guess we'll have to verify whether this is actually a reliable way to go...
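For reference, something like this should do it (a sketch assuming redis-py, where FT.CONFIG GET is exposed as config_get on the search command group; worth verifying against the client version we ship):

import redis

r = redis.Redis(host="localhost", port=6379)
# The index name passed to ft() is irrelevant for this server-level setting.
print(r.ft().config_get("MAXSEARCHRESULTS"))  # e.g. {'MAXSEARCHRESULTS': '1000000'}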