triton-inference-server / redis_cache

TRITONCACHE implementation of a Redis cache
BSD 3-Clause "New" or "Revised" License

HMSET is deprecated as of Redis 4.0.0 #12

Open zbloss opened 9 months ago

zbloss commented 9 months ago

https://github.com/triton-inference-server/redis_cache/blob/437e7e214ff446e6bc12febb44872547a1988fe1/src/redis_cache.cc#L220

Triton Inference Server is failing to write to Redis on versions past 4.0.0 because of this HMSET command.

Replacing it with HSET should be a quick fix.

https://redis.io/commands/hmset/
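Since Redis 4.0.0, HSET itself accepts multiple field/value pairs, so a variadic HMSET call maps one-to-one onto HSET with only the command name changed. A minimal Python sketch of that equivalence (field names are taken from the error log later in this thread; `build_hash_command` is a hypothetical helper for illustration, not part of this repo):

```python
def build_hash_command(command: str, key: str, fields: dict) -> list:
    """Flatten a field->value mapping into the argument list sent to Redis."""
    args = [command, key]
    for field, value in fields.items():
        args.extend([field, value])
    return args

# The deprecated HMSET and the modern multi-pair HSET send identical
# arguments; only the command name differs.
entry = {"0:i": "0", "0:t": "0", "0:b": "", "0:s": "1604"}
hmset_cmd = build_hash_command("HMSET", "14775872642165677484", entry)
hset_cmd = build_hash_command("HSET", "14775872642165677484", entry)
assert hmset_cmd[1:] == hset_cmd[1:]
```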

zbloss commented 9 months ago

I could use help doing additional testing on this fix: https://github.com/triton-inference-server/redis_cache/pull/13

I'm on an M1 Mac, so testing on all platforms is a pain.

slorello89 commented 9 months ago

@zbloss - HMSET is deprecated but hasn't been removed from the Redis command set. I'm curious to know what you are working against where this would fail?

zbloss commented 9 months ago

I have triton inference server via pytriton connecting to redis via the bitnami/redis helm chart.

redis: docker.io/bitnami/redis:7.2.3-debian-11-r2
redis-sentinel: docker.io/bitnami/redis-sentinel:7.2.3-debian-11-r2
Triton Inference Server: nvcr.io/nvidia/tritonserver:23.10-pyt-python-py3


+----------------------------------+------------------------------------------+
| Option                           | Value                                    |
+----------------------------------+------------------------------------------+
| server_id                        | triton                                   |
| server_version                   | 2.39.0                                   |
| server_extensions                | classification sequence model_repository |
|                                  |  model_repository(unload_dependents) sch |
|                                  | edule_policy model_configuration system_ |
|                                  | shared_memory cuda_shared_memory binary_ |
|                                  | tensor_data parameters statistics trace  |
|                                  | logging                                  |
| model_repository_path[0]         | /root/.cache/pytriton/workspace_tyzo6mi8 |
| model_control_mode               | MODE_EXPLICIT                            |
| strict_model_config              | 0                                        |
| rate_limit                       | OFF                                      |
| pinned_memory_pool_byte_size     | 268435456                                |
| min_supported_compute_capability | 6.0                                      |
| strict_readiness                 | 1                                        |
| exit_timeout                     | 30                                       |
| cache_enabled                    | 1                                        |
+----------------------------------+------------------------------------------+

Example Error:


I1220 18:01:45.920780 73 model.py:289] Collecting input data from request.
I1220 18:01:45.998198 73 model.py:211] Sending requests InferenceHandlerRequests(requests=[MetaRequestResponse(idx=0, data={'sequence_of_text': 'psm_3d342308:0'}, parameters={}, eos=None)]).
I1220 18:01:45.999421 73 model.py:216] using sender <1.model._ResponsesSender object at 0x7fffd86826b0>
I1220 18:01:48.574288 73 model.py:218] Received response: InferenceHandlerResponses(responses=[MetaRequestResponse(idx=0, data={'response': 'psm_3d342308:66'}, parameters=None, eos=False)], error=None)
I1220 18:01:48.580447 73 infer_response.cc:167] add response output: output: response, type: FP32, shape: [1,1,384]
I1220 18:01:48.581024 73 http_server.cc:1186] HTTP using buffer for: 'response', size: 1536, addr: 0x7ffefc004840
E1220 18:01:48.585030 73 dynamic_batch_scheduler.cc:686] Failed to insert key [14775872642165677484] into response cache: Failed to insert key: 14775872642165677484. ERR unknown command 'HMSET', with args beginning with: '14775872642165677484' '0:i' '0' '0:t' '0' '0:b' '' '0:s' '1604'
I1220 18:01:48.606229 73 http_server.cc:1260] HTTP release: size 1536, addr 0x7ffefc004840
I1220 18:01:48.608892 73 infer_request.cc:117] [request id: 0] Setting state from EXECUTING to RELEASED

slorello89 commented 9 months ago

Ahh, I think your issue is with Sentinel, not HMSET. I'm assuming the library is talking to the Sentinel rather than to Redis itself (so neither HMSET nor HSET would be present). Sentinel support hasn't been added.
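For context on why both commands would be missing: a Sentinel node serves only its own small command set, and Sentinel-aware clients first ask it for the current master's address (SENTINEL get-master-addr-by-name) and then open a separate connection to that master for data commands. A toy Python sketch of that flow (the command subset and `resolve_master` helper are illustrative simplifications, not Redis source):

```python
# Rough subset of what a Sentinel node answers; data commands such as
# HSET/HMSET are absent, which is why Redis replies "ERR unknown command".
SENTINEL_COMMANDS = {"PING", "SENTINEL", "INFO", "SUBSCRIBE", "SHUTDOWN"}

def sentinel_accepts(command: str) -> bool:
    """True if a Sentinel node would recognize this command at all."""
    return command.upper() in SENTINEL_COMMANDS

def resolve_master(sentinel_reply):
    """Parse the ['host', 'port'] reply of SENTINEL get-master-addr-by-name.
    Data commands must then be sent to this address, not to the Sentinel."""
    host, port = sentinel_reply
    return host, int(port)

assert not sentinel_accepts("HMSET")  # fails against a Sentinel today
assert not sentinel_accepts("HSET")   # would fail the same way
assert resolve_master(["10.0.0.5", "6379"]) == ("10.0.0.5", 6379)
```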

zbloss commented 9 months ago

Yep, you're right: disabling Sentinel removes the error, and I see the cache working as expected.

Are there any plans to add support for redis-sentinel or redis-cluster?