feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0
5.56k stars 996 forks source link

Materialization and serving fail when using Redis cluster for the online store #2297

Closed vstrimaitis closed 2 years ago

vstrimaitis commented 2 years ago

Expected Behavior

Using a Redis cluster as the online storage should work without errors.

Current Behavior

Using a Redis cluster as the online storage no longer works after updating from 0.17.0 to 0.18.0. The issue happens whenever Feast tries to connect to the Redis cluster - e.g. during feast materialize or during feature serving. The exact error that is raised looks like this:

Traceback (most recent call last):
  File "/usr/local/bin/feast", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1128, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1053, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1659, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 1395, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.7/site-packages/click/core.py", line 754, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/click/decorators.py", line 26, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/feast/cli.py", line 444, in materialize_command
    end_date=utils.make_tzaware(datetime.fromisoformat(end_ts)),
  File "/usr/local/lib/python3.7/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/feast/feature_store.py", line 1087, in materialize
    tqdm_builder=tqdm_builder,
  File "/usr/local/lib/python3.7/site-packages/feast/infra/passthrough_provider.py", line 168, in materialize_single_feature_view
    lambda x: pbar.update(x),
  File "/usr/local/lib/python3.7/site-packages/feast/infra/passthrough_provider.py", line 86, in online_write_batch
    self.online_store.online_write_batch(config, table, data, progress)
  File "/usr/local/lib/python3.7/site-packages/feast/usage.py", line 280, in wrapper
    raise exc.with_traceback(traceback)
  File "/usr/local/lib/python3.7/site-packages/feast/usage.py", line 269, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/feast/infra/online_stores/redis.py", line 184, in online_write_batch
    client = self._get_client(online_store_config)
  File "/usr/local/lib/python3.7/site-packages/feast/infra/online_stores/redis.py", line 164, in _get_client
    self._client = RedisCluster(**kwargs)
  File "/usr/local/lib/python3.7/site-packages/redis/cluster.py", line 517, in __init__
    **kwargs,
  File "/usr/local/lib/python3.7/site-packages/redis/cluster.py", line 1125, in __init__
    self.populate_startup_nodes(startup_nodes)
  File "/usr/local/lib/python3.7/site-packages/redis/cluster.py", line 1251, in populate_startup_nodes
    self.startup_nodes[n.name] = n
AttributeError: 'dict' object has no attribute 'name'

Steps to reproduce

online_store:
  type: redis
  redis_type: redis_cluster
  connection_string: "host1:port1,host2:port2,host3:port3"

Specifications

Possible Solution

Some investigation revealed that versions 0.17.0 and 0.18.0 differ in the sense that 0.18.0 uses RedisCluster from redis.cluster instead of rediscluster. As a result, the code here breaks because the new RedisCluster instance expects to receive a list of ClusterNode instead of a list of dicts. Temporarily we worked around this issue by creating a custom online store (by copying all of the code in the linked file 😁 ) and changing it like so:

+ from redis.cluster import ClusterNode

...

-               kwargs["startup_nodes"] = startup_nodes
+               kwargs["startup_nodes"] = [ClusterNode(**node) for node in startup_nodes]
                self._client = RedisCluster(**kwargs)

This worked for us, but I'm not sure if this is the best approach in general because I'm not too familiar with the codebase yet.

felixwang9817 commented 2 years ago

hey @vstrimaitis thanks for reporting this issue and for the detailed investigation! we'll take a look at this ASAP

adchia commented 2 years ago

Looks like this is still broken since we rolled back the PR due to some issues with the new version of redis clusters (see bug on redis-py: https://github.com/redis/redis-py/issues/2003).

Might need to roll back to redis-py-cluster to make this work properly

nialloh23 commented 2 years ago

Confirming that I'm also seeing this issue with Feast=0.18.1. I noticed that I get the following version compatibility warning. redis-py-cluster 2.1.3 has requirement redis<4.0.0,>=3.0.0, but you'll have redis 4.1.4 which is incompatible. Sharing in case it's helpful in any way.

adchia commented 2 years ago

See https://github.com/feast-dev/feast/pull/2347 for a rollback PR

zhoupan163 commented 1 month ago

I solved this issue,here is my code from redis.cluster import RedisCluster,ClusterNode startup_nodes =[] for node in nodes: c = ClusterNode(host=node['host'],port=node['port']) startup_nodes.append(c) redis_client = RedisCluster(startup_nodes=startup_nodes, password= password,decode_responses=True) Please pay attention to the source code. The parameter startup_nodes in RedisCluster must be the ClusterNode class.