Closed robhowley closed 4 weeks ago
not 100% what to make this error in the py39 unit tests
ERROR sdk/python/tests/unit/permissions/auth/server/test_auth_registry_server.py::test_registry_apis[\nauth:\n type: no_auth\n-applied_permissions0] - RuntimeError: Failed to bind to address [::]:34361; set GRPC_VERBOSITY=debug environment variable to see detailed error message.
@dmartinol @redhatHameed could you take a look?
not 100% what to make this error in the py39 unit tests
ERROR sdk/python/tests/unit/permissions/auth/server/test_auth_registry_server.py::test_registry_apis[\nauth:\n type: no_auth\n-applied_permissions0] - RuntimeError: Failed to bind to address [::]:34361; set GRPC_VERBOSITY=debug environment variable to see detailed error message.
I cloned your repo and it succeeded with no issues, maybe a temporary env issue?
Some questions about this change:
offline_store
if there is no implementation yet? offline_store
deserves them, you probably need to invoke them on the store
instance in offline_store
module as well. Probably the feature_server
, which actually is more an online_server
, will not have an offline_store
at all, or may use a remote one.@franciscojavierarceo @tokoko
Some questions about this change:
- why adding the init/close methods to
offline_store
if there is no implementation yet?- if you really think that
offline_store
deserves them, you probably need to invoke them on thestore
instance inoffline_store
module as well. Probably thefeature_server
, which actually is more anonline_server
, will not have anoffline_store
at all, or may use a remote one.@franciscojavierarceo @tokoko
yea that's a good point. offline store doesn't make a ton of sense in the server case. was just trying to maintain consistency across the two, but that need not be the case. i'll remove.
@robhowley thanks, I'm not too sure about the changes though. can you elaborate on why we need this?
Is the client creation actually that expensive that it needs a separate init method? I think what I was pointing out in the comment you linked above was that create_client
was being called on every invocation rather than only during the first one and cached afterwards. The first one being slightly slower than others doesn't seem that bad to me.
I get why a separate initializer can be beneficial, but I'm a little uneasy that it pushes us into the mixture of sync/async methods. What if I'm invoking the online read the old-fashioned way w/o asyncio, do I still need to call async initializer first or is that only relevant for async calls?
The same goes for the close method. This has actually come up before in #4401. As I pointed out there, I'm not sure what closing an online store should actually mean. Do calls on a closed online store error out or do they still succeed, but need to recreate the resources? I'd rather not have to answer those questions at all unless you think we absolutely need to.
The async libs are often (eg aiobotocore, aioboto3, aioredis, aiopg) used in context managers bc they're meant to be closed when we're done w all the relevant operations. Figured this sort of thing comes up a lot and is handled in a few ways
app
in the online stores. Adding this wouldn't make much sense bc the sdk is meant to also be usable outside the context of a web service, so it's kinda ruled out.I figured it'd be a cleaner boundary adding open and close methods akin to an async context manager on FeatureStore than exposing ways to reach down into the OnlineStore and closing it there. Maybe we just go full async context manager? but that seemed a bit much at first pass. If online stores don't need that initialization and the user calls it anyway, then it's just a no-op.
Since the other aio libraries function similarly it would be a useful construct as async support gets added to other online stores.
With a little time to mull this over, I'm curious what everyone's thoughts are on this. I'm not attached to the open/close idea, but it does seem like there should be a way to clean up resources. Most of async support will come from libraries expecting to be used in an aync context manager, so some kind of equivalent mechanism should be available.
@robhowley how would the normal non-async workflow be affected with this? If we put connection creation in an async initializer that won't be called by a non-async user... would we implicitly run an initializer anyway during the first call?
The sync usage shouldn't be impacted. The async code paths need not be touched by the sync code, so I don't think any async initialization would be needed/called.
# in sdk mode, this shouldn't touch async code
# and as a result shouldn't require any async connections to be created
feature_df = FeatureStore().get_online_features(entities).to_df()
@franciscojavierarceo @tokoko im not exactly sure what to make of the failing integration tests. created an issues #4658 bc they don't seem to have passed in better part of two months. im assuming you're aware, just (1) thought it worth flagging and (2) curious if you had any opinions on how concerned i should be about it
@tokoko @franciscojavierarceo btw, some added context on this mr ... i've been running this in prod for a few days and reusing the client reduces event loop lag in the server since creation of the client is a blocking operation. the scalability and speed improvements are obviously quite good as a result. reusing the client, however, should be paired w the ability to clean up the resources.
would it perhaps be clearer to just go full async context manager? somewhat more explicit that this isnt going to impact the sync code paths?
@franciscojavierarceo just a heads up, the remaining integration test failure comes from here. the push works, but the subsequent retrieval to confirm it's there, in the case of dynamo, is async. this is a sync function so there's no running event loop. working on a fix to get this green
What this PR does / why we need it:
initialize
andclose
toFeatureStore
,Provider
,OnlineStore
andOfflineStore
update: added
pytest-asyncio
to ci which is +555 of the diffWhich issue(s) this PR fixes:
Addresses the performance penalty mentioned here by @tokoko
Misc