feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0
5.55k stars 993 forks source link

Add support for async path operations in the server #4617

Open robhowley opened 1 day ago

robhowley commented 1 day ago

Is your feature request related to a problem? Please describe.

Add opt in support for async path operations on the python feature server. This would allow the server to take advantage of the async libraries that are slowly becoming available for the various data stores. Lookups to multiple feature views with online stores such as dynamo and redis can be done in parallel. This would provide significant performance gains when calling feature services.

Describe the solution you'd like

A cli parameter on feast serve to indicate preference for deploying the async paths which will leverage the async feature store sdk methods.

# simple flag param
feast serve --async-read

Under the hood it would declare

@app.post(
    "/get-online-features",
    dependencies=[Depends(inject_user_details)],
)
async def get_online_features(body=Depends(get_body)):
    ...
    # and call
    response_proto = await store.get_online_features(

As async store.push functionality gets added an --async-push flag could be added as well. This approach is an explicit opt-in which is desirable when overriding default behavior. Lets the person deploying the server make informed decisions about performance based upon the backend being used. The caller need not worry about the internals of a service, just the api contract.

Describe alternatives you've considered

Additional context

Here is a flame graph for a call to a feature service with

image

We see default batching behavior and 3 calls per batch all of which are done sequentially. These can be done in parallel.

tokoko commented 1 day ago

Why don't we simply (at least) default to the async methods if they're available?

robhowley commented 1 day ago

Why don't we simply (at least) default to the async methods if they're available?

im def open to that, it's the least work for people hosting the servers. in slack, however, @franciscojavierarceo expressed a preference for an explicit opt-in and that position is def reasonable. personally, im fine either way and happy to implement either one.

iirc all data stores have an async method, just left unimplemented so we'd either want to remove it from the interface and check if the attribute exists or provide some kind of flag in a given impl saying async is available.

robhowley commented 1 day ago

made a branch of what the flag option would look like here