feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

How to get features for all entities in online serving? #1361

Open michcio1234 opened 3 years ago

michcio1234 commented 3 years ago

Is your feature request related to a problem? Please describe.

At inference time, I would like to retrieve the values of a given feature for all entities currently stored in a feature table, without knowing the entities' IDs. For example, I would like to retrieve trips_today for all drivers currently in the driver_trips table.

Describe the solution you'd like

Ideally, I could do:

features = client.get_online_features(
    feature_refs=["driver_trips:trips_today"],
    entity_rows=None,
).to_dict()

and get

{
    "driver_id": [... list of all the driver ids ...],
    "driver_trips:trips_today": [... list of their respective feature values ...]
}

Describe alternatives you've considered

I could work around this by keeping a separate "meta" table that I update every time I add new entities. Something like:

import feast

# A single "dummy" entity whose only row holds the list of all driver ids.
dummy = feast.entity.Entity(
    name="dummy",
    description="Dummy entity",
    value_type=feast.value_type.ValueType.INT64,
)

# List-valued feature that stores every known driver id.
all_driver_ids = feast.feature.Feature(
    "all_driver_ids", dtype=feast.value_type.ValueType.INT32_LIST
)

meta_table = feast.feature_table.FeatureTable(
    name="meta_driver_table",
    entities=["dummy"],
    features=[
        all_driver_ids,
    ],
    ...
)

Then I could query this table for a known entity id, get the list of driver ids, and then query my actual feature table for all drivers. But that's a really hacky workaround.
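As a rough illustration of that two-step lookup, here is a minimal sketch that uses a plain dict as a stand-in for the online store. Nothing in it is a Feast API: `write_driver`, `get_trips_for_all_drivers`, `META_TABLE`, and `DUMMY_ID` are hypothetical names chosen for the example, and a real setup would issue the two reads through the Feast client instead.

```python
# Hypothetical in-memory stand-in for an online store:
# (table, entity_id) -> feature row. None of these names are Feast APIs.
META_TABLE = "meta_driver_table"
DUMMY_ID = 0  # the single, known entity id of the meta row


def write_driver(store, driver_id, trips_today):
    """Write a feature row and keep the list of known driver ids up to date."""
    store[("driver_trips", driver_id)] = {"trips_today": trips_today}
    meta = store.setdefault((META_TABLE, DUMMY_ID), {"all_driver_ids": []})
    if driver_id not in meta["all_driver_ids"]:
        meta["all_driver_ids"].append(driver_id)


def get_trips_for_all_drivers(store):
    """Step 1: read the id list from the meta row. Step 2: one lookup per id."""
    meta = store.get((META_TABLE, DUMMY_ID), {"all_driver_ids": []})
    driver_ids = list(meta["all_driver_ids"])
    values = [store[("driver_trips", d)]["trips_today"] for d in driver_ids]
    return {"driver_id": driver_ids, "driver_trips:trips_today": values}


store = {}
write_driver(store, 1001, 5)
write_driver(store, 1002, 3)
write_driver(store, 1001, 6)  # an update does not duplicate the id
result = get_trips_for_all_drivers(store)
print(result)
```

In a real store, the read-modify-write on the meta row would probably not be atomic, so concurrent writers could lose ids. That is part of what makes the workaround fragile.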

Additional context

I realise this amounts to using Feast as a database. If this goes against the Feast philosophy, please enlighten me. In that case, what alternative could I use?

Thanks.

woop commented 3 years ago

@michcio1234 I have heard this requirement before; it's definitely not uncommon. It is very hard for us to support with databases like Dynamo/Bigtable/Redis, since it requires either scanning over keys or maintaining an index ourselves. We aren't yet sure whether we will ever support this functionality, although we are actively discussing it right now.
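To make the cost of "scanning over keys" concrete, here is a minimal sketch that models the online store as one shared dict with keys of the form `<table>:<entity_id>`. The key layout and the `point_lookup` and `scan_table` helpers are assumptions made for illustration, not how Feast or any particular backend lays out its data.

```python
# Hypothetical model of a shared key-value online store.
# Keys are "<table>:<entity_id>"; this layout is an assumption for illustration.
def point_lookup(store, table, entity_id):
    """A single get by exact key: the access pattern key-value stores favor."""
    return store.get(f"{table}:{entity_id}")


def scan_table(store, table):
    """Find all rows of one table by examining every key in the store."""
    prefix = f"{table}:"
    examined = 0
    rows = {}
    for key, value in store.items():
        examined += 1
        if key.startswith(prefix):
            rows[key[len(prefix):]] = value
    return rows, examined


# Three driver rows share the keyspace with 1000 unrelated rows.
store = {f"driver_trips:{i}": {"trips_today": i % 7} for i in range(3)}
store.update({f"customer_profile:{i}": {"age": 30} for i in range(1000)})

rows, examined = scan_table(store, "driver_trips")
print(len(rows), examined)  # 3 rows found, 1003 keys examined
```

The point lookup touches one key regardless of store size, while the naive scan grows with everything else stored alongside the table. Real backends have their own scan primitives with different cost profiles, so this models only the unindexed worst case.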

lucasbmiguel commented 3 years ago

This same issue occurs for get_historical_features, right?

woop commented 3 years ago

> This same issue occurs for get_historical_features, right?

Yes, the same limitation applies to the get_historical_features() method, although in that case we are more open to supporting range scans over an offline store.
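For illustration only, a range scan over an offline store could look roughly like the sketch below. It assumes the offline data is a table of timestamped rows, modeled here as a list of dicts; `offline_rows` and `latest_in_range` are hypothetical names, and this is not what get_historical_features() does.

```python
from datetime import datetime

# Hypothetical offline table: timestamped feature rows as a list of dicts.
offline_rows = [
    {"driver_id": 1001, "event_timestamp": datetime(2021, 3, 1, 9), "trips_today": 2},
    {"driver_id": 1001, "event_timestamp": datetime(2021, 3, 1, 17), "trips_today": 6},
    {"driver_id": 1002, "event_timestamp": datetime(2021, 3, 1, 12), "trips_today": 3},
    {"driver_id": 1003, "event_timestamp": datetime(2021, 2, 27, 8), "trips_today": 9},
]


def latest_in_range(rows, start, end):
    """Return the most recent row per driver with start <= event_timestamp < end."""
    latest = {}
    for row in rows:
        if not (start <= row["event_timestamp"] < end):
            continue  # outside the scanned time range
        current = latest.get(row["driver_id"])
        if current is None or row["event_timestamp"] > current["event_timestamp"]:
            latest[row["driver_id"]] = row
    return latest


latest = latest_in_range(offline_rows, datetime(2021, 3, 1), datetime(2021, 3, 2))
print({d: r["trips_today"] for d, r in sorted(latest.items())})
```

No entity ids are supplied up front: the time range alone selects the rows, and driver 1003 is left out because its only row falls outside the range.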

For what it's worth @michcio1234, there is an active discussion on the Feast mailing list around this functionality. It would be great if you could join that discussion and throw in your 2c.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.