https://eugeneyan.com/writing/feature-stores/ #34

Open utterances-bot opened 3 years ago

utterances-bot commented 3 years ago

Feature Stores - A Hierarchy of Needs

Access, serving, integrity, convenience, autopilot; use what you need.

https://eugeneyan.com/writing/feature-stores/

eggie5 commented 3 years ago

Data point for Grubhub recsys: a common periodic offline job creates features for both training and serving (eliminating skew). Features are snapshotted to Hive for access (sharing) and published to Cassandra for serving, with heavy run-time feature caching on the serving side. Integrity is maintained via ad-hoc monitoring/alerting with Datadog, and lineage is tracked via ml-metadata.

Grubhub's recsys use case is a little simpler than some, as we currently don't support real-time features computed between offline job runs (typically where you see Flink et al. applied). Other groups, like logistics, might use online feature generation.
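
For readers who want the pattern in concrete form, here is a minimal sketch of "one offline job, two sinks, cached reads". It is not Grubhub's code: the plain dicts stand in for the Hive snapshot and the Cassandra table, and every name (`compute_features`, `run_offline_job`, `get_features`, the sample orders, the snapshot ID) is hypothetical.

```python
from functools import lru_cache

# Hypothetical raw events: (user_id, order_total)
RAW_ORDERS = [("u1", 20.0), ("u1", 30.0), ("u2", 12.5)]


def compute_features(orders):
    """Single feature definition shared by training and serving."""
    totals, counts = {}, {}
    for user_id, total in orders:
        totals[user_id] = totals.get(user_id, 0.0) + total
        counts[user_id] = counts.get(user_id, 0) + 1
    return {
        user_id: {
            "order_count": counts[user_id],
            "avg_order_total": totals[user_id] / counts[user_id],
        }
        for user_id in totals
    }


def run_offline_job(orders, snapshot_store, online_store, snapshot_id):
    """Periodic batch job: compute once, write to both sinks."""
    features = compute_features(orders)
    snapshot_store[snapshot_id] = features  # stand-in for a Hive snapshot
    online_store.clear()
    online_store.update(features)           # stand-in for a Cassandra table
    return features


SNAPSHOTS = {}  # read by training jobs
ONLINE = {}     # read by the serving path
run_offline_job(RAW_ORDERS, SNAPSHOTS, ONLINE, "2021-01-01")


@lru_cache(maxsize=10_000)
def get_features(user_id):
    """Serving-time lookup with an in-process cache in front of the store."""
    row = ONLINE.get(user_id)
    # Return an immutable value so cached entries cannot be mutated by callers.
    return None if row is None else tuple(sorted(row.items()))
```

Because training reads the snapshot and serving reads the published copy of the same computation, the two cannot drift apart between runs. The cache would need to be cleared (`get_features.cache_clear()`) after each job run, otherwise serving keeps returning the previous run's values.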

amommendes commented 3 years ago

great framework to think about feature store implementation

hovinh commented 3 years ago

Thank you for the great post, Eugene. I have a few follow-up questions:

Thank you :)

eugeneyan commented 3 years ago

tobycheese commented 1 year ago

Hi Eugene, would it be possible to add a publication date at the top of your articles? Statements such as "Last month, Splice Machine, a big data platform, launched its own feature store too." only make sense with a date. Thanks!