feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

Support for ingesting flat json from text files and streams #140

Closed. tims closed this issue 5 years ago

tims commented 5 years ago

Is your feature request related to a problem? Please describe.

JSON is a very common data format, but we don't support ingesting arbitrary JSON from files (only FeatureRow protos encoded as JSON).

Describe the solution you'd like

I propose we support ingesting flat JSON from files and streams (Kafka + pubsub).

By flat JSON, I mean objects such as {"a":1, "b":2}, without any nesting. For convenience, any nested objects or arrays that are declared as features will be treated as strings (keeping their JSON encoding).
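To make the flat JSON rule above concrete, here is a minimal Python sketch of how one line of input might be turned into feature values. It is only an illustration, not Feast's ingestion code: the function name `flatten_declared_features` and the `declared` list of feature names are hypothetical. Note that this sketch re-serializes nested values, so whitespace may differ from the original input.

```python
import json


def flatten_declared_features(line, declared):
    """Parse one JSON object and keep only the declared feature fields.

    Hypothetical helper for illustration. Nested objects and arrays are
    kept as JSON-encoded strings rather than being expanded.
    """
    record = json.loads(line)
    if not isinstance(record, dict):
        raise ValueError("expected one JSON object per line")

    row = {}
    for name in declared:
        if name not in record:
            continue  # a missing field is simply skipped in this sketch
        value = record[name]
        if isinstance(value, (dict, list)):
            # Nested value: store its compact JSON encoding as a string.
            value = json.dumps(value, separators=(",", ":"))
        row[name] = value
    return row


flat = flatten_declared_features('{"a":1, "b":2}', ["a", "b"])
nested = flatten_declared_features(
    '{"a": 1, "b": {"x": [1, 2]}, "c": "not declared"}', ["a", "b"]
)
```

Here `flat` keeps both values unchanged, while in `nested` the field `b` becomes a string holding the JSON encoding of the object, and the undeclared field `c` is dropped.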

I also suggest we remove support for ingesting FeatureRow protos encoded as JSON in text files. It's a quirky format that we only generate for logging and testing stores, and we shouldn't impose it on people; we should ingest the formats they would typically use themselves.

Describe alternatives you've considered

We could later add support for extracting features from deeply nested JSON, but I don't think we should do this in the first pass.

tims commented 5 years ago

Done for pubsub and text files in PR #141

Kafka pending

woop commented 5 years ago

Thanks @tims. Let's reopen this once the requirement exists.