feast-dev / feast

The Open Source Feature Store for Machine Learning
https://feast.dev
Apache License 2.0

Update Java Client to Feast 0.3 API for Online Serving #257

Closed woop closed 5 years ago

woop commented 5 years ago

We need to update the Java SDK for online serving based on the changed serving API. This issue can track the proposed design and discussion.

davidheryanto commented 5 years ago

The current implementation of the Java SDK for Feast (which only supports online feature retrieval for now) is in commit 718fd2737ae8ac7.

Example usage:

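import com.gojek.feast.v1alpha1.FeastClient;
// Row is assumed here to live in the same package as FeastClient
import com.gojek.feast.v1alpha1.Row;
import java.time.Instant;
import java.util.Arrays;
import java.util.List;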

// Pass the host and port for Feast Online Serving server
FeastClient client = FeastClient.create("localhost", 6566);

// Feature id follows this format <feature_set_name>:<version>:<feature_name>
List<String> requestedFeatureIds = Arrays.asList("driver:1:driver_id", "driver:1:driver_name");

// Sample request to retrieve only entities with key: `driver_id: 123` and `driver_id: 456`
// For `driver_id: 123`, additionally put a condition that the event timestamp has a maximum
// value of 1 minute before current time, else the feature value will be null.
List<Row> requestedRows = Arrays.asList(
  Row.create().set("driver_id", 123).setEntityTimestamp(Instant.now().minusSeconds(60)),
  Row.create().set("driver_id", 456));

// The returned rows will contain "driver_name" features
List<Row> retrievedFeatures = client.getOnlineFeatures(requestedFeatureIds, requestedRows);
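
The feature id format mentioned in the comment above, <feature_set_name>:<version>:<feature_name>, can be illustrated with a small parser. This is only a sketch: FeatureIdSketch and FeatureId are hypothetical names that are not part of the Feast SDK, and the version is assumed to be an integer because the example uses "1".

```java
/** Hypothetical helper for illustration only; not part of the Feast SDK. */
public class FeatureIdSketch {

    /** The three parts of a feature id. */
    public static final class FeatureId {
        public final String featureSetName;
        public final int version; // assumption: versions are integers, as in "driver:1:driver_id"
        public final String featureName;

        FeatureId(String featureSetName, int version, String featureName) {
            this.featureSetName = featureSetName;
            this.version = version;
            this.featureName = featureName;
        }
    }

    /** Splits "<feature_set_name>:<version>:<feature_name>" into its parts. */
    public static FeatureId parse(String id) {
        // A limit of -1 keeps trailing empty parts, so "driver:1:" is rejected below.
        String[] parts = id.split(":", -1);
        if (parts.length != 3 || parts[0].isEmpty() || parts[2].isEmpty()) {
            throw new IllegalArgumentException(
                "Expected <feature_set_name>:<version>:<feature_name>, got: " + id);
        }
        // Integer.parseInt throws NumberFormatException, a subclass of
        // IllegalArgumentException, when the version is not a number.
        return new FeatureId(parts[0], Integer.parseInt(parts[1]), parts[2]);
    }

    public static void main(String[] args) {
        FeatureId id = parse("driver:1:driver_name");
        System.out.println(id.featureSetName + " / " + id.version + " / " + id.featureName);
    }
}
```

Validating ids on the client side like this would surface malformed requests before they reach the serving endpoint, though whether the SDK should do so is a design choice left open here.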

Row represents a single row of data in Feast. It contains a timestamp and a set of fields (column names mapped to column values). The design of Row follows that of the Row types in Apache Beam and Spark.

Following the example above, the client can retrieve the value of the driver_name feature like so:

// The number of retrieved rows always matches the number of requested rows for online
// feature retrieval, so index 0 holds the features for driver "123".
// The client should know the data type of the feature it is retrieving.
String driverName = retrievedFeatures.get(0).getString("driver_name");

// The client can also retrieve all the fields available in the Row,
// where Value here represents the "feast.types.ValueProto.Value" type
Map<String, Value> fieldNameToValue = retrievedFeatures.get(0).getFields();

The SDK could additionally provide utilities, or extend Row, so that the returned features can be easily converted to common tabular data formats such as DMatrix, which is commonly used with the XGBoost library.
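
As a rough sketch of what such a conversion utility might look like, the example below flattens rows into a row-major float array with an explicit shape, which is the kind of input dense-matrix types generally accept. Everything here is hypothetical: DenseMatrixSketch and toDenseMatrix are illustrative names, each row is stood in for by a plain Map<String, Double> rather than the SDK's Row, and representing missing features as NaN is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the Feast SDK. */
public class DenseMatrixSketch {

    /** Row-major values plus the shape needed to interpret them. */
    public static final class DenseMatrix {
        public final float[] data;
        public final int rows;
        public final int cols;

        DenseMatrix(float[] data, int rows, int cols) {
            this.data = data;
            this.rows = rows;
            this.cols = cols;
        }
    }

    /**
     * Flattens rows into a row-major float array, one column per requested feature.
     * A feature that is absent or null in a row becomes Float.NaN (an assumption).
     */
    public static DenseMatrix toDenseMatrix(List<Map<String, Double>> rows, List<String> columns) {
        int cols = columns.size();
        float[] data = new float[rows.size() * cols];
        for (int r = 0; r < rows.size(); r++) {
            for (int c = 0; c < cols; c++) {
                Double value = rows.get(r).get(columns.get(c));
                data[r * cols + c] = (value == null) ? Float.NaN : value.floatValue();
            }
        }
        return new DenseMatrix(data, rows.size(), cols);
    }

    public static void main(String[] args) {
        Map<String, Double> first = new HashMap<>();
        first.put("rating", 4.5);
        first.put("trips", 120.0);
        Map<String, Double> second = new HashMap<>();
        second.put("rating", 3.9); // "trips" is missing for this row

        List<Map<String, Double>> rows = new ArrayList<>(Arrays.asList(first, second));
        DenseMatrix m = toDenseMatrix(rows, Arrays.asList("rating", "trips"));
        System.out.println(m.rows + " x " + m.cols + ": " + Arrays.toString(m.data));
    }
}
```

Fixing the column order through an explicit list keeps the matrix layout stable across requests, which matters when the same layout must match what a model was trained on.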

woop commented 5 years ago

Thanks @davidheryanto. We can iterate on the API over time. Closing this issue for now. Nice job.