volga-project / volga

Feature Engine for real-time AI/ML
Apache License 2.0

[SE][API] Properly compile OperatorNodeGraph to streaming JobGraph #34

Open anovv opened 6 months ago

anovv commented 6 months ago

This consists of two parts:

  1. Build a full stream graph that includes all of the target Dataset's dependencies (https://github.com/anovv/volga/blob/master/volga/client/client.py#L123).
  2. If a Dataset that the target depends on is already computed and in storage, update the corresponding part of the streaming graph: replace its streaming nodes with source nodes, so values are read from storage rather than recomputed.
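
To make the two parts concrete, here is a minimal sketch of the idea, not Volga's actual implementation. `DatasetNode`, `compile_job_graph`, the `materialized` set, and the `kind` labels are all hypothetical names introduced for illustration; the real `OperatorNodeGraph` and `JobGraph` types will differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DatasetNode:
    # Hypothetical stand-in for a node of the OperatorNodeGraph.
    name: str
    depends_on: List[str] = field(default_factory=list)


def compile_job_graph(
    target: str,
    datasets: Dict[str, DatasetNode],
    materialized: Set[str],
) -> Dict[str, dict]:
    """Return {dataset name: {"kind": ..., "inputs": [...]}} for the target."""
    job_graph: Dict[str, dict] = {}

    def visit(name: str) -> None:
        if name in job_graph:
            return  # already compiled (shared dependency)
        if name in materialized:
            # Part 2: read from storage and do not descend into upstream nodes.
            job_graph[name] = {"kind": "storage_source", "inputs": []}
            return
        node = datasets[name]
        # Part 1: walk every dependency before adding the node itself.
        for dep in node.depends_on:
            visit(dep)
        # Assumption: a Dataset without dependencies is an external connector.
        kind = "stream" if node.depends_on else "connector"
        job_graph[name] = {"kind": kind, "inputs": list(node.depends_on)}

    visit(target)
    return job_graph


datasets = {
    "events": DatasetNode("events"),
    "sessions": DatasetNode("sessions", ["events"]),
    "user_stats": DatasetNode("user_stats", ["sessions"]),
}

full_graph = compile_job_graph("user_stats", datasets, materialized=set())
pruned_graph = compile_job_graph("user_stats", datasets, materialized={"sessions"})
```

In this sketch, `pruned_graph` no longer contains `events`: once `sessions` is marked as materialized, everything upstream of it drops out of the streaming job.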

This may require implementing a Features Metadata Store to track which values are already computed and which are missing (similar to what Usman proposed).
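
As a rough illustration of what such a store might need to answer, the sketch below tracks computed time ranges per Dataset and reports the gaps. It is an in-memory toy under stated assumptions: the class name, method names, and the use of half-open integer intervals are all hypothetical and not taken from Volga or from the referenced proposal.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end), e.g. epoch seconds


class InMemoryFeaturesMetadataStore:
    """Hypothetical store: which time ranges of a Dataset are computed."""

    def __init__(self) -> None:
        self._computed: Dict[str, List[Interval]] = defaultdict(list)

    def mark_computed(self, dataset: str, start: int, end: int) -> None:
        # Insert the new range, then merge overlapping or touching ranges.
        intervals = sorted(self._computed[dataset] + [(start, end)])
        merged: List[Interval] = []
        for s, e in intervals:
            if merged and s <= merged[-1][1]:
                merged[-1] = (merged[-1][0], max(merged[-1][1], e))
            else:
                merged.append((s, e))
        self._computed[dataset] = merged

    def missing(self, dataset: str, start: int, end: int) -> List[Interval]:
        # Return the sub-ranges of [start, end) that are not yet computed.
        gaps: List[Interval] = []
        cursor = start
        for s, e in self._computed.get(dataset, []):
            if e <= cursor:
                continue  # range lies entirely before the cursor
            if s >= end:
                break  # range lies entirely after the query window
            if s > cursor:
                gaps.append((cursor, s))
            cursor = max(cursor, e)
            if cursor >= end:
                break
        if cursor < end:
            gaps.append((cursor, end))
        return gaps


store = InMemoryFeaturesMetadataStore()
store.mark_computed("sessions", 0, 10)
store.mark_computed("sessions", 20, 30)
gaps = store.missing("sessions", 5, 25)
```

A graph compiler could treat a Dataset as readable from storage only when `missing(...)` returns an empty list for the requested range, and fall back to recomputing it otherwise.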