volga-project / volga

Feature Engine for real-time AI/ML
Apache License 2.0

[On-Demand][Storage] Create a look up actor for Datasets #29

Open anovv opened 5 months ago

anovv commented 5 months ago

In the current implementation, the client queries storage directly. This is fine when querying happens outside feature definitions, which is the case for online/offline features (@pipeline annotations).

On-demand features are different: we can call Dataset.get(keys=...) inside an @on_demand function to fetch a data entry and use it in further calculations. This call should delegate to a separate service/actor rather than going through the client.
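A rough sketch of the delegation idea, to make the intent concrete. Everything here is hypothetical: LookupActor, Dataset.bind and the in-memory storage dict are made up for illustration, a plain class stands in for the real actor, and the on-demand function is written without the @on_demand decorator since its exact signature is not covered here.

```python
from typing import Any, Dict, List, Optional


class LookupActor:
    """Hypothetical stand-in for the proposed look-up actor; it owns storage access."""

    def __init__(self, storage: Dict[str, Dict[str, Any]]):
        # Assumed layout: dataset name -> {key: entry}
        self._storage = storage

    def get(self, dataset: str, keys: List[str]) -> List[Optional[Any]]:
        table = self._storage.get(dataset, {})
        # Missing keys resolve to None instead of raising
        return [table.get(k) for k in keys]


class Dataset:
    """Toy Dataset whose get() delegates to the actor instead of a client."""

    _actor: Optional[LookupActor] = None

    def __init__(self, name: str):
        self.name = name

    @classmethod
    def bind(cls, actor: LookupActor) -> None:
        # Hypothetical wiring step; a real system would discover the actor itself
        cls._actor = actor

    def get(self, keys: List[str]) -> List[Optional[Any]]:
        if Dataset._actor is None:
            raise RuntimeError("no look-up actor bound")
        return Dataset._actor.get(self.name, keys)


Dataset.bind(LookupActor({"users": {"user_1": {"balance": 10}}}))
users = Dataset("users")


def balance_with_bonus(user_id: str) -> int:
    # Body of an on-demand feature: fetch one entry, then compute on it
    entry = users.get(keys=[user_id])[0]
    return entry["balance"] + 5
```

The point of the indirection is that the feature function only ever sees Dataset.get, so the look-up path can be swapped or scaled without touching feature code.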

We also want to be able to aggregate many separate Dataset.get calls into a single Dataset.get_many call in the context of offline on-demand features. This will reduce load on cold storage during offline materialization jobs.
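One possible shape for the aggregation, as a minimal sketch. BatchingLookup, fake_get_many and COLD_STORAGE are hypothetical names, and the get_many signature (list of keys in, dict of key to entry out) is an assumption rather than the actual API.

```python
from typing import Any, Callable, Dict, Hashable, List

# Assumed shape of a bulk look-up: list of keys in, {key: entry} out
GetMany = Callable[[List[Hashable]], Dict[Hashable, Any]]


class BatchingLookup:
    """Collects individual key requests and resolves them with one bulk call."""

    def __init__(self, get_many: GetMany):
        self._get_many = get_many
        self._pending: List[Hashable] = []
        self.storage_calls = 0  # counts round trips to storage

    def request(self, key: Hashable) -> None:
        # Deduplicate so each key is fetched at most once per batch
        if key not in self._pending:
            self._pending.append(key)

    def flush(self) -> Dict[Hashable, Any]:
        if not self._pending:
            return {}
        keys, self._pending = self._pending, []
        self.storage_calls += 1
        return self._get_many(keys)


# Toy cold storage used only for this illustration
COLD_STORAGE = {"user_1": {"balance": 10}, "user_2": {"balance": 25}}


def fake_get_many(keys: List[Hashable]) -> Dict[Hashable, Any]:
    return {k: COLD_STORAGE.get(k) for k in keys}


lookup = BatchingLookup(fake_get_many)
for key in ["user_1", "user_2", "user_1"]:
    lookup.request(key)
results = lookup.flush()
```

Three requests end up as a single storage call here. In an offline materialization job the flush would presumably happen once per batch of rows, though where exactly to flush is an open design question.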

The look-up actor will also need some sort of load-balancing mechanism.
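The balancing strategy is not decided yet. As one simple option, a round-robin router over several look-up replicas could look like the sketch below; RoundRobinRouter and CountingReplica are hypothetical names, and round-robin is only an assumption picked for illustration.

```python
import itertools
from typing import Any, List, Optional, Sequence


class CountingReplica:
    """Hypothetical look-up replica that only records how often it is called."""

    def __init__(self, name: str):
        self.name = name
        self.calls = 0

    def get(self, keys: List[str]) -> List[Optional[Any]]:
        self.calls += 1
        return [None for _ in keys]


class RoundRobinRouter:
    """Hands out replicas in a fixed rotating order."""

    def __init__(self, replicas: Sequence[CountingReplica]):
        if not replicas:
            raise ValueError("at least one replica is required")
        self._cycle = itertools.cycle(list(replicas))

    def pick(self) -> CountingReplica:
        return next(self._cycle)


replicas = [CountingReplica(f"lookup-{i}") for i in range(3)]
router = RoundRobinRouter(replicas)
for _ in range(7):
    router.pick().get(["some_key"])
calls = [r.calls for r in replicas]
```

Round-robin ignores how expensive each request is, so a least-loaded or key-hash based scheme may fit better once real traffic patterns are known.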