In the current implementation, the client queries storages directly. This is fine when querying happens outside feature definitions, which is the case for online/offline features (@pipeline annotations).
On-demand features are different: Dataset.get(keys=...) can be called inside an @on_demand function to fetch a data entry for further calculation. That call should delegate to a separate service/actor instead of calling the client.
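A minimal sketch of that delegation, assuming the read path can be injected per execution context. Everything here is hypothetical and only illustrates the idea: `SketchDataset` is a toy stand-in for the real Dataset class, `InMemoryReaderService` stands in for the separate service/actor, and `run_on_demand` stands in for whatever executes @on_demand functions.

```python
from contextvars import ContextVar
from typing import Any, Callable, Dict, Hashable, Optional, Protocol


class Reader(Protocol):
    """Anything that can serve a single-key read for a named dataset."""

    def get(self, dataset: str, key: Hashable) -> Any: ...


class InMemoryReaderService:
    """Hypothetical stand-in for the separate service/actor that owns storage access."""

    def __init__(self, tables: Dict[str, Dict[Hashable, Any]]) -> None:
        self._tables = tables

    def get(self, dataset: str, key: Hashable) -> Any:
        return self._tables[dataset].get(key)


# The reader that serves Dataset.get calls in the current execution context.
_active_reader: ContextVar[Optional[Reader]] = ContextVar("active_reader", default=None)


class SketchDataset:
    """Toy stand-in for Dataset: get() goes to the active reader, never to the client."""

    def __init__(self, name: str) -> None:
        self.name = name

    def get(self, keys: Hashable) -> Any:
        reader = _active_reader.get()
        if reader is None:
            raise RuntimeError("get() called outside an on-demand execution context")
        return reader.get(self.name, keys)


def run_on_demand(fn: Callable[..., Any], reader: Reader, *args: Any) -> Any:
    """Run an on-demand function with `reader` installed as the read path."""
    token = _active_reader.set(reader)
    try:
        return fn(*args)
    finally:
        _active_reader.reset(token)


users = SketchDataset("users")


def age_bucket(user_id: int) -> str:
    # Body of a would-be @on_demand function: it only talks to the dataset handle.
    row = users.get(keys=user_id)
    return "adult" if row["age"] >= 18 else "minor"
```

With this shape the on-demand function never sees the client; swapping the in-memory reader for a remote service or actor would not change the feature code.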
We also want to be able to aggregate many separate Dataset.get calls into a single Dataset.get_many call in the context of offline on-demand features. This will reduce the load on cold storage in offline materialization jobs.
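One possible way to do this aggregation is sketched below, assuming on-demand functions can run as asyncio coroutines. `GetBatcher` is a hypothetical helper, and its `get_many` argument stands in for something like Dataset.get_many; it is assumed to take a list of keys and return a `{key: value}` dict, which may not match the real signature.

```python
import asyncio
from typing import Any, Callable, Dict, Hashable, List


class GetBatcher:
    """Collects individual get() calls and serves them with one get_many() call."""

    def __init__(self, get_many: Callable[[List[Hashable]], Dict[Hashable, Any]]) -> None:
        self._get_many = get_many  # assumed signature: list of keys -> {key: value}
        self._pending: Dict[Hashable, asyncio.Future] = {}
        self._flush_scheduled = False

    async def get(self, key: Hashable) -> Any:
        loop = asyncio.get_running_loop()
        fut = self._pending.get(key)
        if fut is None:
            # First request for this key in the current batch.
            fut = loop.create_future()
            self._pending[key] = fut
        if not self._flush_scheduled:
            # Flush once, after the callers that are already scheduled have registered.
            self._flush_scheduled = True
            loop.call_soon(self._flush)
        return await fut

    def _flush(self) -> None:
        pending, self._pending = self._pending, {}
        self._flush_scheduled = False
        try:
            values = self._get_many(list(pending))
        except Exception as exc:
            # Propagate the storage error to every waiting caller.
            for fut in pending.values():
                if not fut.done():
                    fut.set_exception(exc)
            return
        for key, fut in pending.items():
            if not fut.done():
                fut.set_result(values.get(key))  # missing keys resolve to None


if __name__ == "__main__":
    def fake_get_many(keys: List[Hashable]) -> Dict[Hashable, Any]:
        print("cold storage hit with keys:", keys)
        return {k: f"row-{k}" for k in keys}

    async def main() -> None:
        batcher = GetBatcher(fake_get_many)
        rows = await asyncio.gather(*(batcher.get(k) for k in ["a", "b", "a", "c"]))
        print(rows)

    asyncio.run(main())
```

Duplicate keys within a batch share one future, so cold storage sees each key at most once per flush. A real implementation would probably also cap the batch size and flush on a timer.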
This service will also need some form of load-balancing mechanism.