Open · tokoko opened 1 month ago
> The same as above but instead of creating a new component, we can reuse OfflineServer to do the request handling. This is slightly awkward from the naming perspective, but probably makes the most sense in terms of usage/maintenance.
ATM the OfflineServer was designed to implement the OfflineStore interface only. Adding a method to write to the online stores would introduce an unplanned dependency and raise some concerns:

- Would it use a (remote) online_store, which would raise the original issue again?
- If we reused the OfflineServer, then we'd need an actual online_store configuration as well, which was not planned for that server (a remote online store was designed instead). Not sure about any possible side effects.

Why aren't we using the /materialize-incremental endpoint on the FeatureServer instead (and adding a new endpoint for non-incremental jobs)? This would avoid any "transport batches and batches of potentially huge datasets" concern, as the work would happen on the server itself (which would use the remote offline_store to pull_latest_from_table_or_query over the Flight protocol).
Otherwise, I'd be in favour of a dedicated MaterializationServer (with a remote offline_store and a provided online_store), which could still be designed as a "lightweight fastapi server", if I've understood the materialization flow correctly.
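For illustration, here is a minimal sketch of what that server-side flow could look like, whether exposed as a new non-incremental endpoint next to /materialize-incremental on the FeatureServer or as a dedicated MaterializationServer. The endpoint path, request model, and repo wiring are assumptions for the sake of discussion, not existing Feast APIs; only FeatureStore.materialize itself is a real call.

```python
# Sketch only: a lightweight FastAPI wrapper around FeatureStore.materialize.
# The /materialize endpoint, request fields, and repo layout are hypothetical;
# the server's feature_store.yaml is assumed to point at the remote offline
# store and the concrete online store, so large batches never leave the server.
from datetime import datetime
from typing import List, Optional

from fastapi import FastAPI
from pydantic import BaseModel

from feast import FeatureStore

app = FastAPI()
store = FeatureStore(repo_path=".")  # server-side repo config


class MaterializeRequest(BaseModel):
    start_ts: datetime
    end_ts: datetime
    feature_views: Optional[List[str]] = None  # optional subset to materialize


@app.post("/materialize")  # hypothetical non-incremental counterpart to /materialize-incremental
def materialize(req: MaterializeRequest) -> dict:
    # Pulls the latest rows from the offline store and writes them to the
    # online store entirely inside the server process; RBAC can be enforced here.
    store.materialize(
        start_date=req.start_ts,
        end_date=req.end_ts,
        feature_views=req.feature_views,
    )
    return {"status": "ok"}
```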
@tokoko do we want to evaluate this one? Any further comments on which solution to apply?
Is your feature request related to a problem? Please describe.

We already have an option to run online/offline store queries remotely through the feature server and offline server, respectively. This way, RBAC rules are applied to those operations. One piece that's missing is materialization. There are several ways to do this:

1. Keep materialization local, but rely on the remote online/offline engines to apply RBAC rules. This is currently impossible because the remote online client doesn't implement the online_write_batch method. Even if we did implement it in the feature server itself, we would essentially be using a lightweight fastapi server to transport batches and batches of potentially huge datasets.
2. Create a remote materialization engine that defers the whole materialization call to a backend server and applies RBAC rules there. We can create another server component, MaterializationServer, that will receive these requests.
3. The same as above, but instead of creating a new component, we can reuse OfflineServer to do the request handling. This is slightly awkward from the naming perspective, but probably makes the most sense in terms of usage/maintenance (a client-side sketch follows this list).
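To make options 2 and 3 a bit more concrete, below is a rough sketch of the client side: the materialization call is deferred to the backend (the OfflineServer in option 3) over Arrow Flight, so RBAC is enforced on the server and no feature data flows through the client. The "materialize" action name and JSON payload are hypothetical, not part of any existing Feast or OfflineServer protocol.

```python
# Sketch of option 3 from the client's perspective: defer materialization to
# the OfflineServer via an Arrow Flight action. The action name and payload
# are illustrative assumptions, not part of the current Feast protocol.
import json
from datetime import datetime, timedelta

import pyarrow.flight as flight


def remote_materialize(server: str, start: datetime, end: datetime) -> None:
    client = flight.FlightClient(server)
    payload = json.dumps(
        {"start_ts": start.isoformat(), "end_ts": end.isoformat()}
    ).encode()
    # do_action sends a one-shot command; the server performs the actual
    # offline read and online_write_batch locally and only returns a status.
    for _ in client.do_action(flight.Action("materialize", payload)):
        pass  # drain any result messages from the server


if __name__ == "__main__":
    remote_materialize(
        "grpc://offline-server.example.com:8815",
        start=datetime.utcnow() - timedelta(days=1),
        end=datetime.utcnow(),
    )
```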
I'd probably go with option 3 as a starting point.