Open franciscojavierarceo opened 1 week ago
The torch feature is nice. I guess we need to relax the "timestamp" constraints in our APIs, since it probably doesn't make much sense to attach a timestamp to an embedding feature?
The method `store.get_online_features(...)` returns an `OnlineResponse` object that has some conversion methods like `to_dict()` and `to_df()`. Should this suggestion be implemented as another conversion method like `to_torch()` or something like this?
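One way to picture the `to_torch()` idea is a small helper that works on the column-oriented dictionary returned by `to_dict()`. The sketch below is an assumption, not existing Feast API: `features_to_rows` and `to_torch` are hypothetical names, the `sample` dictionary only stands in for a real response, and `torch` is imported lazily so the row-building step runs without it.

```python
from typing import Dict, List, Sequence


def features_to_rows(
    features: Dict[str, Sequence[float]], columns: Sequence[str]
) -> List[List[float]]:
    """Turn a column-oriented feature dict into row-major lists.

    `features` mimics the shape of OnlineResponse.to_dict():
    one key per feature, one value per requested entity row.
    """
    lengths = {len(features[name]) for name in columns}
    if len(lengths) > 1:
        raise ValueError("all selected features must have the same number of rows")
    # zip(*columns) transposes the selected columns into rows.
    return [list(map(float, row)) for row in zip(*(features[name] for name in columns))]


def to_torch(features: Dict[str, Sequence[float]], columns: Sequence[str]):
    """Hypothetical conversion method: returns a 2-D float32 tensor."""
    import torch  # optional dependency, imported only when needed

    return torch.tensor(features_to_rows(features, columns), dtype=torch.float32)


# Stand-in for store.get_online_features(...).to_dict()
sample = {
    "driver_id": [1001, 1002],
    "conv_rate": [0.5, 0.25],
    "acc_rate": [0.75, 1.0],
}
rows = features_to_rows(sample, ["conv_rate", "acc_rate"])
print(rows)  # [[0.5, 0.75], [0.25, 1.0]]
```

The caller picks the columns explicitly, so entity keys such as `driver_id` stay out of the tensor. Missing values (`None`) are not handled in this sketch and would need a fill policy first.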
> The torch feature is nice. I guess we need to relax the "timestamp" constraints in our APIs, since it probably doesn't make much sense to attach a timestamp to an embedding feature?

Agreed.
@breno-costa that code is a serialization step though. We would want to treat Torch Tensors (or xgb.DMatrix) as a first-class data type.
The concrete examples I'm thinking of are one-hot encoding or impact encoding. It'd be useful for us to handle this for MLEs natively, especially when handling unseen categories.
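To make the unseen-category point concrete, here is a minimal one-hot sketch in plain Python. It is only an illustration of the kind of logic clients currently write themselves: `SimpleOneHot` is a hypothetical name, and reserving the last column for unknown values is one possible policy among several, not something Feast does today.

```python
from typing import Dict, Iterable, List


class SimpleOneHot:
    """Minimal one-hot encoder with a reserved slot for unseen categories."""

    def __init__(self) -> None:
        self.index: Dict[str, int] = {}

    def fit(self, values: Iterable[str]) -> "SimpleOneHot":
        # Sort so the column order is deterministic across runs.
        self.index = {v: i for i, v in enumerate(sorted(set(values)))}
        return self

    def transform(self, values: Iterable[str]) -> List[List[int]]:
        width = len(self.index) + 1  # last column flags unseen categories
        rows = []
        for v in values:
            row = [0] * width
            row[self.index.get(v, width - 1)] = 1
            rows.append(row)
        return rows


encoder = SimpleOneHot().fit(["sf", "nyc", "sf", "la"])
encoded = encoder.transform(["nyc", "tokyo"])
print(encoded)  # [[0, 1, 0, 0], [0, 0, 0, 1]]
```

Here `"tokyo"` was never seen during `fit`, so it lands in the extra last column instead of raising an error or silently changing the matrix width between training and serving.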
This plus sparse tensors/sparse matrices could be a really cool optimization -- less data, faster io, more powerful API.
> This plus sparse tensors/sparse matrices could be a really cool optimization -- less data, faster io, more powerful API.

Exactly.
If we can leverage "arrow" as our primary format, then it can be directly converted to pandas/torch with Arrow APIs, I believe.
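As a rough illustration of that path: `pyarrow.Table.to_pandas()` is a direct Arrow API, while the torch route sketched here goes through NumPy (`ChunkedArray.to_numpy()` followed by `torch.from_numpy`). That second step is an assumption about how one might wire it, not a dedicated Arrow-to-torch call. The column names and the helpers `arrow_to_matrix`, `arrow_to_torch`, and `demo` are made up for the example, and imports are lazy so each conversion needs only its own dependency.

```python
def arrow_to_matrix(table, columns):
    """Stack the selected Arrow columns into a 2-D NumPy array (rows x columns)."""
    import numpy as np

    return np.column_stack([table.column(name).to_numpy() for name in columns])


def arrow_to_torch(table, columns):
    """Convert the selected Arrow columns to a torch tensor via NumPy."""
    import torch

    return torch.from_numpy(arrow_to_matrix(table, columns))


def demo():
    """Build a tiny Arrow table and convert it to pandas and to a NumPy matrix."""
    import pyarrow as pa

    table = pa.table({"conv_rate": [0.5, 0.25], "acc_rate": [0.75, 1.0]})
    df = table.to_pandas()  # requires pandas to be installed
    matrix = arrow_to_matrix(table, ["conv_rate", "acc_rate"])
    return df, matrix
```

Calling `arrow_to_torch(table, [...])` on the same table would yield a float64 tensor of shape `(2, 2)`. Columns containing nulls or mixed types would need extra handling that this sketch leaves out.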
Cool, I'll check that out. This is basically the next step after vector support toward making NLP a first-class citizen.
Is your feature request related to a problem? Please describe.
We should allow Feature Views to return matrices/tensors natively, for example `torch.tensors`. At the moment, for some features we require the client to serialize the output into a matrix before running inference. Feast should support executing these transformations and serializing the data into matrices for both online and offline retrieval.
Describe the solution you'd like
Describe alternatives you've considered
The alternative is not supporting this, which is the current state and leaves users to write their own brittle logic to handle various complexities.
Additional context
@HaoXuAI @tokoko I know we discussed sklearn pipelines in the past and I thought I'd share my thoughts.