Open shaunc opened 2 years ago
Thanks for filing an issue!
I believe this isn't supported yet, but we have heard requests for it. It should be relatively easy to support, though, by automatically adding a dummy timestamp on the user's behalf when generating training data.
Would you accept a PR for this? Haven't checked yet -- but it seems plausible that it would be simple for Feast itself to create a fake timestamp -- I see `event_timestamp_column` is typed `Optional[str] = ''`. I presume that the empty string means "sniff data for timestamp column." Perhaps if this were explicitly passed as `None` you could automatically generate the timestamp as "oldest possible"?
Yeah of course. PR here would definitely be encouraged!
Slightly trickier might be changing the point-in-time queries to use the generated dummy timestamp. Let me know if you need help debugging / getting a test environment set up!
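To illustrate why a dummy timestamp interacts with point-in-time queries, here is a minimal pure-Python sketch of such a lookup. It is not Feast's implementation: `DUMMY_TIMESTAMP`, `point_in_time_lookup`, and the sample rows are hypothetical names and data chosen for illustration. The idea is that a row stamped with the "oldest possible" time is never later than any requested time, so it always qualifies.

```python
from datetime import datetime, timezone

# Hypothetical sentinel: the "oldest possible" timestamp given to static rows.
DUMMY_TIMESTAMP = datetime(1970, 1, 1, tzinfo=timezone.utc)


def point_in_time_lookup(feature_rows, entity_key, as_of):
    """Return the features of the latest row for entity_key at or before as_of.

    feature_rows is a list of (entity_key, event_timestamp, features) tuples.
    Returns None when no row for the entity qualifies.
    """
    candidates = [
        row for row in feature_rows
        if row[0] == entity_key and row[1] <= as_of
    ]
    if not candidates:
        return None
    # The most recent qualifying row wins.
    return max(candidates, key=lambda row: row[1])[2]


# Static data (e.g. geographical areas) stamped with the dummy timestamp.
area_features = [
    ("area_1", DUMMY_TIMESTAMP, {"population_density": 120.5}),
    ("area_2", DUMMY_TIMESTAMP, {"population_density": 43.0}),
]

result = point_in_time_lookup(
    area_features, "area_1", datetime(2021, 6, 1, tzinfo=timezone.utc)
)
```

One thing that would likely need attention in a real change: if the query also bounds how far back it looks (for example via a feature view's `ttl`), rows stamped this far in the past could be filtered out unless the dummy timestamp is special-cased.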
Great! ... Though I'm at least a couple weeks away from actually doing anything -- right now researching and writing an engineering plan. (Know of any integrations with kedro? :))
cc @felixwang9817 and @samuel100 btw who were discussing this too
Don't think anyone's tried to integrate with Kedro either, but would love to see someone give it a stab :)
[We will be using kedro as "glue" ... we run workflows in argo-workflow, and kedro can build them for us. Obviously Feast will help us out with our feature metadata -- a current project has 4000-some-odd features and ... some are definitely broken! :). Feast could help us keep track of what they are and where they come from; great-expectations perhaps can tell us e.g. if they don't have the right invariant properties for our ML... which we would like to get back into the same postgres database which has the feast metadata. The question is if the ultimate source of authority is python code, and kedro is gluing things together, we need to figure out how to wrap Feast in a kedro-aware way or vice-versa.... Anyway -- I should start a different issue for that I guess once I am more opinionated. :)]
This is great @shaunc. I'd love to get a few more details about your use case. If we have that then we can spend a bit of time figuring out what an integration would look like. Integrating with upstream data tooling is certainly something we've spoken about a lot before.
Thanks for the encouragement! ... Give me a few days to think; I'll create a new issue with more details and further thoughts.
Quick ping on this. Any further thoughts on this?
We still think this is a good idea -- but we decided to focus our efforts on experiment tracking first -- see kedro-dvc. Feature management will still be an issue, so I'm planning on circling back -- probably in the June timeframe. If you want to move forward yourself or if you hear of any other work, I'm all ears, though! :)
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Ran into the same situation as well.
Is your feature request related to a problem? Please describe.
I am researching how to integrate Feast with my model building process. We typically have some data sources that describe objects that have no timestamp. A canonical example is a geographical area. In theory such areas have changing characteristics, but for the problem scope for which we build a model, they are considered to be immutable, and our data contains no timestamp field for them. However, your documentation says Feast uses a time-series data model to represent data.
Describe the solution you'd like
I would like a way to represent data sources which have no timestamp. Alternately, if this is already possible, I suggest making the documentation clearer on this issue.
Describe alternatives you've considered
I could add dummy timestamps to data files.
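As a rough sketch of that workaround, assuming the data lives in CSV files, the snippet below appends a constant timestamp column to every row. The column name `event_timestamp`, the epoch sentinel, the helper `add_dummy_timestamp`, and the sample data are all assumptions for illustration; in-memory buffers stand in for real files.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical sentinel value used for every row of the static data source.
DUMMY_TIMESTAMP = datetime(1970, 1, 1, tzinfo=timezone.utc)


def add_dummy_timestamp(src, dst, column="event_timestamp"):
    """Copy CSV rows from src to dst, appending a constant timestamp column."""
    reader = csv.DictReader(src)
    fieldnames = list(reader.fieldnames or []) + [column]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for row in reader:
        row[column] = DUMMY_TIMESTAMP.isoformat()
        writer.writerow(row)


# In-memory stand-ins for the input and output files.
source = io.StringIO("area_id,population_density\narea_1,120.5\narea_2,43.0\n")
output = io.StringIO()
add_dummy_timestamp(source, output)
```

The drawback is that every static data file has to be rewritten this way, which is the manual step the feature request is trying to avoid.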