Open lefnire opened 4 years ago
Did some digging around the links above. facebook/Prophet's multivariate + feature_importances support is fairly experimental/limited, but worth a shot. There's no aspect of cause/effect in Prophet; it's just a (powerful) time series forecasting model.
But there are real cause-effect models! Looks like Judea Pearl really made a splash. I need to read the paper under "cause/effect models" above and continue investigating those projects. They produce DAGs (directed acyclic graphs), which I could walk up through each Field's parents to find its original cause - awesome. What's unclear is which of these are time-series aware; they all seem static (comparing fields<->fields within each day). Also, uber/causalml, which seems the most active/powerful, requires "treatments" as an input variable, which I understood to be an output, while microsoft/dowhy is just wildly complex/confusing. So lots of reading to do.
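A rough sketch of the "walk to each Field's parent" idea, assuming the DAG is already available as a plain dict mapping each field to its direct causes. The `parents` dict, the field names, and the `root_causes` helper are all hypothetical; none of this is the DoWhy, CausalNex, or causalml API.

```python
from typing import Dict, List, Set


def root_causes(parents: Dict[str, List[str]], field: str) -> Set[str]:
    """Return the ancestors of `field` that have no parents of their own.

    `parents` maps each node to its direct causes (hypothetical input format).
    """
    roots: Set[str] = set()
    seen: Set[str] = set()
    stack = [field]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # a node can be reachable via several paths in a DAG
        seen.add(node)
        node_parents = parents.get(node, [])
        if not node_parents and node != field:
            roots.add(node)  # nothing upstream: treat it as an original cause
        stack.extend(node_parents)
    return roots


# Made-up example graph: caffeine -> sleep -> mood, exercise -> mood
example_parents = {
    "mood": ["sleep", "exercise"],
    "sleep": ["caffeine"],
    "exercise": [],
}
mood_roots = root_causes(example_parents, "mood")
```

A field with no parents at all returns an empty set here, since it has no upstream cause to report.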
Candidates: DoWhy or CausalNex. There's a ChatGPT convo which shows how to go about this (lots of how-to code in there; it could be a simple weekend project!)
Turns out the current implementation (XGBoost's feature_importances) isn't too far off the mark, given how I'm rolling the time series, so I'm gonna punt on this since it's halfway decent. But the next step should be a move towards DoWhy or CausalNex. Would be great to save not only the influencer score (importances, like we see now), but also an exported graph.png of the Bayesian graph, showing flow from A->B->C (uploaded to S3 for that user).
Currently using a so-so correlation setup via XGBRegressor on time-lagged field-entries, 5-day windows. It's pretty bad theory, but works decently in practice. We'll want to move to something more solid for time-series analysis.
Darts (which includes Prophet) (try this one first)
XGBoost (current setup)
Facebook Prophet
Cause/effect models
Other
lime (this is for general model-explaining)
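For reference, here's a minimal sketch of what the current setup (XGBRegressor on time-lagged field-entries, 5-day windows) might look like. This is a guess at the shape of it, not the actual project code: the function names, the dict-of-lists input, and the hyperparameters are all assumptions. The lag-building part is plain Python; the model part needs `numpy` and `xgboost` installed.

```python
from typing import Dict, List, Tuple


def build_lagged_rows(
    series: Dict[str, List[float]], target: str, window: int = 5
) -> Tuple[List[List[float]], List[float], List[str]]:
    """Features for day t are every field's values on days t-window..t-1;
    the label is the target field's value on day t."""
    names = sorted(series)
    n_days = len(series[target])
    if any(len(series[name]) != n_days for name in names):
        raise ValueError("all fields must cover the same days")
    # Column order matches the slice below: oldest lag first, lag 1 last.
    columns = [f"{name}_lag{lag}" for name in names for lag in range(window, 0, -1)]
    X: List[List[float]] = []
    y: List[float] = []
    for t in range(window, n_days):
        row: List[float] = []
        for name in names:
            row.extend(series[name][t - window:t])
        X.append(row)
        y.append(series[target][t])
    return X, y, columns


def lagged_importances(
    series: Dict[str, List[float]], target: str, window: int = 5
) -> Dict[str, float]:
    """Fit XGBRegressor on the lagged rows and sum importances per field."""
    import numpy as np  # imported lazily so the helper above has no dependencies
    from xgboost import XGBRegressor

    X, y, columns = build_lagged_rows(series, target, window)
    model = XGBRegressor(n_estimators=50, max_depth=3)  # arbitrary settings
    model.fit(np.array(X), np.array(y))
    per_field: Dict[str, float] = {}
    for column, score in zip(columns, model.feature_importances_):
        field = column.rsplit("_lag", 1)[0]
        per_field[field] = per_field.get(field, 0.0) + float(score)
    return per_field
```

Note that the target's own past values are included as features here, so a field can show up as its own strongest "influencer"; whether that's wanted is a design choice.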