mozilla / docker-etl

Collection of dockerized ETL jobs managed by data engineering.
Mozilla Public License 2.0
19 stars, 15 forks

Implement SKLearn interface #272

Closed by jaredsnyder 1 month ago

jaredsnyder commented 2 months ago

Changes:

Checklist for reviewer:

jaredsnyder commented 2 months ago

Here's a notebook to validate the PR. We're not getting an exact match on the search forecasts, but @m-d-bowerman and I have concluded that the models match and that the difference comes from how Prophet sets its random seed: https://colab.research.google.com/drive/1dLeLUz_99ln9PC1AG-izZILj9-zIJHmJ#scrollTo=70upJ3eUTvkh
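To illustrate the kind of mismatch described above, here is a minimal sketch that assumes the forecasts involve random sampling. It is not Prophet's code: `simulate_forecast`, the trend values, and the noise scale are all hypothetical stand-ins. It only shows that an identical model gives identical output under the same seed and slightly different output under a different seed.

```python
import random
import statistics


def simulate_forecast(trend, noise_scale, n_samples, seed):
    """Hypothetical Monte Carlo forecast: mean of noisy draws around a fixed trend."""
    rng = random.Random(seed)  # all randomness flows from this seed
    forecast = []
    for point in trend:
        draws = [point + rng.gauss(0.0, noise_scale) for _ in range(n_samples)]
        forecast.append(statistics.fmean(draws))
    return forecast


# The same "fitted model" (made-up trend values) is used in every run.
trend = [100.0, 102.0, 104.0]

run_a = simulate_forecast(trend, noise_scale=5.0, n_samples=1000, seed=1)
run_b = simulate_forecast(trend, noise_scale=5.0, n_samples=1000, seed=2)
run_a_again = simulate_forecast(trend, noise_scale=5.0, n_samples=1000, seed=1)

same_seed_matches = run_a == run_a_again   # identical seed, identical output
different_seed_matches = run_a == run_b    # different seed, output differs
max_gap = max(abs(a - b) for a, b in zip(run_a, run_b))

print(same_seed_matches, different_seed_matches, round(max_gap, 3))
```

With this setup the gap between runs stays small relative to the trend, which is the pattern you would expect when two implementations match and only the seeding differs.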

jaredsnyder commented 2 months ago

Note on the validation: https://docs.google.com/document/d/1kG75iCFHSxBYVz6EcaYhozOZ9KfK7ncKvB5YfOmaB6I/edit?usp=sharing

jaredsnyder commented 1 month ago

Re: code complexity: yeah, that is definitely the downside of trying to "promote" models with segments so they're easier to use. I can take another pass at documenting and commenting the code so it's easier to work with, and I can brainstorm ways to clean it up. We could also meet to come up with something, if you think that'd be useful.
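For readers following along, here is a rough sketch of what a scikit-learn-style `fit`/`predict` wrapper over per-segment models could look like. It is not the code in this PR: `MeanModel`, `SegmentedModel`, and the segment names are hypothetical, and the per-segment model is a trivial mean forecaster standing in for something like a Prophet wrapper.

```python
from statistics import fmean


class MeanModel:
    """Hypothetical stand-in for a per-segment forecaster."""

    def fit(self, values):
        self.mean_ = fmean(values)
        return self  # scikit-learn convention: fit returns the estimator

    def predict(self, horizon):
        # Forecast the historical mean for every future step.
        return [self.mean_] * horizon


class SegmentedModel:
    """Hypothetical wrapper: one model per segment behind a single fit/predict API."""

    def __init__(self, model_factory):
        self.model_factory = model_factory

    def fit(self, rows):
        # rows is an iterable of (segment, value) pairs.
        grouped = {}
        for segment, value in rows:
            grouped.setdefault(segment, []).append(value)
        # Fit an independent model on each segment's values.
        self.models_ = {
            segment: self.model_factory().fit(values)
            for segment, values in grouped.items()
        }
        return self

    def predict(self, horizon):
        # Returns a dict mapping each segment to its list of forecasts.
        return {
            segment: model.predict(horizon)
            for segment, model in self.models_.items()
        }


# Made-up segments and values, for illustration only.
rows = [("desktop", 10.0), ("desktop", 14.0), ("mobile", 3.0), ("mobile", 5.0)]
forecasts = SegmentedModel(MeanModel).fit(rows).predict(horizon=2)
print(forecasts)
```

The wrapper hides the grouping from the caller, which makes it easier to use, but the bookkeeping it adds is where the extra complexity comes from.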

jaredsnyder commented 1 month ago

Another thing I want to look into is using Darts (https://unit8co.github.io/darts/), which might eliminate a lot of the wrapper code around Prophet, and maybe some of the data-handling code too.

bochocki commented 1 month ago

Darts does look neat! I tried to evaluate it as part of the KPI model selection exercise that we used to decide on Prophet, but at the time they didn't have M1 support, and that was enough of a blocker for local development that I didn't explore it further.