Closed · jaredsnyder closed this 3 months ago
Made a notebook to compare outputs. For most of the rows (where the optimization lands on the same parameters), the difference is within 1%, but there are some significant outliers. I'm not sure whether this is inevitable given the stochastic nature of the prediction. Notably, we didn't have this issue with `prophet_forecast`.
https://colab.research.google.com/drive/1NxD5eVwZ0Vw3UhZtZoIev5Q44qEEjSTJ#scrollTo=CjguJpSKsv7k
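A minimal sketch of the kind of row-by-row comparison the notebook does; names like `old_df`, `new_df`, and `yhat` are hypothetical placeholders, and the actual notebook may differ:

```python
import pandas as pd

def flag_outliers(old_df: pd.DataFrame, new_df: pd.DataFrame,
                  value_col: str = "yhat", tol: float = 0.01) -> pd.DataFrame:
    """Return rows whose relative difference between branches exceeds tol (1%)."""
    merged = old_df[[value_col]].join(
        new_df[[value_col]], lsuffix="_old", rsuffix="_new", how="inner"
    )
    merged["rel_diff"] = (
        (merged[f"{value_col}_new"] - merged[f"{value_col}_old"]).abs()
        / merged[f"{value_col}_old"].abs()
    )
    # Rows above the tolerance are the "significant outliers" mentioned above.
    return merged[merged["rel_diff"] > tol]
```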
The seed is set in both branches, so the results should be the same. Investigating.
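For context, a sketch of the kind of seeding being checked; the actual calls in `funnel_forecast` may differ, and if the model samples through another library's RNG, that one must be seeded too or runs will diverge despite matching seeds here:

```python
import random
import numpy as np

SEED = 42  # hypothetical value; what matters is that both branches use the same one
random.seed(SEED)     # stdlib RNG
np.random.seed(SEED)  # numpy's global RNG, used by many forecasting libraries
```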
Validation NB is updated and all tests pass. Two changes were necessary:

* A bug we fixed was switching the values in `_add_regressor` so that 1 is assigned when True and 0 when False; in main it is the other way around. This did change the values, usually by small amounts but occasionally by more. It may be worth revisiting whether this change is worth it. To check, I reverted this change to produce the `_branch_0s` tables (both fixes are sketched after this list).
* There was another bug involving using the aggregate function on strings. I updated the function and added some tests.
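A hypothetical sketch of both fixes, assuming the regressor column is boolean and the aggregation runs over mixed-dtype columns; the real functions in the repo may differ:

```python
import pandas as pd

def _add_regressor(df: pd.DataFrame, col: str) -> pd.DataFrame:
    # After the fix: True -> 1, False -> 0 (main had the mapping inverted).
    out = df.copy()
    out[col] = out[col].astype(bool).astype(int)
    return out

def safe_aggregate(df: pd.DataFrame, by: str) -> pd.DataFrame:
    # Summing string columns either raises or silently concatenates,
    # so sum only numeric columns and keep the first value otherwise.
    agg_map = {
        c: ("sum" if pd.api.types.is_numeric_dtype(df[c]) else "first")
        for c in df.columns if c != by
    }
    return df.groupby(by).agg(agg_map)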
@jaredsnyder I added a bit to the notebook, checking for differences between values, and all differences were 0. The addition is right before the components section; could you have a look and make sure I didn't goof up the merge or the sets I used?
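A sketch of the kind of zero-difference check described, for anyone reviewing the merge; the key and value column names are hypothetical:

```python
import pandas as pd

def check_no_diffs(a: pd.DataFrame, b: pd.DataFrame,
                   keys: list, col: str) -> None:
    # Compare key sets first: rows present on only one side would silently
    # drop out of an inner merge and hide a real difference.
    set_a = set(a.set_index(keys).index)
    set_b = set(b.set_index(keys).index)
    assert set_a == set_b, f"key mismatch: {set_a ^ set_b}"
    merged = a.merge(b, on=keys, suffixes=("_a", "_b"))
    diffs = (merged[f"{col}_a"] - merged[f"{col}_b"]).abs()
    assert (diffs == 0).all(), "found nonzero differences"
```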
@m-d-bowerman: that's because the tables you were looking at were generated with that change reverted. I loaded the tables with the change applied and made some plots to compare the differences here: https://colab.research.google.com/drive/1NxD5eVwZ0Vw3UhZtZoIev5Q44qEEjSTJ#scrollTo=Look_at_distribution_of_differences_when_changing_1_to_0
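A sketch of the kind of plot linked above, assuming `diffs` is a pandas Series of per-row deltas between the run with the change and the run without it:

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_diff_distribution(diffs: pd.Series) -> None:
    """Histogram of per-row deltas between the two runs."""
    plt.hist(diffs.dropna(), bins=50)
    plt.xlabel("value with change - value without change")
    plt.ylabel("row count")
    plt.title("Distribution of differences after flipping 1 and 0")
    plt.show()
```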
Updates look good :shipit: 🦜
Adds unit tests for `funnel_forecast` and does some light refactoring.
Checklist for reviewer:
* Check that nothing in this PR (e.g., changes to `.circleci/config.yml`) will cause environment variables (particularly credentials) to be exposed in test logs