Closed laresbernardo closed 1 week ago
Thanks for updating @laresbernardo. Would you please test the current change vs the main branch on refresh to see if everything is identical? They should be identical if, when you compare the JSONs, the `hyper_values` lists have the same values. Also, when testing refresh, make sure to test 2 separate refresh steps (rf1 & rf2) to see if the whole chaining etc. works properly. Thanks!
Hey @gufengzhou
Yes, they give the same `hyper_values`. The decimals are rounded somewhere, so we get 0.5769212583 vs 0.5769213, but it's looking good.
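For reference, a quick base-R sanity check on that rounding difference, using the two values quoted above (the 1e-6 tolerance is an arbitrary choice for illustration):

```r
# The two rounded exports of the same hyperparameter value
main_val    <- 0.5769212583
refresh_val <- 0.5769213

# Exact comparison fails, but a tolerant comparison treats them as identical
identical(main_val, refresh_val)                            # FALSE
isTRUE(all.equal(main_val, refresh_val, tolerance = 1e-6))  # TRUE
```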
The issue here is that `Robyn$listInit$OutputCollect$hyper_updated` is not available when you start building the first refresh model. That is why the user was having issues and the run crashed. We only have `Robyn$listInit$OutputCollect$hyper_fixed`, which is TRUE, and `attr(, "hypParamSamName")`, which has the names of the updated hyperparameters (including penalties, etc.). By changing the code a little bit I was able to get around this error, but the following question arises: should the penalty/lambda hyperparameters be fixed, or should we allow the model to find new ones within the (nonexistent) bounds freedom? If they are fixed, I can commit a change that will fix it all; if not, then I have to fix it so that maybe we use the fixed value +/- the bounds freedom for those specific hyperparameters. Let me know your thoughts...
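A minimal sketch of the workaround described above, using a mock of the refresh object's state (the hyperparameter names are placeholders, and `oc` stands in for `Robyn$listInit$OutputCollect`; this is not Robyn's actual code):

```r
# Mock the state described above: hyper_updated is missing, while hyper_fixed
# is TRUE and carries the hyperparameter names as an attribute
hyper_fixed <- TRUE
attr(hyper_fixed, "hypParamSamName") <- c("tv_S_alphas", "tv_S_gammas", "lambda")
oc <- list(hyper_fixed = hyper_fixed)  # note: no hyper_updated element

# Guard: when hyper_updated is absent, recover the names from hyper_fixed's attr
hyper_updated <- oc$hyper_updated
if (is.null(hyper_updated) && isTRUE(oc$hyper_fixed)) {
  hyp_names <- attr(oc$hyper_fixed, "hypParamSamName")
}
hyp_names  # "tv_S_alphas" "tv_S_gammas" "lambda"
```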
Thanks for digging. Theoretically, refresh should also iterate through penalty; we just haven't had users going this far so far.
Alright, did some tweaks all around and I feel more confident on refreshing results. I've added a `bounds_freedom` parameter so that users can overwrite the flexibility of hyperparameter bounds for refreshed models (the default remains the same).

@gufengzhou can you please check the fitted vs actual calculations when refreshing? For some reason, the baseline is a large number and the predictions are small numbers (as if they hadn't been scaled; check `rowSums(xDecompVec)`), so you get almost a straight line. Funny thing is that we haven't specifically changed anything there... unless I'm getting a terrible fit because of the hyperparameter constraints(?)
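To illustrate the fixed value +/- bounds-freedom idea from the discussion above (the function name and the 0.1 default are illustrative only, not Robyn's actual API):

```r
# Widen a fixed hyperparameter value into a search range by +/- bounds_freedom,
# expressed as a fraction of the fixed value (illustrative sketch)
widen_bounds <- function(fixed_value, bounds_freedom = 0.1) {
  c(low  = fixed_value * (1 - bounds_freedom),
    high = fixed_value * (1 + bounds_freedom))
}
widen_bounds(0.5769213)  # low = 0.51922917, high = 0.63461343
```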
Hi @laresbernardo
Can you check this issue as well? I am getting very different results when doing a data refresh (for 2 weeks of data): https://github.com/facebookexperimental/Robyn/issues/985
If you check report_aggregated.csv, you'll see that the intercepts are wrong. The intercept was dropped in the initial model but not dropped in the refresh. The dropping of the intercept happens in the refit function within `robyn_mmm` @laresbernardo
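The intercept-drop behavior described above can be mimicked with a toy refit (an illustration using `lm` on simulated data, not Robyn's actual glmnet-based refit code):

```r
# Toy version of the rule: fit with an intercept first; if the fitted intercept
# comes out negative, refit without it so the decomposition stays consistent
set.seed(42)
x <- rnorm(50)
y <- 2 * x + rnorm(50, sd = 0.1)  # data simulated with a true intercept of 0
fit <- lm(y ~ x)
if (coef(fit)[["(Intercept)"]] < 0) {
  fit <- lm(y ~ x + 0)            # refit with the intercept dropped
}
coef(fit)
```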
Hi @laresbernardo @gufengzhou This is the same issue that I am facing as well : https://github.com/facebookexperimental/Robyn/issues/985 If you need any help with respect to testing, I can help you with this. (Robyn refresh is flawed currently because of this)
Ok. It looks good now. @amanrai2508 would you mind updating Robyn to this branch (`Robyn::robyn_update(ref = "bl02")` + refreshing your R session) and testing the refresh functionality? Before we merge I'd like to hear from you. Thanks!
Hi @laresbernardo Now the refresh looks good (for some data it is off, but for most of the data it is working). Just wanted to know why we are plotting the one-pager of a solution ID whose topsolution flag is FALSE (though the results are similar in this solution as well).
Could you please check if model 1_142_5 is the solution with the lowest DECOMP.RSSD error across all Pareto-front models? The top solutions are the minimum combined error models per cluster, which may not match.
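In table form, the selection logic described above boils down to something like this (mock data except for model 1_142_5; the column names only loosely follow Robyn's Pareto output):

```r
# Mock Pareto-front summary: the lowest-DECOMP.RSSD model can differ from the
# per-cluster "top solutions" (minimum combined error within each cluster)
pareto <- data.frame(
  solID       = c("1_142_5", "2_87_3", "3_55_1"),  # only 1_142_5 is real
  decomp.rssd = c(0.021, 0.034, 0.029),
  top_sol     = c(FALSE, TRUE, TRUE)
)
pareto$solID[which.min(pareto$decomp.rssd)]  # "1_142_5", not a top solution
```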
Yeah, it has the lowest DECOMP.RSSD. It looks fine on my end @laresbernardo, now the values are more credible.
Fix on #960