Closed bart-vanvlerken closed 1 year ago
Hi @bart-vanvlerken, thanks for reporting this. I'm seeing 10 years of daily data for the first time. It surprises me that the impact is so large. I can imagine it might be because of the for loop in adstocking. How about using only the last 3-4 years? It's enough for trend/season.
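The for loop in adstocking mentioned above is a plausible culprit: a loop-based adstock sums over all past periods at each step, so its cost grows roughly with the square of the series length, and ~10 years of daily data (~3,650 rows) is about 100x the work of one year. A minimal NumPy sketch of that idea (a generic Weibull-PDF adstock, not Robyn's actual implementation; the scale handling is a simplified assumption):

```python
import numpy as np

def weibull_pdf_adstock(x, shape, scale):
    """Generic Weibull-PDF adstock sketch (hypothetical, for illustration).

    Weights over lags 1..T follow a Weibull PDF; `scale` is interpreted
    as a fraction of the series length (a simplifying assumption).
    """
    T = len(x)
    t = np.arange(1, T + 1)
    theta = scale * T  # assumed scaling of the Weibull scale parameter
    w = (shape / theta) * (t / theta) ** (shape - 1) * np.exp(-(t / theta) ** shape)
    w = w / w.max()  # normalize peak weight to 1

    out = np.zeros(T)
    for i in range(T):
        # Each step sums over ALL past spend -> O(T^2) overall,
        # which is why doubling the history quadruples the work.
        out[i] = np.sum(x[: i + 1][::-1] * w[: i + 1])
    return out
```

With `T = 3650` the inner sum runs over up to 3,650 past values on every one of 3,650 steps, which matches the observation that long histories slow iterations down far more than proportionally.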
Yeah, I feel you. 7 trials on 5000 iterations takes 2-3 hours to train the model with Weibull parameters. I'm currently searching for the best window start/end range and the best hyperparameters to specify. So, with 3-4 shots per day, I think I'm stuck for at least a month, at least. It wasn't that way before, though. Maybe some update has affected something.
Sorry to hear that. Are you saying 7x5k was faster before? Or can you pinpoint which steps are slower than before?
@gufengzhou Yes, I've seen minimal change in trend/seasonality patterns when using 3-4 years of data compared to the whole dataset, so that's exactly what I'm doing now. Luckily it's no longer an issue for me, but I just wanted to give a heads-up that modelling takes significantly longer on a larger dataset. I ran an experiment last night, and a single trial with 2000 iterations (Weibull PDF) took 8 hours when using the full 10 years of data.
Hi everyone!
A client has daily data running back to 2013. I'm limiting my modelling window to the last year in order to capture current business practices, but using the rest of the data to calculate trend/seasonality figures. However, I found that using data this far back severely impacts modelling speed: with data running back to 2013, one iteration takes about half a minute, whereas if I include only the last year's worth of data (full dataset = modelling window), an iteration takes only about a second.

I appreciate that the extra data models seasonality/trend more accurately, but the massive increase in time required is not ideal. I understand that patience is the logical solution here, but I was wondering if something could be changed in the code to allow for a faster modelling process when including a lot of historical data. I only really use the historical data of my target variable to model seasonality/trend; the other variables (channels, context variables, etc.) are simply coded as 0, since they're not used for modelling seasonality/trend and they're not included in my modelling window anyway. Hope someone can share their insights or similar experiences on this :)
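The workaround that emerged in this thread is to trim the input to the most recent few years before passing it to the model, since 3-4 years is enough for trend/season. A minimal pandas sketch of that trimming step (column names and the placeholder KPI are hypothetical; this is generic preprocessing, not a Robyn API call):

```python
import pandas as pd

# Hypothetical daily dataset spanning 2013-2022.
df = pd.DataFrame({
    "date": pd.date_range("2013-01-01", "2022-12-31", freq="D"),
})
df["revenue"] = 1.0  # placeholder KPI column

# Keep only the most recent 4 years, enough for trend/seasonality.
cutoff = df["date"].max() - pd.DateOffset(years=4)
df_recent = df[df["date"] > cutoff].reset_index(drop=True)
```

The trimmed frame then replaces the full history as model input, which shrinks the series the adstock transform has to loop over on every iteration.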