facebook / prophet

Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
https://facebook.github.io/prophet
MIT License
18.33k stars 4.52k forks

fbprophet - solution for distributed computing #1662

Closed dharani-aws closed 3 years ago

dharani-aws commented 4 years ago

Hi @bletham,

We are using Prophet to analyze and forecast a large set of our application services (total service volume is in the millions per week). So far we have used it successfully for univariate as well as multivariate forecasting (add_regressor). As we add more applications to our ELK stack, the Python Flask server that triggers the Prophet modelling in the backend suffers from high latency, not to mention frequent HTTP 500 and 503 errors. We now need to figure out a way to handle this scenario. Could you please advise on the best architecture and tools to use?

Currently we are using a VM with 15 GB of memory. Most of the time the kernel killed our Python process because it consumed all of the memory plus swap. We restricted memory usage to one third of the available memory; this solved the kernel-kill problem, but the performance problem remains. Our use case is below; could you please suggest a better compute solution?

USE CASE:

Note: We are planning to batch-process the historical data in order to avoid on-demand modelling. Even for this we need distributed computing.

Awaiting your valuable feedback.

Regards,
Dharani

bletham commented 4 years ago

My personal experience is pretty tied to Facebook's internal computing infra, so I won't be very helpful in suggesting platforms for doing this most efficiently. Most of the applications I've worked on have not required on-demand forecasting, so latency isn't an issue: we've set up a recurring job that pulls the latest data, computes all of the forecasts, and stores them in a Hive table. To the extent that this satisfies your needs, it would be a lot simpler than on-demand.
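A minimal sketch of that kind of recurring batch job, for illustration only. SQLite stands in for the Hive table, and the `forecasts` table layout, `run_batch`, `read_forecast`, and `naive_forecast` are hypothetical names, not part of Prophet. `prophet_forecast` uses the standard `fit` / `make_future_dataframe` / `predict` calls and imports Prophet lazily, so the rest of the script runs without it installed.

```python
import sqlite3
from datetime import date, timedelta
from typing import Callable, Dict, List, Tuple

History = List[Tuple[str, float]]   # (ISO date, observed value)
Forecast = List[Tuple[str, float]]  # (ISO date, predicted value)


def prophet_forecast(history: History, periods: int) -> Forecast:
    """Fit one Prophet model and return only the future daily predictions."""
    import pandas as pd
    from prophet import Prophet  # the package was named `fbprophet` before 1.0

    df = pd.DataFrame(history, columns=["ds", "y"])
    df["ds"] = pd.to_datetime(df["ds"])
    model = Prophet()
    model.fit(df)
    future = model.make_future_dataframe(periods=periods, include_history=False)
    pred = model.predict(future)
    return [(d.date().isoformat(), float(v)) for d, v in zip(pred["ds"], pred["yhat"])]


def naive_forecast(history: History, periods: int) -> Forecast:
    """Stand-in model (repeats the last value) for trying the pipeline without Prophet."""
    last_day = date.fromisoformat(history[-1][0])
    last_value = history[-1][1]
    return [((last_day + timedelta(days=i)).isoformat(), last_value)
            for i in range(1, periods + 1)]


def run_batch(conn: sqlite3.Connection,
              series: Dict[str, History],
              periods: int,
              forecast_fn: Callable[[History, int], Forecast] = prophet_forecast) -> int:
    """Compute a forecast per series and upsert the rows; returns rows written."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forecasts ("
        "series_id TEXT, ds TEXT, yhat REAL, PRIMARY KEY (series_id, ds))"
    )
    written = 0
    for series_id, history in series.items():
        forecast = forecast_fn(history, periods)
        conn.executemany(
            "INSERT OR REPLACE INTO forecasts VALUES (?, ?, ?)",
            [(series_id, ds, yhat) for ds, yhat in forecast],
        )
        conn.commit()  # commit per series so a crash loses at most one series
        written += len(forecast)
    return written


def read_forecast(conn: sqlite3.Connection, series_id: str) -> Forecast:
    """What the web server would do at request time: a lookup, no model fitting."""
    return conn.execute(
        "SELECT ds, yhat FROM forecasts WHERE series_id = ? ORDER BY ds",
        (series_id,),
    ).fetchall()
```

The job would be triggered by a scheduler (cron or similar), and the Flask endpoint would only call something like `read_forecast`, so request latency no longer depends on model fitting.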

For on-demand forecasting, I have seen a service that spins off a separate job for each forecast request and then stores the results in a SQL database, along the lines of what you describe. But the memory limitation does sound like an issue: PyStan is a bit memory intensive, and I don't think there is anything you could do on the fbprophet side to be able to use a 2 GB container.
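A rough sketch of the job-per-request pattern, assuming a single SQL table that tracks each request. SQLite is used for illustration, and the `forecast_jobs` schema and the function names are hypothetical. How each job is actually launched (a subprocess, a task queue, a container per job) is left out; `load_history` and `forecast_fn` are placeholders for your own data access and Prophet call.

```python
import json
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE IF NOT EXISTS forecast_jobs (
    job_id    TEXT PRIMARY KEY,
    series_id TEXT NOT NULL,
    periods   INTEGER NOT NULL,
    status    TEXT NOT NULL,   -- 'pending', 'done' or 'failed'
    result    TEXT             -- JSON payload, NULL while pending
)
"""


def submit_request(conn, series_id, periods):
    """Called by the web handler: record the request and return immediately."""
    conn.execute(SCHEMA)
    job_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO forecast_jobs VALUES (?, ?, ?, 'pending', NULL)",
        (job_id, series_id, periods),
    )
    conn.commit()
    return job_id


def run_job(conn, job_id, load_history, forecast_fn):
    """Called inside the separate job: fit, then store the result or the error."""
    row = conn.execute(
        "SELECT series_id, periods FROM forecast_jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    if row is None:
        raise KeyError(job_id)
    series_id, periods = row
    try:
        forecast = forecast_fn(load_history(series_id), periods)
        status, result = "done", json.dumps(forecast)
    except Exception as exc:  # record the failure instead of surfacing an HTTP 500
        status, result = "failed", json.dumps({"error": str(exc)})
    conn.execute(
        "UPDATE forecast_jobs SET status = ?, result = ? WHERE job_id = ?",
        (status, result, job_id),
    )
    conn.commit()


def get_result(conn, job_id):
    """Called by the web handler when the client polls for the outcome."""
    row = conn.execute(
        "SELECT status, result FROM forecast_jobs WHERE job_id = ?", (job_id,)
    ).fetchone()
    if row is None:
        raise KeyError(job_id)
    status, result = row
    return status, (json.loads(result) if result is not None else None)
```

With this split the web server never fits a model itself, so a job that runs out of memory shows up as a `failed` row rather than taking the server down.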

There are some descriptions of how to use Prophet in Spark for high parallelism: #517 and #1283 might be helpful to you. There is also #686 about a Flask service, but it sounds like what you have is already more sophisticated.