The bug / how to reproduce
Set up a simulation run, where:
aimodel_ss.approach: ClassifGaussianProcess or RegrGaussianProcess,
aimodel_data_ss.autoregressive_n: 2, and
aimodel_data_ss.max_n_train: 500
Then run the simulation: pdr sim my_ppss.yaml. It runs successfully.
Then, in a separate console, run the plots: pdr sim_plots. It runs, the browser window pops open, and the browser tab title says "Updating", but the plots never populate. This is the problem.
However, if I set max_n_train: 100, the plots render successfully. I can inspect the model response plots, and they clearly show nonlinear response surfaces (good).
The same problem occurs for approach: ClassifXgboost or RegrXgboost.
Towards a solution
Datapoint: Both Gaussian Process and Xgboost models take up significant memory: 100x+ more than the linear models we've been using so far.
Datapoint: The sim engine pickles the model. The plots then use it for the model response surface plots.
Hypothesis: The model's size footprint is too large for the plots.
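To get a rough feel for how the pickled size could grow with max_n_train, here is a minimal sketch using toy stand-ins, not the actual pdr model classes. It assumes a Gaussian Process style model keeps an n_train x n_train matrix (typical for such implementations, but not verified for this codebase), so going from 100 to 500 training points means 25x more matrix entries. ToyLinearModel, ToyKernelModel, pickled_size and n_features=2 are all hypothetical names and values chosen for illustration.

```python
import pickle
import random


class ToyLinearModel:
    """Stand-in for a linear model: stores one weight per input feature."""

    def __init__(self, n_features: int):
        self.weights = [random.random() for _ in range(n_features)]


class ToyKernelModel:
    """Stand-in for a GP-like model: stores an n_train x n_train matrix."""

    def __init__(self, n_train: int):
        self.kernel = [
            [random.random() for _ in range(n_train)] for _ in range(n_train)
        ]


def pickled_size(obj) -> int:
    """Return the number of bytes needed to pickle obj."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


size_linear = pickled_size(ToyLinearModel(n_features=2))
size_100 = pickled_size(ToyKernelModel(n_train=100))
size_500 = pickled_size(ToyKernelModel(n_train=500))

print(f"linear:         {size_linear:>10,} bytes")
print(f"kernel, n=100:  {size_100:>10,} bytes")
print(f"kernel, n=500:  {size_500:>10,} bytes")
print(f"n=500 vs n=100: {size_500 / size_100:.1f}x")
```

Running the same pickled_size measurement on the real saved models (linear vs. Gaussian Process vs. Xgboost, at max_n_train 100 and 500) would confirm or refute the hypothesis with actual numbers.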
Where is the problem occurring? Possibilities include:
Cand A: The model is not pickled properly (due to its size); un-pickling then fails and everything freezes
Cand B: Too much bandwidth is needed to transmit the model data from the pdr sim_plots server to the Plotly / Dash process running in the browser
Cand C: The model makes it to the browser, but needs too much memory for a browser process to handle
Cand D: Something else?
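One way to narrow down Cand A would be to round-trip the model through pickle outside of the plots server, and record the payload size and timings. The sketch below is an assumption about how that check could look; check_pickle_roundtrip and fake_model are hypothetical, and the real saved model object would need to be substituted in.

```python
import pickle
import time


def check_pickle_roundtrip(obj):
    """Pickle and un-pickle obj; return (n_bytes, dump_seconds, load_seconds)."""
    t0 = time.perf_counter()
    payload = pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL)
    t1 = time.perf_counter()
    restored = pickle.loads(payload)
    t2 = time.perf_counter()
    # Basic sanity check that un-pickling produced the same kind of object.
    if type(restored) is not type(obj):
        raise TypeError("un-pickled object has a different type")
    return len(payload), t1 - t0, t2 - t1


# Hypothetical stand-in for the saved model; replace with the real object.
fake_model = {"kernel": [[float(i * j) for j in range(200)] for i in range(200)]}

n_bytes, dump_s, load_s = check_pickle_roundtrip(fake_model)
print(f"{n_bytes:,} bytes, dump {dump_s:.4f}s, load {load_s:.4f}s")
```

If the real model round-trips quickly and without errors here, Cand A becomes less likely, and the payload size gives a first number to weigh against Cand B and Cand C.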
TODO
Either fix the problem, or have a workaround for when it happens.
If a workaround, the outcome would be: all plots render properly, except that the model response plot is static (vs. interactive, based on clicking the var impacts bars).
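A possible shape for the workaround is a size guard that picks the static model response plot when the pickled model is too large. This is only a sketch under assumptions: model_response_mode and MAX_INTERACTIVE_BYTES are hypothetical names, the threshold value is a placeholder that would need to be found empirically, and how the result would be wired into the plots code is left open.

```python
import pickle

# Hypothetical threshold; the right value would need to be found empirically.
MAX_INTERACTIVE_BYTES = 1_000_000


def model_response_mode(model, max_bytes: int = MAX_INTERACTIVE_BYTES) -> str:
    """Return "interactive" if the pickled model is small enough, else "static"."""
    n_bytes = len(pickle.dumps(model, protocol=pickle.HIGHEST_PROTOCOL))
    return "interactive" if n_bytes <= max_bytes else "static"


# Toy stand-ins for a small (linear-like) and a large (kernel-like) model.
small_model = {"weights": [0.1, 0.2]}
large_model = {"kernel": [[0.0] * 500 for _ in range(500)]}

print(model_response_mode(small_model))
print(model_response_mode(large_model))
```

Guarding on pickled size keeps the check cheap and independent of the model type, so it would cover both the Gaussian Process and Xgboost cases.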