oceanprotocol / pdr-backend

Instructions & code to run predictoors, traders, more.
Apache License 2.0

[Bug, Sim plots] Plots don't work when Gaussian Process and > 200 training samples #1268

Open trentmc opened 1 week ago

trentmc commented 1 week ago

The bug / how to reproduce

Set up a simulation run, where:

Then run the simulation: pdr sim my_ppss.yaml. It runs successfully.

Then in a separate console, run plots: pdr sim_plots. It runs, the browser window pops open, it says "Updating" in the browser tab title bar, but nothing ever populates. This is the problem.

However, if I set max_n_train: 100 then the plots work. I can inspect the model response plots, and they clearly show nonlinear response surfaces (good).

The same problem occurs with approach: ClassifXgboost or RegrXgboost.

Towards a solution

Datapoint: Both Gaussian Process models and Xgboost models take up significant memory: 100x+ more than the linear models that we've been using so far.

Datapoint: the sim engine pickles the model. The plots then use the pickled model to draw the model response surface plots.

Hypothesis: the model's size footprint is too large for the plots
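One way to check this hypothesis is to measure how many bytes a model occupies once pickled, at different max_n_train values. The sketch below is only an illustration under stated assumptions: pickled_size_bytes, fake_linear_model and fake_gp_model are hypothetical names, and the two "models" are stand-in dicts, not the real pdr-backend objects. The Gaussian Process stand-in assumes the fitted model retains an n_train x n_train matrix, as common Gaussian Process implementations do.

```python
import pickle


def pickled_size_bytes(obj) -> int:
    """Return the number of bytes obj occupies once pickled."""
    return len(pickle.dumps(obj, protocol=pickle.HIGHEST_PROTOCOL))


# Hypothetical stand-ins, NOT the real pdr-backend models.
def fake_linear_model(n_features: int) -> dict:
    # A linear model keeps roughly one coefficient per feature.
    return {"coef": [0.1 * i for i in range(n_features)]}


def fake_gp_model(n_train: int) -> dict:
    # Assumption: the fitted GP keeps an n_train x n_train matrix,
    # so its pickled size grows roughly as n_train ** 2.
    return {
        "kernel_factor": [
            [float(i * n_train + j) for j in range(n_train)]
            for i in range(n_train)
        ]
    }


if __name__ == "__main__":
    print("linear:", pickled_size_bytes(fake_linear_model(10)), "bytes")
    for n_train in (100, 200, 500):
        size = pickled_size_bytes(fake_gp_model(n_train))
        print(f"gp, n_train={n_train}: {size} bytes")
```

Running the same helper on the actual pickled model from a sim run would show whether the size jumps between max_n_train: 100 and values above 200.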

Where is the problem occurring? Possibilities include:

TODO

Either fix the problem, or have a workaround for when it happens.
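If the size hypothesis holds, one possible workaround is to skip the model response surface plot when the pickled model exceeds a byte budget, so the rest of the plots still populate. This is a minimal sketch, not existing pdr-backend code: MAX_MODEL_BYTES, model_fits_plot_budget, response_surface_or_none and draw_fn are all hypothetical names, and the threshold value is an arbitrary placeholder.

```python
import pickle

# Hypothetical budget; the right value would need to be found by experiment.
MAX_MODEL_BYTES = 5_000_000


def model_fits_plot_budget(model, max_bytes: int = MAX_MODEL_BYTES) -> bool:
    """True if the pickled model is no larger than max_bytes."""
    return len(pickle.dumps(model)) <= max_bytes


def response_surface_or_none(model, draw_fn, max_bytes: int = MAX_MODEL_BYTES):
    """Call draw_fn(model) only when the model is small enough.

    draw_fn is a hypothetical callable that builds the response surface
    figure. Returns None when the plot is skipped, so the caller can show
    a placeholder instead of hanging.
    """
    if not model_fits_plot_budget(model, max_bytes):
        return None
    return draw_fn(model)
```

The caller would treat a None result as "plot unavailable for this model size" and render the remaining plots as usual.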