Closed benwulfe closed 1 year ago
Problem: XGBoost does not produce variance (edit: Nicholas says this is possible but likely technically too challenging to be worth investing in). Nicholas is not aware of any way to use XGBoost for this.
0: Put link to Gaussian blur code here
First step: analyze the variance in the current XGBoost results. Then investigate alternatives.
Nicholas thinks the parameters for Gaussian blur are not great. Radius might be too large.
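To make the radius concern concrete, here is a minimal sketch of one way a Gaussian blur can produce a local variance estimate: blur the values, blur their squares, and subtract. This is not the project's blur code (which is not linked yet). The function names, the 1-D transect and the sigma values are all made up for illustration, and the real pipeline presumably works on 2-D rasters.

```python
import math

def gaussian_kernel(sigma, truncate=3.0):
    """Normalized 1-D Gaussian weights out to `truncate` standard deviations."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (i / sigma) ** 2) for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(values, kernel):
    """Convolve with edge clamping (out-of-range indices reuse the border value)."""
    radius = len(kernel) // 2
    n = len(values)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)
            acc += w * values[j]
        out.append(acc)
    return out

def local_variance(values, sigma):
    """Var = blur(x^2) - blur(x)^2, clipped at zero against rounding error."""
    kernel = gaussian_kernel(sigma)
    mean = blur(values, kernel)
    mean_sq = blur([v * v for v in values], kernel)
    return [max(m2 - m * m, 0.0) for m, m2 in zip(mean, mean_sq)]

# Hypothetical transect: a flat region followed by a sharp step.
transect = [0.0] * 10 + [1.0] * 10
narrow = local_variance(transect, sigma=1.0)
wide = local_variance(transect, sigma=4.0)
print(narrow[0], wide[0])
```

With the narrow blur, the variance at the far end of the flat region stays at zero. With the wide blur, the step leaks variance into cells whose neighbourhood is actually uniform, which is the kind of effect an overly large radius would cause.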
Kriging does its own variance estimation. Variational regressors are a standard class of models used for this, and Nicholas' peers suggest this is the best approach. The class of variational regressors is large.
Variational models might be more challenging to serve, but might be the correct technical approach.
I tried a GPR model. The colab is here: https://colab.research.google.com/github/tnc-br/ddf-isoscapes/blob/main/gpr/gaussian_process_regressor.ipynb https://github.com/tnc-br/ddf-isoscapes/blob/main/gpr/gaussian_process_regressor.ipynb
It doesn't seem to interpret areas it wasn't trained on well:
That's not surprising. Kriging interpolation is the GPR model we are currently using as a baseline. I believe Christian at TNC owns that code. Have you made any progress with the ML variational inference approach?
-Nicholas
On Wed, Jun 7, 2023 at 3:16 PM Ruben Madera @.***> wrote:
I tried a GPR model. The colab is here:
https://github.com/tnc-br/ddf-isoscapes/blob/main/gpr/gaussian_process_regressor.ipynb
It doesn't seem to interpret areas it wasn't trained on well: image.png (view on web) https://github.com/tnc-br/ddf-isoscapes/assets/8887440/e485d738-580a-4d87-9092-0e9fe1fde5eb
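For anyone wondering why the GPR result away from the training data is "not surprising", here is a small self-contained sketch. It assumes a zero-mean GP with a unit-variance RBF kernel, which is not necessarily the kernel used in the notebook or in the Kriging baseline. The training values and helper names are hypothetical.

```python
import math

def rbf(a, b, length_scale=1.0):
    """Unit-variance RBF (squared exponential) kernel."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def gpr_predict(train_x, train_y, query, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at a single query point."""
    n = len(train_x)
    gram = [[rbf(train_x[i], train_x[j]) + (noise if i == j else 0.0)
             for j in range(n)] for i in range(n)]
    k_star = [rbf(query, x) for x in train_x]
    alpha = solve(gram, train_y)   # K^-1 y
    beta = solve(gram, k_star)     # K^-1 k*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(query, query) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var

# Hypothetical 1-D samples; real inputs would be coordinates and isotope values.
train_x = [0.0, 1.0, 2.0, 3.0]
train_y = [2.0, 2.5, 3.0, 2.5]
near = gpr_predict(train_x, train_y, 1.5)
far = gpr_predict(train_x, train_y, 20.0)
print("near:", near)
print("far:", far)
```

Inside the sampled range the prediction follows the data with low variance. Far from any sample the kernel terms vanish, so the mean falls back to the prior mean (zero here) and the variance returns to the prior variance. The model is reporting that it knows nothing there, not extrapolating.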
I merged the new variational model into the master branch. It can be viewed here: https://colab.research.google.com/github/tnc-br/ddf-isoscapes/blob/main/dnn/variational.ipynb
Next, I want to retrain the model with the newest UC Davis data.
As mentioned in #85, I did not see much improvement upon training with real (and calibrated) data.
dedupe
Right now, variance is computed with a Gaussian blur. We should revisit this and attempt a simple fully connected neural net with 2-3 layers and a variational regressor head.
We can take a look at XGBoost, but it is not recommended. More likely, a variational regressor that is not ML would be the most straightforward approach here.
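One common way to make a regressor head report variance is to have it output a mean and a log-variance per sample and train it with the Gaussian negative log-likelihood. The sketch below shows only that loss. It is an assumption about what such a head could look like, not a description of what variational.ipynb does, and the function names are made up.

```python
import math

def gaussian_nll(y, mean, log_var):
    """Negative log-likelihood of y under N(mean, exp(log_var))."""
    var = math.exp(log_var)
    return 0.5 * (math.log(2.0 * math.pi) + log_var + (y - mean) ** 2 / var)

def batch_loss(targets, means, log_vars):
    """Average NLL over a batch; this is what the head would minimize."""
    total = sum(gaussian_nll(y, m, lv) for y, m, lv in zip(targets, means, log_vars))
    return total / len(targets)

# Same prediction error (2.0), two different claimed variances.
confident = gaussian_nll(2.0, 0.0, math.log(1.0))
cautious = gaussian_nll(2.0, 0.0, math.log(4.0))
print(confident, cautious)
```

The loss penalizes a large error less when the model also predicts a large variance, and penalizes an inflated variance when the error is small, so the variance output is pushed toward the actual spread of the residuals. Predicting the log-variance keeps the variance positive without a constraint.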