Thanks for your interest in GPBoost!
For Gaussian processes, one needs to use an approximation for large data (not just in GPBoost). Otherwise, a matrix of size n x n is constructed, which overflows memory. You can try
gp_model = gpb.GPModel(gp_coords=X_train_coord, gp_approx ="vecchia")
(recommended)
or
gp_model = gpb.GPModel(gp_coords=X_train_coord, gp_approx ="tapering")
See also here for more details.
Depending on your computational resources and the data, a data set of size 180K might already be at the limit with gp_approx="vecchia". You will have to try... We are currently developing an alternative approximation that runs faster, for which 180K should be no problem (it should be ready in a few months).
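For illustration, here is a minimal sketch of how the Vecchia approximation can be combined with tree boosting in GPBoost. The data is synthetic and the hyperparameters (num_neighbors, learning_rate, etc.) are placeholder values, not recommendations for your data set:

```python
import numpy as np
import gpboost as gpb

# Synthetic data for illustration only
np.random.seed(1)
n = 10000
X_train = np.random.rand(n, 5)        # features for the tree-boosting part
X_train_coord = np.random.rand(n, 2)  # spatial coordinates for the GP
y_train = np.random.rand(n)

# Gaussian process with a Vecchia approximation to avoid forming the full n x n covariance matrix
gp_model = gpb.GPModel(gp_coords=X_train_coord, cov_function="exponential",
                       gp_approx="vecchia", num_neighbors=20)

# Combine the GP with tree boosting as usual
data_train = gpb.Dataset(X_train, y_train)
params = {"learning_rate": 0.05, "max_depth": 6, "verbose": 0}
bst = gpb.train(params=params, train_set=data_train,
                gp_model=gp_model, num_boost_round=100)
```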
Hello, I'm using GPBoost to train a house price model, and I'm getting this error.
The code I'm running is
I think it's due to the training data, which has shape (179782, 82). How can I train on a large data set then?