CliMA / CalibrateEmulateSample.jl

Stochastic Optimization, Learning, Uncertainty and Sampling
https://clima.github.io/CalibrateEmulateSample.jl/dev
Apache License 2.0

Advice/protection against oddities in training point sets #268

Open odunbar opened 9 months ago

odunbar commented 9 months ago

This arose in PR #265, for example.

We find that emulator training is sometimes problematic for a fixed data set, and that a small modification to that data set leads to massive improvements. Handling the training data set more carefully, e.g. by providing a cross-validation procedure or by constructing better train/validation splits from the provided points, may lead to more robust training.
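As a rough illustration of the cross-validation idea, here is a minimal, language-agnostic sketch written in Python with only the standard library. It is not the CalibrateEmulateSample.jl API: `k_fold_indices`, `cross_validate`, `fit_mean` and `predict_mean` are hypothetical names, and the "emulator" is a toy mean predictor standing in for a real one. The point is only that a large spread in validation error across folds would flag a training set whose fit is sensitive to small changes in the points.

```python
import random


def k_fold_indices(n_points, n_folds, seed=0):
    """Yield (train_idx, val_idx) pairs; every point is validated exactly once."""
    if not 2 <= n_folds <= n_points:
        raise ValueError("need 2 <= n_folds <= n_points")
    indices = list(range(n_points))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        val_idx = folds[k]
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train_idx, val_idx


def cross_validate(inputs, outputs, fit, predict, n_folds=5, seed=0):
    """Return the mean squared validation error of each fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(inputs), n_folds, seed):
        model = fit([inputs[i] for i in train_idx], [outputs[i] for i in train_idx])
        errors = [(predict(model, inputs[i]) - outputs[i]) ** 2 for i in val_idx]
        scores.append(sum(errors) / len(errors))
    return scores


# Hypothetical stand-in for an emulator: predict the training mean everywhere.
def fit_mean(xs, ys):
    return sum(ys) / len(ys)


def predict_mean(model, x):
    return model


inputs = [float(i) for i in range(20)]
outputs = [2.0 * x for x in inputs]
fold_scores = cross_validate(inputs, outputs, fit_mean, predict_mean, n_folds=5)
# A large spread between folds suggests the fit depends strongly on the split.
spread = max(fold_scores) - min(fold_scores)
```

Repeating this with several seeds would give a cheap check on whether one particular train/validation split is responsible for a poor fit.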

odunbar commented 7 months ago

Adding this here: another nice thing would be to add correlation/covariance transformations to the data processing; see https://stats.stackexchange.com/questions/53/pca-on-correlation-or-covariance
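To make the correlation-versus-covariance distinction concrete, here is a small standard-library Python sketch (again not the package's API; the function names and the data are made up for illustration). Standardizing each output dimension before computing the covariance yields the correlation matrix, so a decomposition such as PCA applied afterwards would weight all dimensions equally instead of being dominated by the one with the largest scale.

```python
import math


def column_means(rows):
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows):
    """Sample covariance (n - 1 denominator); rows are samples, columns are dimensions."""
    n = len(rows)
    means = column_means(rows)
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    d = len(means)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def standardize(rows):
    """Center each column and scale it to unit sample standard deviation.

    Assumes no column is constant (a zero standard deviation would divide by zero).
    """
    means = column_means(rows)
    cov = covariance_matrix(rows)
    stds = [math.sqrt(cov[i][i]) for i in range(len(means))]
    return [[(x - m) / s for x, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical training outputs: two dimensions on very different scales.
data = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0], [4.0, 5000.0]]
cov = covariance_matrix(data)                 # dominated by the second column
corr = covariance_matrix(standardize(data))   # unit diagonal: the correlation matrix
```

Which of the two is preferable depends on whether the differences in scale between output dimensions are meaningful, which is the question the linked discussion addresses.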