Description
CNPs are meant to also infer uncertainty in their data. This uncertainty may have two roots:
- Lack of data
- Noisy data
When it comes to the latter (noisy data), the emulator appears to be working well: test results usually fall within 3 standard deviations. The problem is the former. The model's ability to extrapolate uncertainty appears to be very poor; it almost just follows the trend observed over the training data.
Reproduction steps
Reproducing the issue is simple (a rough sketch follows the lists below):
- generate data (1D 3rd order polynomial over latin hypercube points) + add some noise
- fit a CNP over the data (`KFold` can be used to obtain different folds, starting from different random states, and also to check model generalisability)
- infer mean + std both within the data interval and outside it
Observed uncertainty in the extrapolation is not generalisable:
- it is absent for one or both of the tails
- it differs between folds
- using more data within the same interval does not help
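A rough, self-contained sketch of these steps is below. It assumes scipy's `qmc.LatinHypercube` for the design points and scikit-learn's `KFold`; the `GaussianProcessRegressor` is only a stand-in probabilistic emulator, since the CNP class and its exact `fit`/`predict` signature are not shown here. Substitute the actual CNP emulator from this package.

```python
import numpy as np
from scipy.stats import qmc
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.model_selection import KFold

rng = np.random.default_rng(0)

# 1D 3rd-order polynomial sampled over Latin hypercube points, plus Gaussian noise
n_points = 200
sampler = qmc.LatinHypercube(d=1, seed=0)
X = qmc.scale(sampler.random(n_points), l_bounds=[-1.0], u_bounds=[1.0])
poly = lambda x: 2.0 * x**3 - 1.0 * x**2 + 0.5 * x + 0.3  # arbitrary cubic
y = poly(X[:, 0]) + rng.normal(scale=0.05, size=n_points)

# Query points inside the training interval and outside it (extrapolation)
X_inside = np.linspace(-1.0, 1.0, 100).reshape(-1, 1)
X_outside = np.linspace(-2.0, 2.0, 200).reshape(-1, 1)

for fold, (train_idx, test_idx) in enumerate(
    KFold(n_splits=5, shuffle=True, random_state=0).split(X)
):
    # Stand-in probabilistic emulator; replace with the package's CNP class and its API
    model = GaussianProcessRegressor()
    model.fit(X[train_idx], y[train_idx])

    # Infer mean + std both within the data interval and outside it
    mean_in, std_in = model.predict(X_inside, return_std=True)
    mean_out, std_out = model.predict(X_outside, return_std=True)

    # Noise handling: held-out targets should usually fall within 3 stds of the predicted mean
    mean_test, std_test = model.predict(X[test_idx], return_std=True)
    frac_within = np.mean(np.abs(y[test_idx] - mean_test) <= 3 * std_test)
    print(f"fold {fold}: {frac_within:.0%} of held-out points within 3 stds; "
          f"mean std inside = {std_in.mean():.3f}, mean std outside = {std_out.mean():.3f}")
```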
Version
0.1.0.post1
Screenshots
OS
Linux