We discussed this in the meeting but forgot to add it as a task.
Show that when training an RGNP on draws from a GP with varying hyperparameters, it effectively learns a posterior distribution.
See Figure 4 and Section 5.1 of the neural diffusion process (NDP) paper for how to do it; we can replicate exactly the same setup or some variation thereof (note that the OpenReview version is more recent than the arXiv version).
Ideally we show that an RGNP does well here, while a standard GNP struggles (we should check whether an AttnGNP does okay).
I mentioned RGNP / GNP / AttnGNP because the standard (R)CNP does not generate correlated predictions, which are needed here to figure out e.g. the length scale of the learnt process. (Although we could use the autoregressive procedure of the AR-CNP paper.)
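For reference, the AR procedure only needs marginal predictions: targets are sampled one at a time and each sampled point is fed back in as context. A minimal sketch, assuming a hypothetical `model.predict(xc, yc, xt)` interface that returns per-target marginal means and variances (not any particular library's API):

```python
import numpy as np

def ar_sample(model, xc, yc, xt, rng=None):
    """Draw one correlated function sample at target inputs `xt` from a model
    that only outputs marginal Gaussians, by sampling autoregressively.

    Assumes a hypothetical interface model.predict(xc, yc, xt) -> (mean, var),
    where mean and var have shape (len(xt),).
    """
    rng = rng or np.random.default_rng()
    xc, yc = list(xc), list(yc)
    ys = []
    for x in xt:  # a random ordering of the targets also works
        mean, var = model.predict(np.array(xc), np.array(yc), np.array([x]))
        y = rng.normal(mean[0], np.sqrt(var[0]))
        xc.append(x)  # feed the sampled point back in as context
        yc.append(y)
        ys.append(y)
    return np.array(ys)
```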
It shouldn't be too hard to code up once we have an implementation of RGNP.
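On the data side, a minimal sketch of the kind of generator I have in mind (not the exact NDP setup): per task, sample a lengthscale from a prior, draw a function from the corresponding GP, and split the points into context and target sets. The ranges below are placeholders:

```python
import numpy as np

def rbf_kernel(x1, x2, lengthscale, variance=1.0):
    """Squared-exponential kernel matrix."""
    d = x1[:, None] - x2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def sample_task(rng, n_points=64, n_context=16, x_range=(-2.0, 2.0),
                lengthscale_range=(0.1, 1.0)):
    """Draw one GP sample with a randomly chosen lengthscale and split it
    into context and target sets. All ranges are illustrative placeholders."""
    lengthscale = rng.uniform(*lengthscale_range)
    x = rng.uniform(*x_range, size=n_points)
    K = rbf_kernel(x, x, lengthscale) + 1e-6 * np.eye(n_points)
    y = rng.multivariate_normal(np.zeros(n_points), K)
    idx = rng.permutation(n_points)
    ctx, tgt = idx[:n_context], idx[n_context:]
    return (x[ctx], y[ctx]), (x[tgt], y[tgt]), lengthscale

rng = np.random.default_rng(0)
(xc, yc), (xt, yt), true_lengthscale = sample_task(rng)
```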
[x] Implement an experimental setup similar to NDP due:05-03
[x] Show that training RGNP with varying hyperparameters effectively learns a posterior due:05-07
[x] Compare to GNP and AttnGNP due:05-07
[ ] Finetune the setup and make pretty figures depending on the evaluation outcome due:05-10
Status
The approach works in general, but has not yet reproduced the very narrow histogram that Dutordoir et al. report.
Whether there is an improvement still depends on seeds and hyperparameters, but usually AttnGNP and RGNP improve upon GNP.
Opinion: in its current setup this is more of a toy example. It could become more interesting if we switch to a quantitative comparison of how easily each model marginalizes hyperparameters (final performance, computational cost, ...).
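On the evaluation side, one way to make the histogram (and, later, a more quantitative comparison) concrete: draw correlated samples from the trained model at dense target inputs, fit a GP to each sample by maximum marginal likelihood, and histogram the recovered lengthscales against the one used to generate the context. A rough sketch using scikit-learn for the per-sample fit; the `samples` array (n_samples x n_targets) is assumed to come from the model:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def recovered_lengthscales(xt, samples):
    """Fit an RBF-kernel GP to each correlated sample drawn from the model
    (samples: n_samples x n_targets) and return the ML lengthscales."""
    lengthscales = []
    for y in samples:
        gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), alpha=1e-4)
        gpr.fit(xt.reshape(-1, 1), y)
        lengthscales.append(gpr.kernel_.length_scale)
    return np.array(lengthscales)

# Histogram these against the lengthscale used to generate the context set,
# e.g. plt.hist(recovered_lengthscales(xt, samples), bins=30).
```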