🐛 Bug
Derivative GPs are not training on derivative information. The derivative GPs, like those in the tutorial linked below, are not actually training on the derivative information; they only evaluate the derivatives in `model.eval()`.

To reproduce
The bug exists in the tutorial on 2d derivatives: https://docs.gpytorch.ai/en/stable/examples/08_Advanced_Usage/Simple_GP_Regression_Derivative_Information_2d.html
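For context, the model in that tutorial looks roughly like the following (this is a paraphrase from the linked notebook, not a verbatim copy): an exact GP over the function value and both partial derivatives.

```python
import gpytorch

# Rough paraphrase of the tutorial's derivative GP; see the linked notebook for the exact code.
class GPModelWithDerivatives(gpytorch.models.ExactGP):
    def __init__(self, train_x, train_y, likelihood):
        super().__init__(train_x, train_y, likelihood)
        self.mean_module = gpytorch.means.ConstantMeanGrad()
        self.base_kernel = gpytorch.kernels.RBFKernelGrad(ard_num_dims=2)
        self.covar_module = gpytorch.kernels.ScaleKernel(self.base_kernel)

    def forward(self, x):
        mean_x = self.mean_module(x)    # shape (n, 3): value, df/dx, df/dy
        covar_x = self.covar_module(x)  # joint (3n, 3n) covariance over values and derivatives
        return gpytorch.distributions.MultitaskMultivariateNormal(mean_x, covar_x)

# train_y has shape (n, 3): observed values plus both partial derivatives.
likelihood = gpytorch.likelihoods.MultitaskGaussianLikelihood(num_tasks=3)
```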
The code below is exactly the code in the tutorial, except that I have added some extra code (marked with `## mll debug ##` comments) to print intermediate values and show the bug. This code does exactly the same thing as `-mll(output, train_y)`, but makes the computation explicit so the intermediate values can be inspected.

Printing the mean that comes out of the output with `print(mean[:12].reshape(4, 3))` shows that both derivative components (which appear as 2 of every 3 terms, hence the reshape to make it readable) are zero. There is no reason this should be the case unless the data is constant, which it isn't.
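The full script is not reproduced here, but the injected block looks roughly like the sketch below (variable names such as `marginal` and `manual_loss` are illustrative, and `mll` is assumed to be the tutorial's `gpytorch.mlls.ExactMarginalLogLikelihood(likelihood, model)`); the math is just the Gaussian marginal log likelihood that the built-in MLL computes, including the division by the number of data points.

```python
import math
import torch

## mll debug ##
# Illustrative sketch of the injected block: compute -mll(output, train_y) by hand
# so the intermediate mean and residual can be printed.
output = model(train_x)                 # train-mode output of the derivative GP
marginal = likelihood(output)           # add observation noise, as the MLL does internally

mean = marginal.mean.reshape(-1)        # flatten (n, 3) -> (3n,): value, df/dx, df/dy, ...
print(mean[:12].reshape(4, 3))          # the two derivative columns print as all zeros

diff = train_y.reshape(-1) - mean       # residual used inside the Gaussian log prob
covar = marginal.covariance_matrix      # dense (3n, 3n) marginal covariance
solve = torch.linalg.solve(covar, diff.unsqueeze(-1)).squeeze(-1)
log_prob = -0.5 * (diff @ solve + torch.logdet(covar) + diff.numel() * math.log(2 * math.pi))

manual_loss = -log_prob / train_y.numel()   # ExactMarginalLogLikelihood divides by num data
print(manual_loss.item(), (-mll(output, train_y)).item())  # these two agree
```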
This means that the `diff` tensor will always have incorrect values for the derivative outputs. Using this bad `diff`, the rest of my injected snippet calculates the negative marginal log likelihood. Running the script, the loss reported by `loss = -mll(output, train_y)` is the same as my calculation, which means this error is also present in the built-in code.
In my own further testing I have noticed that calling `model.eval()` and `likelihood.eval()` corrects this: the mean output is then as expected. This is how the tutorial is able to produce good plots of the derivatives.
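For example (again only a sketch of the check, not verbatim from my script), repeating the same print in eval mode gives non-zero derivative means:

```python
# Sketch of the eval-mode check: the same slice of the mean now contains
# non-zero derivative predictions.
model.eval()
likelihood.eval()
with torch.no_grad():
    eval_mean = likelihood(model(train_x)).mean.reshape(-1)
    print(eval_mean[:12].reshape(4, 3))  # derivative columns are no longer all zero
```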
System information