Closed lucheroni closed 3 years ago
I don't think this is a bug, I think this was a bug in 1.4.1 that was fixed in 1.5.0 by #1647.
So you think that all the exact-multitask results I got using 1.4.1 are wrong?
Carlo
They're not wrong - it's just that the loss was improperly normalized. The scale of the loss is different now, but for the same parameterization the location of the optimum is the same. The proper scaling may help resolve numerical issues / difficulties the optimizer may have.
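A toy illustration of the normalization point (a sketch, unrelated to the actual GPyTorch loss): multiplying a loss by a positive constant rescales its values but leaves the minimizer where it was.

```python
import numpy as np

# Toy quadratic loss with its minimum at x = 1.2.
xs = np.linspace(-3.0, 3.0, 601)
loss = (xs - 1.2) ** 2 + 0.5

# A differently "normalized" version of the same loss.
scaled = 100.0 * loss

# Both losses are minimized at the same grid point, even though
# their values differ by two orders of magnitude.
print(xs[np.argmin(loss)], xs[np.argmin(scaled)])
```

So a fix that only changes the normalization shifts the reported loss values, not the hyperparameters the optimizer converges to.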
Ok, thanks. I did notice that restarts now pop up with a different pattern.
A question - I'm not sure whether this is worth going through GitHub for.
Some time ago I wanted to check the quality of the multitask variational approximation against the exact approach. I wrote the code, then discovered that the variational LCMKernel doesn't behave the same as the exact one. I suspect there are more parameters in the exact case (twice as many) than in the variational case, because of how the coregionalization matrices are defined: the exact case includes the diag matrix, while the variational one doesn't. Am I wrong?
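For concreteness, here is a sketch of the parameter count I have in mind. The B = F Fᵀ + diag(v) form of the task covariance is my assumption (the standard ICM-style parameterization), not something verified against the GPyTorch source.

```python
import numpy as np

num_tasks, rank = 2, 1

# Assumed exact-case task covariance: B = F F^T + diag(v).
F = np.random.randn(num_tasks, rank)  # coregionalization factor
v = np.random.rand(num_tasks)         # per-task "diag" terms
B_exact = F @ F.T + np.diag(v)

# Assumed variational-case task covariance: factor only, no diag.
B_var = F @ F.T

# For rank 1, the exact case carries twice as many task parameters.
print("exact params:", F.size + v.size)   # 4
print("variational params:", F.size)      # 2
```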
I tried to kill the diag using the model.initialize() feature in 1.5.0, but it didn't work - I had instability problems.
My question is: do you plan to give users direct access to the shape of the coregionalization matrices (letting them include the diag or not) at the MultitaskGPModel class level?
Such a feature, plus model.initialize(), would also allow checking the independent-tasks limit directly in the exact case. One could zero the diag part and set the Kronecker part to unity, thus getting an independent multitask model directly inside the MultitaskGPModel class - and turn it on and off.
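The independent-tasks limit can be sketched the same way, again under the assumed B = F Fᵀ + diag(v) form: with the factor part zeroed and the diagonal set to one, B becomes the identity, and the Kronecker product B ⊗ k carries no cross-task covariance.

```python
import numpy as np

num_tasks, rank = 2, 1

# Zero the factor and set the diagonal to one: B = I.
F = np.zeros((num_tasks, rank))
v = np.ones(num_tasks)
B = F @ F.T + np.diag(v)

# Toy data kernel k(x, x') on two inputs.
k = np.array([[1.0, 0.7],
              [0.7, 1.0]])

# The multitask kernel B kron k is block-diagonal: independent tasks.
K = np.kron(B, k)
print(np.allclose(K, np.kron(np.eye(num_tasks), k)))  # True
```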
Regards,
Carlo
I have checked my previous exact-multitask results obtained with 1.4.1 against what I'm getting with 1.5.0.
I fear that https://github.com/cornellius-gp/gpytorch/pull/1647 doesn't explain the difference; there is more to it. The results are too different - the GP minima don't match at all.
I thought that a possible way to pin the problem down would be to use the exact multitask GP setup with one task only and compare it with the exact (univariate) GP setup. I tried, and the two results don't match.
Since this could be an issue specific to version 1.5.0, I'm going to open another GitHub issue. I hope the discussion will help other users - and take this off my hands.
> I have checked my previous exact-multitask results obtained with 1.4.1 against what I'm getting with 1.5.0. I fear that #1647 doesn't explain the difference; there is more to it. The results are too different - the GP minima don't match at all.
Please post a reproducible code example.
> I wrote the code, then discovered that the variational LCMKernel doesn't behave the same as the exact one.
This is true. There's probably a larger discussion to have about reconciling variational and exact multitask models.
π Bug
After upgrading from GPyTorch version 1.4.1 to version 1.5.0, I noticed that I get two very different results when I run the same program under the two versions. The program uses the exact multitask setup, and the loss values can differ by up to a factor of about 100. I believe the problem is specific to the exact multitask classes: some quick tests with the single-task case showed no problem there.
It is not easy to prepare a code snippet that reproduces the problem, because some parameters are initialized randomly each time the model is instantiated - but I'll try.
Note that v. 1.4.1 doesn't allow initializing parameters in the exact multitask case (an error that was eliminated in v. 1.5.0), whereas v. 1.5.0 does.
Note that before upgrading I made a backup copy of my conda environment. Hence I have both my original env with v. 1.4.1 and a clone of it whose only difference is v. 1.5.0 (I checked during the update that 'conda install -c conda-forge gpytorch' didn't change anything other than gpytorch itself).
The code below should be run as follows (I have in mind running the test under Spyder). Prepare the two environments, identical except for the GPyTorch version.
First, choose the v. 1.4.1 Python interpreter from Tools, start a console, and run the code in it. Note down the values of cf and rw - these will be inserted by hand when running under the other version - and note down the loss value. Second, switch to the v. 1.5.0 interpreter (loading a suitable kernel and starting another console), run the code, and note down the loss value.
I tried a few times, and I always got different values, though of the same order of magnitude.
The example proposed is very small: just two tasks, one latent GP, and one basic one-parameter kernel. cf and rw are the only parameters that get random initialization. In my own program there are 24 tasks and 24 latents, the kernels are complicated, and the computation time is large - which is probably why I get those huge differences. I also noticed that the larger the number of time points, the larger the error; this is why I chose an example with at least six time points.
To reproduce
Code snippet to reproduce
Stack trace/error message
Expected Behavior
No difference between the two versions - identical output.
System information
Please complete the following information:
Additional context