I just copied his notes from a file on Slack to discuss them here.
ToDo:
Remember to add logging for variational optimization when it is added to VBMC proper.
Pain points:
(1, D) type pseudovectors are quite annoying everywhere
-> NumPy is designed more with 1D arrays in mind for vectors
-> negelcbo uses (D,) internally; we just need to fix this everywhere else, and possibly the locations negelcbo touches at the start (see the sketch after this list)
pylint is not used that much, and the pylint configuration file is not the best
-> a replacement file exists, but is not committed anywhere
-> if pylint emits lots of warnings, that discourages use: what does going from 1000 to 1001 warnings mean compared to going from 0 to 1?
-> for comparison, gpyreg currently has only 10 pylint warnings
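A minimal sketch of the pseudovector pain point, not tied to any particular spot in the code: a (1, D) row matrix and a (D,) vector behave differently under indexing and broadcasting, which is why standardizing on (D,) internally (as negelcbo does) avoids surprises.
```
import numpy as np

D = 3
row = np.ones((1, D))  # (1, D) "pseudovector", e.g. from slicing a 2D array
vec = np.ones(D)       # (D,) true 1D vector, what negelcbo uses internally

print(row[0].shape)    # (3,)  -- indexing a row matrix still returns an array
print(vec[0])          # 1.0   -- indexing a 1D vector returns a scalar

# Broadcasting a (1, D) row against a (D, 1) column silently builds a (D, D) matrix.
col = np.ones((D, 1))
print((row * col).shape)  # (3, 3) -- usually not what was intended
print((vec * vec).shape)  # (3,)   -- elementwise, as expected

# Normalizing at function boundaries sidesteps the problem:
x = np.asarray(row).ravel()  # always (D,), regardless of the input shape
print(x.shape)               # (3,)
```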
Things to investigate:
I was never able to replicate the ADAM VP optimization result exactly when weight optimization was on. Is there some sort of bug
a) in ADAM?
-> seems unlikely, the code is too simple. If anything is wrong, it is in the early-stopping routine
b) in gradient computations?
-> we can get exact matches in test cases, and originally I could only find the tiniest difference.
-> maybe this is just a tiny numerical error that eventually causes divergence? (see the sketch after this list)
-> if so, then why didn't this happen without weight optimization?
c) in VP functions?
-> possible, but not my area. Also, this would rear its head without weight optimization as well
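One way to check whether a last-bit difference in the gradients can plausibly explain the mismatch is to run two identical ADAM loops whose gradients differ by one ulp and watch how far the trajectories drift apart. The sketch below uses a textbook Adam loop and a hypothetical multi-modal objective, not the project's implementation, purely to illustrate the diagnostic.
```
import numpy as np

def adam(grad, x0, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8, steps=2000):
    """Textbook Adam loop (not the project's code); returns the iterate history."""
    x = np.array(x0, dtype=float)
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    history = []
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g**2
        m_hat = m / (1 - beta1**t)
        v_hat = v / (1 - beta2**t)
        x = x - lr * m_hat / (np.sqrt(v_hat) + eps)
        history.append(x.copy())
    return np.array(history)

# Hypothetical multi-modal objective standing in for the VP objective:
# gradient of sin(3x) + 0.1 x^2, applied elementwise.
def grad(x):
    return 3.0 * np.cos(3.0 * x) + 0.2 * x

x0 = np.full(5, 0.3)
runs = adam(grad, x0)
# Same run, but with a one-ulp relative perturbation injected into the gradient.
runs_eps = adam(lambda x: grad(x) * (1 + np.finfo(float).eps), x0)

# Track how the discrepancy between the two trajectories evolves over time.
for t in range(0, len(runs), 500):
    print(t, np.max(np.abs(runs[t] - runs_eps[t])))
```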
Ideas:
it could be possible to define class methods in a different file, so we could simplify variational_optimization.py and gaussian_process_train
-> no need to pass around optim_state etc. (see the sketch below)
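A minimal sketch of one way this could look, assuming a hypothetical VBMC class and hypothetical module/method names: the methods live in a separate module as a mixin, so shared state such as optim_state sits on self instead of being threaded through every call.
```
# variational_optimization.py (hypothetical layout)
class VariationalOptimizationMixin:
    """Methods that read and write shared state via self
    instead of taking optim_state as an argument."""

    def optimize_vp(self, n_slow_opts=1):
        # self.optim_state and self.vp are attributes of the main class,
        # so nothing needs to be passed around explicitly.
        self.optim_state["iter"] += 1
        return self.vp


# vbmc.py (hypothetical main class)
class VBMC(VariationalOptimizationMixin):
    def __init__(self):
        self.optim_state = {"iter": 0}
        self.vp = None
```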
Notes:
If trying to get the same floating-point behaviour as MATLAB, remember that all generated random numbers must be exactly the same. This means fixing the random number seeds to be the same, but also replacing algorithms that differ between MATLAB and Python. To be specific, we have to replace the generation of normally distributed random numbers, the generation of permutations, and possibly more. The Python side for these is already in the library. Relatedly, functions such as MATLAB's writematrix and NumPy's np.loadtxt are your best friends, since they make it possible to compare the internal computations and find the spots where things go wrong (see the sketch below). The tests and their data for VP optimization were generated in this way.
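As a concrete illustration of the writematrix / np.loadtxt workflow, with hypothetical file and variable names: dump an intermediate matrix on the MATLAB side, then load it on the Python side and compare it against the value computed at the same point in the algorithm.
```
import numpy as np

def compare_to_matlab(python_value, csv_path, atol=1e-12):
    """Load a matrix dumped from MATLAB with writematrix and compare it
    against the corresponding Python-side value at the same point."""
    matlab_value = np.loadtxt(csv_path, delimiter=",")
    np.testing.assert_allclose(python_value, matlab_value, rtol=0, atol=atol)

# On the MATLAB side (run there, not here), dump any intermediate quantity:
#   writematrix(H_post, "H_post.csv");   % "H_post" and the file name are hypothetical
#
# Then, at the matching point in the Python code:
#   compare_to_matlab(H_post, "H_post.csv")
# Start with a loose tolerance and tighten it while hunting for the first
# spot where the two implementations disagree.
```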
Beware of possible reference issues! Several bugs so far, both here and in gpyreg, were caused by not making actual copies but only copying references.
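A minimal reproduction of the kind of aliasing bug meant here, not tied to any particular place in the code: assigning an array or dict only copies the reference, so a later in-place update silently mutates the "saved" value.
```
import copy
import numpy as np

optim_state = {"mu": np.zeros(3)}

# Bug: this only copies the reference, not the data.
saved_state = optim_state
optim_state["mu"] += 1.0
print(saved_state["mu"])   # [1. 1. 1.] -- the "saved" state changed too

# Fix: make an actual copy (deep, because the dict holds mutable arrays).
optim_state = {"mu": np.zeros(3)}
saved_state = copy.deepcopy(optim_state)
optim_state["mu"] += 1.0
print(saved_state["mu"])   # [0. 0. 0.] -- unaffected, as intended
```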