Open albop opened 10 years ago
Not sure I understand when you would use a column of zeros in a covariance matrix, doesn't that mean that the variance of that random variable would be zero?
I think everything else (especially the names) is a good idea for discussion.
Imagine you have a model that depends on two i.i.d. Gaussian shocks. You can write the covariance matrix as:

`sigma = array([[sigma1, 0], [0, sigma2]])`

Assume that you have written a routine to solve your model involving a discretization of the (trivial) multivariate distribution of shocks. Now, if you want to see what happens to your model if the second shock is absent, you can just set `sigma2 = 0` and run the same routine. Currently you will probably get an error because of the Cholesky factorization step.
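A minimal sketch of the failure mode with plain NumPy (the variable names are illustrative, and the actual routine in the library may fail differently):

```python
import numpy as np

sigma1, sigma2 = 0.1, 0.0  # second shock switched off
sigma = np.array([[sigma1, 0.0],
                  [0.0, sigma2]])

# With a zero on the diagonal the matrix is only positive
# *semi*-definite, so the Cholesky factorization raises an error.
try:
    np.linalg.cholesky(sigma)
except np.linalg.LinAlgError as e:
    print("Cholesky failed:", e)
```

A discretization routine that unconditionally calls `cholesky` on the covariance matrix inherits this failure, even though the degenerate case is perfectly meaningful for the model.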
The current `gridmake` function in `ce_utils` was a very quick and dirty way to get just the functionality we needed for these routines to match the Matlab. I am happy to use a different function; I have actually had to use that `cartesian` function before.
If we do move away from `gridmake` and use `cartesian`, I think that all the tests that compare the Python and Matlab versions will break, as they currently assume a column-major ordering. Pablo, do you have any suggestions on what we can do to get around this issue?
In the end I would love to have tests that check for the "correct answer" rather than simply verifying that we match some other code. Does anyone have any references we can turn to for computing the correct answers for any of these routines? If we can separate our tests from the Matlab versions, then I would love to move on and make this feel more Pythonic.
I will look into the dimensionality issues and probably write some tests for them.
I hadn't checked simple (yet important) things like that, so this is great feedback.
I'm in favor of changing the names to something more informative. Let's not tie ourselves down to exactly replicating CompEcon. As long as we implement broadly similar functionality and have good documentation I think it's fine to deviate when the deviation involves an obvious improvement.
Regarding the suggestion to
"rename quad.py into ce_quad.py so that the latter can be left untouched and remain as close as possible from the original version (including Fortran order). A quadrature.py could then contain Python compliant versions, possibly with more explicit names."
My preference would be to avoid duplication, just have one version of these routines and go for Python compliance and more explicit names. Let's break compatibility if it leads to improvement. As long as we have good documentation I think that's fine.
About the reordering and subsequent testing issues: it's clear that current tests for multidimensional integration will break and will need to be rewritten. There are simple ways to get around that. For instance, if `T` is a Matlab array whose rows enumerate a cross product in $R^{d_1} \times R^{d_2}$ in Matlab order, then you can define `ind = array(range(d1*d2)).reshape(d1, d2, order='F').flatten(order='C')`. The reordered matrix corresponding to C order would then be `T[ind, :]`.
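A quick check of that permutation trick on a small example (a sketch; `d1`, `d2` and the arrays are illustrative):

```python
import numpy as np

d1, d2 = 2, 3
a, b = np.array([0, 1]), np.array([-1, 2, 5])

# Matlab/Fortran-order enumeration: the FIRST index varies fastest.
T_fortran = np.array([(a[i], b[j]) for j in range(d2) for i in range(d1)])

# Permutation mapping Fortran-order rows to C order (last index fastest).
ind = np.arange(d1 * d2).reshape(d1, d2, order='F').flatten(order='C')
T_c = T_fortran[ind, :]

# C-order enumeration computed directly, for comparison.
expected = np.array([(a[i], b[j]) for i in range(d1) for j in range(d2)])
print(np.array_equal(T_c, expected))  # → True
```

This makes it straightforward to keep the Matlab comparison tests: apply the permutation to the Matlab reference output before comparing against the C-ordered Python result.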
I've been looking at this issue again since the `cartesian` routine has been renamed. Replacing Fortran order by C order in `quad.py` seems very easy (essentially replacing `ckron(*weights[::-1])` by `ckron(*weights)` and `gridmake` by `cartesian`).
Another idea would be to create a keyword argument `order='C'` so that Fortran order would still be an option. I have the feeling that it would require some restructuring of the `quad` code, and I don't know if it would be worth the trouble. @cc7768, @spencerlyon2: since you wrote the first version, what are your thoughts about that?
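A hypothetical sketch of what such a keyword could look like on the product-grid side (the function name and default are illustrative, not the actual quantecon API):

```python
import numpy as np

def cartesian(arrays, order='C'):
    """Enumerate the cross product of 1-d arrays as rows of a 2-d array.

    order='C': last index varies fastest (Python convention).
    order='F': first index varies fastest (Matlab/CompEcon convention).
    """
    arrays = [np.asarray(a) for a in arrays]
    if order == 'F':
        arrays = arrays[::-1]
    grids = np.meshgrid(*arrays, indexing='ij')
    out = np.column_stack([g.ravel() for g in grids])
    return out[:, ::-1] if order == 'F' else out

print(cartesian([[0, 1], [-1, 2]]).tolist())             # [[0, -1], [0, 2], [1, -1], [1, 2]]
print(cartesian([[0, 1], [-1, 2]], order='F').tolist())  # [[0, -1], [1, -1], [0, 2], [1, 2]]
```

The `order='F'` branch just reverses the arrays before the product and the columns after it, so both conventions share one code path.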
Since the quad branch is closed, I'm opening a new issue with the comments I made about it, so that we can start a fresh discussion. There are currently a few issues with the quadrature routines. The first two can have side effects on future code and should be fixed quickly.
1. The `gridmake` function does not enumerate points in a way that is consistent with default Python conventions. If you compute `gridmake(array([0, 1]), array([-1, 2]))` it produces `[[0, -1], [1, -1], [0, 2], [1, 2]]`, meaning that the first index varies faster. Now, if you want to represent values on this grid by a 2-d array `vals` such that `vals[i, j]` contains the value at point `(i, j)`, then when you do `vals.ravel()` you don't enumerate points in the same order, because the last index is supposed to vary faster. This one is quite annoying.

2. With `quad.qnwnorm([2, 2])` I get 2-dimensional arrays, and with `qnwnorm([2])` I get 1-dimensional vectors. This is problematic, since in generic code it will force one to always distinguish dimension 1 from higher dimensions. My opinion here is that multidimensional routines should always return multidimensional objects in a predictable fashion. (Maybe the 1-d routines could be exported too.)

3. […] `qnwnorm` and co, don't we?

A possible way to deal with these issues would be to rename `quad.py` into `ce_quad.py` so that the latter can be left untouched and remain as close as possible to the original version (including Fortran order). A `quadrature.py` could then contain Python-compliant versions, possibly with more explicit names.

As for the `gridmake` replacement, the function `cartesian` does the required thing. It is also faster (cf. http://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays). Actually all these issues concern only the multidimensional functions.
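To illustrate the consistency gained from C ordering (a sketch using `itertools.product`, which enumerates in the same last-index-fastest order; the grids and values are made up):

```python
import numpy as np
from itertools import product

x, y = np.array([0, 1]), np.array([-1, 2])

# C-order grid: last index varies fastest, matching Python conventions.
grid = np.array(list(product(x, y)))

# Values stored on the grid as a 2-d array...
vals = np.array([[10, 11],
                 [20, 21]])  # vals[i, j] is the value at (x[i], y[j])

# ...so vals.ravel() enumerates them in the same order as the grid rows.
print(grid.tolist())          # [[0, -1], [0, 2], [1, -1], [1, 2]]
print(vals.ravel().tolist())  # [10, 11, 20, 21]
```

With the Fortran-order `gridmake`, row `k` of the grid and element `k` of `vals.ravel()` would refer to different points, which is exactly the mismatch described in item 1.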