Open albop opened 10 years ago
Not sure I understand when you would use a column of zeros in a covariance matrix, doesn't that mean that the variance of that random variable would be zero?
I think everything else (especially the names) is a good idea for discussion.
Imagine you have a model that depends on two i.i.d. Gaussian shocks. You can write the covariance matrix as:

`sigma = array([[sigma1, 0], [0, sigma2]])`

Assume that you have written a routine to solve your model involving a discretization of the (trivial) multivariate distribution of shocks. Now, if you want to see what happens to your model if the second shock is absent, you can just set `sigma2 = 0` and run the same routine. Currently you will probably get an error because of the Cholesky factorization step.
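A minimal sketch of the failure mode with plain NumPy (the variable names are illustrative, and the actual routine in the library may fail differently):

```python
import numpy as np

sigma1, sigma2 = 0.1, 0.0  # second shock switched off
sigma = np.array([[sigma1, 0.0],
                  [0.0, sigma2]])

# With a zero on the diagonal the matrix is only positive
# *semi*-definite, so the Cholesky factorization raises an error.
try:
    np.linalg.cholesky(sigma)
except np.linalg.LinAlgError as e:
    print("Cholesky failed:", e)
```

A discretization routine that unconditionally calls `cholesky` on the covariance matrix inherits this failure, even though the degenerate case is perfectly meaningful for the model.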
The current `gridmake` function in `ce_utils` was a very quick and dirty way to get just the functionality we needed for these routines to match the Matlab. I am happy to use a different function; I have actually had to use that `cartesian` function before.
If we do move away from `gridmake` and use `cartesian`, I think that all the tests that compare the Python and Matlab versions will break, as they currently assume a column-major ordering. Pablo, do you have any suggestions on what we can do to get around this issue?
In the end I would love to have tests that check for the "correct answer" rather than simply verifying that we match some other code. Does anyone have any references we can turn to for computing the correct answers for any of these routines? If we can separate our tests from the Matlab versions, then I would love to move on and make this feel more Pythonic.
I will look into the dimensionality issues and probably write some tests for them.
I hadn't checked simple (yet important) things like that, so this is great feedback.
I'm in favor of changing the names to something more informative. Let's not tie ourselves down to exactly replicating CompEcon. As long as we implement broadly similar functionality and have good documentation I think it's fine to deviate when the deviation involves an obvious improvement.
Regarding the suggestion to
"rename quad.py into ce_quad.py so that the latter can be left untouched and remain as close as possible from the original version (including Fortran order). A quadrature.py could then contain Python compliant versions, possibly with more explicit names."
My preference would be to avoid duplication, just have one version of these routines and go for Python compliance and more explicit names. Let's break compatibility if it leads to improvement. As long as we have good documentation I think that's fine.
About the reordering and subsequent testing issues: it's clear that current tests for multidimensional integration will break and will need to be rewritten. There are simple ways to get around that. For instance, if `T` is a Matlab array whose rows enumerate a cross product in $R^{d_1} \times R^{d_2}$ in Matlab order, then you can define `ind = array(range(d1*d2)).reshape(d1, d2, order='F').flatten(order='C')`. The reordered matrix corresponding to C order would then be `T[ind, :]`.
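A quick check of that permutation trick on a small example (a sketch; `d1`, `d2` and the arrays are illustrative):

```python
import numpy as np

d1, d2 = 2, 3
a, b = np.array([0, 1]), np.array([-1, 2, 5])

# Matlab/Fortran-order enumeration: the FIRST index varies fastest.
T_fortran = np.array([(a[i], b[j]) for j in range(d2) for i in range(d1)])

# Permutation mapping Fortran-order rows to C order (last index fastest).
ind = np.arange(d1 * d2).reshape(d1, d2, order='F').flatten(order='C')
T_c = T_fortran[ind, :]

# C-order enumeration computed directly, for comparison.
expected = np.array([(a[i], b[j]) for i in range(d1) for j in range(d2)])
print(np.array_equal(T_c, expected))  # → True
```

This makes it straightforward to keep the Matlab comparison tests: apply the permutation to the Matlab reference output before comparing against the C-ordered Python result.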
I've been looking at this issue again since the `cartesian` routine has been renamed. Replacing Fortran order by C order in `quad.py` seems very easy (essentially replacing `ckron(*weights[::-1])` by `ckron(*weights)` and `gridmake` by `cartesian`).
Another idea would be to create a keyword argument `order='C'` so that Fortran order would still be an option. I have the feeling that it would require some restructuring of the `quad` code, and I don't know if it would be worth the trouble. @cc7768, @spencerlyon2: since you wrote the first version, what are your thoughts about that?
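A hypothetical sketch of what such a keyword could look like on the product-grid side (the function name and default are illustrative, not the actual quantecon API):

```python
import numpy as np

def cartesian(arrays, order='C'):
    """Enumerate the cross product of 1-d arrays as rows of a 2-d array.

    order='C': last index varies fastest (Python convention).
    order='F': first index varies fastest (Matlab/CompEcon convention).
    """
    arrays = [np.asarray(a) for a in arrays]
    if order == 'F':
        arrays = arrays[::-1]
    grids = np.meshgrid(*arrays, indexing='ij')
    out = np.column_stack([g.ravel() for g in grids])
    return out[:, ::-1] if order == 'F' else out

print(cartesian([[0, 1], [-1, 2]]).tolist())             # [[0, -1], [0, 2], [1, -1], [1, 2]]
print(cartesian([[0, 1], [-1, 2]], order='F').tolist())  # [[0, -1], [1, -1], [0, 2], [1, 2]]
```

The `order='F'` branch just reverses the arrays before the product and the columns after it, so both conventions share one code path.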
Since the quad branch is closed, I'm opening a new issue with the comments I made about it, so that we can start a fresh discussion. There are currently a few issues with the quadrature routines. The first two can have side effects on future code and should be fixed quickly.
1. The `gridmake` function does not enumerate points in a way that is consistent with default Python conventions. If you compute `gridmake(array([0, 1]), array([-1, 2]))` it produces `[[0, -1], [1, -1], [0, 2], [1, 2]]`, meaning that the first index varies faster. Now, if you want to represent values on this grid by a 2-d array `vals` such that `vals[i, j]` contains the value at point `(i, j)`, then when you do `vals.ravel()` you don't enumerate points in the same order, because the last index is supposed to vary faster. This one is quite annoying.

2. With `quad.qnwnorm([2, 2])` I get 2-dimensional arrays, and with `qnwnorm([2])` I get 1-dimensional vectors. This is problematic, since in generic code it will force one to always distinguish dimension 1 from higher dimensions. My opinion here is that multidimensional routines should always return multidimensional objects in a predictable fashion. (Maybe the 1-d routines could be exported too.)

3. […] `qnwnorm` and co, don't we?

A possible way to deal with these issues would be to rename `quad.py` into `ce_quad.py` so that the latter can be left untouched and remain as close as possible to the original version (including Fortran order). A `quadrature.py` could then contain Python-compliant versions, possibly with more explicit names.

As for the `gridmake` replacement, the function `cartesian` does the required thing. It is also faster (cf. http://stackoverflow.com/questions/1208118/using-numpy-to-build-an-array-of-all-combinations-of-two-arrays). Actually all these issues concern only the multidimensional functions.
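To illustrate the consistency gained from C ordering (a sketch using `itertools.product`, which enumerates in the same last-index-fastest order; the grids and values are made up):

```python
import numpy as np
from itertools import product

x, y = np.array([0, 1]), np.array([-1, 2])

# C-order grid: last index varies fastest, matching Python conventions.
grid = np.array(list(product(x, y)))

# Values stored on the grid as a 2-d array...
vals = np.array([[10, 11],
                 [20, 21]])  # vals[i, j] is the value at (x[i], y[j])

# ...so vals.ravel() enumerates them in the same order as the grid rows.
print(grid.tolist())          # [[0, -1], [0, 2], [1, -1], [1, 2]]
print(vals.ravel().tolist())  # [10, 11, 20, 21]
```

With the Fortran-order `gridmake`, row `k` of the grid and element `k` of `vals.ravel()` would refer to different points, which is exactly the mismatch described in item 1.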