thouis / numpy-trac-migration

numpy Trac to github issues migration
2 stars 3 forks source link

Enhance meshgrid to generate 3D grids + add option for sparse grids (migrated from Trac #966) #3768

Open thouis opened 11 years ago

thouis commented 11 years ago

Original ticket http://projects.scipy.org/numpy/ticket/966 Reported 2008-12-09 by atmention:pbrod, assigned to atmention:charris.

I would suggest to replace numpy's meshgrid with the meshgrid found in SciTools at:

http://code.google.com/p/scitools/

for the following reasons:

The meshgrid function in NumPy is limited to two dimensions only, while the SciTools version can also work with 3D grids. In addition, the NumPy version of meshgrid has no option for generating sparse grids to conserve memory, like it is in SciTools by specifying the sparse argument:

xv, yv = meshgrid(linspace(-2,2,5), linspace(-1,1,3), sparse=True) xv array([[-2., -1., 0., 1., 2.]]) yv array([[-1.], [ 0.], [ 1.]])

Actually, this is the default behavior for the meshgrid function in SciTools. In NumPy, however, we will in this case get a full 2D grid:

xv, yv = numpy.meshgrid(linspace(-2,2,5), linspace(-1,1,3)) xv array([[-2., -1., 0., 1., 2.], [-2., -1., 0., 1., 2.], [-2., -1., 0., 1., 2.]]) yv array([[-1., -1., -1., -1., -1.], [ 0., 0., 0., 0., 0.], [ 1., 1., 1., 1., 1.]])

This is the same result we get by setting sparse=False in meshgrid in SciTools.

The NumPy functions mgrid and ogrid does provide support for, respectively, full and sparse n-dimensional meshgrids, however, these functions uses slices to generate the meshgrids rather than one-dimensional coordinate arrays such as in Matlab. With slices, the user does not have the option to generate meshgrid with, e.g., irregular spacings, like

x = array([-1,-0.5,1,4,5], float) y = array([0,-2,-5], float) xv, yv = meshgrid(x, y, sparse=False) xv array([[-1. , -0.5, 1. , 4. , 5. ], [-1. , -0.5, 1. , 4. , 5. ], [-1. , -0.5, 1. , 4. , 5. ]]) yv array([[ 0., 0., 0., 0., 0.], [-2., -2., -2., -2., -2.], [-5., -5., -5., -5., -5.]])

In addition to the reasons mentioned above, the meshgrid function in NumPy supports only Cartesian indexing, i.e., x and y, not matrix indexing, i.e., rows and columns (mgrid and ogrid supports only matrix indexing). The meshgrid function in SciTools supports both indexing conventions through the indexing keyword argument. Giving the string 'ij' returns a meshgrid with matrix indexing, while 'xy' returns a meshgrid with Cartesian indexing. The difference is illustrated by the following code snippet: nx = 10 ny = 15

x = linspace(-2,2,nx)
y = linspace(-2,2,ny)

xv, yv = meshgrid(x, y, sparse=False, indexing='ij')
for i in range(nx):
    for j in range(ny):
        # treat xv[i,j], yv[i,j]

xv, yv = meshgrid(x, y, sparse=False, indexing='xy')
for i in range(nx):
    for j in range(ny):
        # treat xv[j,i], yv[j,i]

It is not entirely true that matrix indexing is not supported by the meshgrid function in NumPy because we can just switch the order of the first two input and output arguments: yv, xv = numpy.meshgrid(y, x) is the same as xv, yv = meshgrid(x, y, sparse=False, indexing='ij') However, we think it is clearer to have the logical "x, y" sequence on the left-hand side and instead adjust a keyword argument.

thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2009-07-05

thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2009-07-06

thouis commented 11 years ago

Attachment in Trac by atmention:pbrod, 2009-07-14: meshgrid.py

thouis commented 11 years ago

Comment in Trac by trac user jludlow, 2010-07-15

I would offer that it would dramatically simplify the user model to make mgrid, ogrid, and meshgrid share the same default indexing scheme, be it cartesian or matrix. Right now you have to keep a mental note-to-self that they don't behave the same way, so if you decide to replace one with the other you have to add a transpose to the results to preserve indexing intent for surrounding code. I've been been bitten myself.

thouis commented 11 years ago

Comment in Trac by atmention:mwiebe, 2011-03-24

thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2011-11-08

The timings for the original version:

In [10]: x = np.arange(10)
In [11]: %timeit X,Y = np.meshgrid(x,x)
100000 loops, best of 3: 9.79 us per loop

compared to the new proposal: In [12]: %timeit X,Y = meshgrid(x,x) 10000 loops, best of 3: 33.4 us per loop

In [13]: %timeit X,Y = meshgrid(x,x, sparse=True)
10000 loops, best of 3: 18.3 us per loop

In [14]: %timeit X,Y = meshgrid(x,x, sparse=True, copy=False)
10000 loops, best of 3: 15 us per loop

In [15]: %timeit X,Y = meshgrid(x,x, sparse=True, copy=False, indexing='ij')
100000 loops, best of 3: 12.3 us per loop
thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2011-11-08

However, the timings for the new version is even better when the size of the input vectors are larger. The timings for the original version with size=100: In [36]: x = np.arange(100)

In [37]: %timeit X,Y = np.meshgrid(x,x)
10000 loops, best of 3: 96.8 us per loop

compared to the new proposal: In [38]: %timeit X,Y = meshgrid(x,x, sparse=False, copy=True, indexing='xy') 10000 loops, best of 3: 109 us per loop

In [39]: %timeit X,Y = meshgrid(x,x, sparse=True, copy=True, indexing='xy')
100000 loops, best of 3: 18.5 us per loop

In [40]: %timeit X,Y = meshgrid(x,x, sparse=True, copy=True, indexing='ij')
100000 loops, best of 3: 15.5 us per loop

In [41]: %timeit X,Y = meshgrid(x,x, sparse=True, copy=False, indexing='ij')
10000 loops, best of 3: 12 us per loop

In [42]: %timeit X,Y = meshgrid(x,x, sparse=False, copy=False, indexing='ij')
10000 loops, best of 3: 62.2 us per loop
thouis commented 11 years ago

Comment in Trac by atmention:rgommers, 2011-11-11

Looks good. As I said on the ML, +1 on merging this after adding tests. Some comments on the patch:

thouis commented 11 years ago

Attachment in Trac by atmention:pbrod, 2011-11-14: 0001-Enhanced-meshgrid-to-generate-3D-grids-add-option-fo.patch

thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2011-11-14

Added a new patch that incorporates the suggestions from Ralf and includes tests for the sparse=True and indexing=ij cases.

thouis commented 11 years ago

Comment in Trac by atmention:rgommers, 2011-12-13

I was adding a test for >2 input arrays, and found this confusing: In [18]: [xx, yy, zz, qq] = np.meshgrid([1, 2, 3], [5, 6], [8], [10, 11, 12, 13])

In [19]: xx.shape
Out[19]: (2, 3, 1, 4)

This is due to the default 'xy' indexing, but does that make sense in case of >2 inputs?

thouis commented 11 years ago

Comment in Trac by atmention:rgommers, 2011-12-13

Put this up at https://github.com/rgommers/numpy/tree/meshgrid3d with some minor changes.

Still needs tests for >2 inputs.

I removed the "input as list" and ndgrid. Both don't really add any functionality.

thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2011-12-22

I have 2 comments to https://github.com/rgommers/numpy/tree/meshgrid3d. The first one relates to cartesian indexing for ndim>2: In Matlab the meshgrid function only accepts up to three inputs and it always return the variables with a cartesian indexing scheme (i.e., where the 2 first dimensions are switched). For ndim>3 in matlab you must use the ndgrid function, which returns the variables with a matrix indexing scheme. One could switch to matrix indexing if ndim>3, but I think that will be even more confusing. Another option is to do like Matlab:

This is both simpler and more explicit. However, a third option is to leave it as it is if adding ndgrid is too much.

The second comment is about removing the test for "input as list" Removing the the test for the "input as list" in https://github.com/rgommers/numpy/tree/meshgrid3d have the effect that a single input array is interpreted as multidimensional input as shown in the following example:

In [21]: x = np.array([1,2])
In [22]: meshgrid(x)
Out[22]: [array([[1]]), array([[2]])]

gives the same as In [23]: meshgrid(x[0],x[1]) Out[23]: [array([[1]]), array([[2]])] Is this a wanted feature for meshgrid? I think this is confusing. I would expect meshgrid(x) to raise an error message in this situation just like the call to meshgrid(1) does.

In [24]: meshgrid(x,x)
Out[24]:
[array([[1, 2],
       [1, 2]]), array([[1, 1],
       [2, 2]])]

In [25]: y = 1

In [26]: meshgrid(y)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

C:\Documents and Settings\pab\.xy\startups\<ipython console> in <module>()

C:\Documents and Settings\pab\.xy\startups\<string> in meshgrid(*xi, **kwargs)

TypeError: meshgrid() takes 2 or more arguments (1 given)

In [27]: meshgrid(y,y)
Out[27]: [array([[1]]), array([[1]])]
thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2011-12-22

thouis commented 11 years ago

Comment in Trac by atmention:rgommers, 2011-12-28

I agree that for a single list of arrays as input, it should raise the same error as for a single scalar or array. Changed this and added a test.

thouis commented 11 years ago

Comment in Trac by atmention:rgommers, 2011-12-28

Currently the output shapes are actually consistent with Matlab for 2-D and 3-D, so I documented/tested some more and left it as is. Cartesian with >3 dimensions is a bit odd anyway, so as long as it's clear what the expected shape is then I think it's fine as it is now.

Could you review the latest commits?

thouis commented 11 years ago

Comment in Trac by atmention:pbrod, 2012-01-13

The current implementation in https://github.com/rgommers/numpy/tree/meshgrid3d looks good to me. ;-) +1

thouis commented 11 years ago

Comment in Trac by atmention:rgommers, 2012-01-31

Sent PR: https://github.com/numpy/numpy/pull/192