Open stefraynaud opened 11 years ago
Thanks Stephane,
Probably too late for 1.3 scheduled next week, but I'll take a look, otherwise I hope it makes in 1.4
C.
Stephane,
That's what I get for being so slow to take a real look at it, can you point/send me the file?
lib/python/vacumm/misc/grid/misc.py: unknown location HTTP Response Status 404 Not Found Python Traceback Traceback (most recent call last): File "/usr/share/gforge/www/scm/viewvc/lib/viewvc.py", line 3612, in main request.run_viewvc() File "/usr/share/gforge/www/scm/viewvc/lib/viewvc.py", line 314, in run_viewvc % self.where, '404 Not Found') ViewVCException: 404 Not Found: lib/python/vacumm/misc/grid/misc.py: unknown location
Powered By FusionForge
ok just found the file. will take a look
@stefraynaud the following seems to work, holding on for 2.2 until I can understandexactly what you mean:
import cdms2
import os
import sys
f=cdms2.open(os.path.join(sys.prefix,"sample_data","sampleCurveGrid4.nc"))
s=f("sample",latitude=(-20,20,"ccb"),longitude=(20,40,"ccb"))
print s.shape
@doutriaux1, in your example, the grid has a resolution close of several degrees. The problem arises when the grid has a higher resolution (<<1 degree).
Try this example:
import cdms2
import os
import sys
f=cdms2.open(os.path.join(sys.prefix,"sample_data","swan.four.nc"))
f('HS',lon=(-5.1,-4),lat=(48.42,49))
f.close()
@stefraynaud ok so that works for me, i get the following, are you saying it select too mch (lots of missing) ?
so you simply have more RAM than me:
>>> f('HS',lon=(-5.1,-4),lat=(48.42,49))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/site/data/yoch/raynaud/soft/uvcdat-latest/lib/python2.7/site-packages/cdms2/cudsinterface.py", line 45, in __call__
return v(*args, **kwargs)
File "/usr/site/data/yoch/raynaud/soft/uvcdat-latest/lib/python2.7/site-packages/cdms2/avariable.py", line 155, in __call__
grid=grid)
File "/usr/site/data/yoch/raynaud/soft/uvcdat-latest/lib/python2.7/site-packages/cdms2/selectors.py", line 151, in unmodified_select
mask, indexspecs = vargrid.intersect(specs)
File "/usr/site/data/yoch/raynaud/soft/uvcdat-latest/lib/python2.7/site-packages/cdms2/hgrid.py", line 612, in intersect
index = self.getIndex()
File "/usr/site/data/yoch/raynaud/soft/uvcdat-latest/lib/python2.7/site-packages/cdms2/hgrid.py", line 596, in getIndex
self._index_ = bindex.bindexHorizontalGrid(latlin, lonlin)
File "/usr/site/data/yoch/raynaud/soft/uvcdat-latest/lib/python2.7/site-packages/cdms2/bindex.py", line 17, in bindexHorizontalGrid
head = numpy.zeros(NI*NJ,dtype='l') # This should match NBINI, NBINJ in bindex.c
MemoryError
>>>
Check the memory you are using, and compare with your example.
@doutriaux1 @stefraynaud there was a pretty nasty bug in numpy <1.9.0 related to sorting which I found used an incredible amount of memory and CPU cycles on fairly large arrays (https://github.com/numpy/numpy/issues/4814) it would seem to me that the testing above is on a recent (2.0.0?) version which now includes numpy 1.9.0.. So you might find things have changed a little if the numpy issue was the problem..
@durack1 I think this bug is not related to any numpy issue, but to the (NI,NJ) you can see in the error message above. In fact, I'm indirectly responsible for this problem because it has been introduced by @doutriaux1 when I told him that geographical selections on curvilinear grid were not working due to the low resolution binning : @doutriaux1 increased the resolution to match the grid resolution, but the binning is still applied to the whole earth domain, and not a smaller domain just embedding the grid domain.
In addition to the memory issue, the selection operation is not as exact as in the rectangular case. This is why I implemented my own solution, which is sufficiently fast so it doesn't need a system with a cache like with bindex in cdms2.
@stefraynaud ok I get the issue now. Will try to track your file again and apply it. Thanks.
sadly moving to 2.3....
I tried @stefraynaud example and it worked with numpy 1.11 PR #2125 . Is this still an issue?
The problem highlighted in this ticket is a memory issue due to the algo to extract slices on a curvilinear grid. It has nothing to see with numpy.
As already reported in the devel branch, when one try to select a subregion of a variable on a curvilinear grid (2D axes) whose spatial resolution is high, the _bindex module creates a huge matrix of typical size (8_40_10e6/dxy)^2, where dxy is the spatial resolution if the grid, 40*10e6 is the perimeter of earth. The algorithm helps selection on small grid, but the bindex matrix convers the whole earth, whereas it should only overlap the current grid. A small intervention should fix it I guess.
In addition, the current algo does not allow selection using bounds, like (lon1, lon2, 'ccb'). It has been designed to be fast, but I'm not sure that these operation takes so much time. It wrote a function that convert coordinates selection to slices and mask for 1D or 2D axes, or grid (curvilinear or rectangular) : http://relay.actimar.fr/~raynaud/vacumm/library/misc.grid.misc.html#vacumm.misc.grid.misc.coord2slice The code is there : https://forge.ifremer.fr/scm/viewvc.php/lib/python/vacumm/misc/grid/misc.py?view=markup&root=vacumm
Maybe an optimal solution would be to merge the two approaches.