powellb / seapy

State Estimation and Analysis in Python
MIT License

Segmentation fault in seapy.roms.interp.__interp_grids lines 261-314 #23

Closed jason-tilley closed 8 years ago

jason-tilley commented 8 years ago

I am trying to create a climatology file in seapy using a grid created with pyroms. I am getting a segmentation fault with Python 3.5.1 on OS X El Capitan. I have traced the error as far as I can, and it appears to come from lines 261-314 in interp.py. It may be occurring in __interp2_thread or interp3_thread, but I am not familiar enough with joblib to troubleshoot any further. Can anybody else reproduce this? It might be caused by an error in my grid, but I thought I'd share the segmentation fault anyway. On a side note: are there any plans for a Python 2.x version? I would love to be able to use this side-by-side with my pyroms installation. The output below is repeated several times before Python quits. Thanks.

/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/numpy/lib/shape_base.py:873: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  return c.reshape(shape_out)
/Users/redacted/src/seapy/roms/interp.py:90: UserWarning: nx or ny values are too large for stable OA, 71.000000
  warn("nx or ny values are too large for stable OA, {:f}".format(ksize))
/Users/redacted/src/seapy/lib.py:195: MaskedArrayFutureWarning: setting an item on a masked array which has a shared mask will not copy the mask and also change the original mask array in the future. Check the NumPy 1.11 release notes for more information.
  fld[lst] = nfld[lst] / count[lst]
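Since the output above contains only warnings and no trace of the crash itself, one way to get more detail might be the standard-library faulthandler module (available since Python 3.3), which writes a Python traceback when the interpreter receives a fatal signal such as a segmentation fault. The sketch below is only an illustration: the helper name enable_crash_traceback and the log file name are hypothetical, and whether crashes inside joblib worker processes are captured depends on how those workers are started.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Write a Python traceback to log_path if the interpreter crashes.

    Hypothetical helper, not part of seapy. Returns the open file object;
    the caller must keep it open for as long as the handler is enabled.
    """
    log_file = open(log_path, "w")
    # Dump the stacks of all threads on SIGSEGV, SIGFPE, SIGABRT, SIGBUS, SIGILL
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Enable the handler first, then run the call that crashes
    crash_log = enable_crash_traceback("crash.log")
    print("faulthandler enabled:", faulthandler.is_enabled())
```

Calling the helper before the to_clim call means that, if Python quits, crash.log should show which Python line was executing at the time.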

powellb commented 8 years ago

We use too many python 3 idioms to be able to go back to python 2.

A couple of things:

1) You should take heed of the warning about nx or ny being too large; the results may be strange. The nx/ny values are the decorrelation length scales that you wish to apply to the interpolation. Since it uses the lat/lon fields to interpolate, the units are degrees.

2) The warning about the mask array is new to me. I am using numpy 1.10.4. Which version are you using? 1.11? I will have to investigate that warning; however, it shouldn't cause the interpolation to fail.

3) I don't see anything in your output about a segfault. So, for the segfault: did you run 'make all' in the seapy directory? This requires a Fortran compiler to build the interpolator. You should end up with oalib.so in your seapy directory. Is that there? Could you test it explicitly:

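import numpy as np
import seapy

# Source grid: a 10 x 10 mesh with random data
x = np.arange(0, 10)
y = np.arange(0, 10)
x, y = np.meshgrid(x, y)
z = np.random.randn(10, 10)
# Target grid: the same domain at 0.2 spacing
nx = np.arange(0, 10, 0.2)
ny = np.arange(0, 10, 0.2)
nx, ny = np.meshgrid(nx, ny)
# Call the OA interpolator directly
nz, pmap = seapy.oasurf(x, y, z, nx, ny)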

Does that work?

4) The other thing is that we do something a bit non-standard: we put the Vtransform, Vstretching, theta_s, theta_b, etc. variables into the grid .nc file. The interpolators rely on the depth_rho, u, v, etc. fields in the grid class to work. The grid class tries to load the s_coordinate fields to build those depth fields; if they are not in your file, you would have to set them on the grid yourself and then call set_depth:

grid = seapy.model.asgrid(filename)
grid.vstretching = ...
grid.set_depth()

5) For your testing, you can just focus on zeta to make sure everything is working: seapy.roms.interp.to_clim(..., vmap={"zeta":"zeta"}). It should be very fast and the error (if it occurs) will be right away.
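For point 3, a quick way to confirm that the build produced the compiled interpolator might look like the sketch below. The helper name find_compiled_extensions and the oalib*.so glob pattern are assumptions for illustration (a build may append a platform tag to the file name); they are not part of seapy.

```python
from pathlib import Path


def find_compiled_extensions(package_dir, stem="oalib"):
    """List shared-library files named <stem>*.so directly inside package_dir.

    Hypothetical helper, not part of seapy. The glob also matches names
    with a platform tag between the stem and the .so suffix.
    """
    return sorted(Path(package_dir).glob(stem + "*.so"))


if __name__ == "__main__":
    # Assumes this is run from the directory that contains the seapy sources
    found = find_compiled_extensions(".")
    if found:
        print("compiled library found:", [p.name for p in found])
    else:
        print("no oalib*.so here; the build step may not have run")
```

An empty result would suggest the build step did not complete, which is worth ruling out before looking at the grid itself.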

jason-tilley commented 8 years ago

Thanks for the help. Yes, I am using numpy 1.11. I will certainly look more into the decorrelation length scales. The code you provided does work. Your 4th point is very helpful. I had noticed it was looking for depth_rho and thick_rho, which weren't in my grid. I had to set those manually (following your code) since mygrid.set_depth() didn't seem to work as expected. Probably user error. However, I had not set vstretching, so maybe that will help. I'll update you once I do that.

jason-tilley commented 8 years ago

So I set my grid vstretching and vtransform, and now it will create the depth_rho and thickness values. So that's progress. If I run with vmap={"zeta":"zeta"}, the command runs fine. However, if I set all the variables, Python still "quits unexpectedly". If I press control-c, I am still in Python and the climatology file is created, but most of the fields other than zeta are still empty. Once again, probably user error here. Maybe the output after the control-c will help.

^CProcess ForkPoolWorker-30:
Process ForkPoolWorker-29:
Traceback (most recent call last):
  File "<stdin>", line 5, in <module>
  File "/Users/username/src/seapy/roms/interp.py", line 783, in to_clim
    nx=nx, ny=ny, vmap=vmap, weight=weight, pmap=pmap)
  File "/Users/username/src/seapy/roms/interp.py", line 310, in __interp_grids
    for i in recs), copy=False)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/joblib/parallel.py", line 810, in __call__
    self.retrieve()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/joblib/parallel.py", line 757, in retrieve
    raise exception
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/joblib/parallel.py", line 727, in retrieve
    self._output.extend(job.get())
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 602, in get
    self.wait(timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/multiprocessing/pool.py", line 599, in wait
    self._event.wait(timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 549, in wait
    signaled = self._cond.wait(timeout)
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/threading.py", line 293, in wait
    waiter.acquire()
KeyboardInterrupt

jason-tilley commented 8 years ago

The problem was indeed user error. My grid was depth-negative and left the water at some points. After fixing the grid, the segmentation fault stopped. Thanks for the help.
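For anyone who hits the same symptom, a check along the following lines might catch this kind of grid problem before interpolating. It is a minimal sketch, assuming the usual ROMS conventions that the bathymetry h is positive downward and that mask_rho is 1 for water and 0 for land. The function name find_bad_depths and the min_depth parameter are hypothetical, and the arrays are assumed to have been read from the grid file already.

```python
import numpy as np


def find_bad_depths(h, mask_rho, min_depth=0.0):
    """Return (row, col) indices of water points whose depth is <= min_depth.

    Hypothetical helper, not part of seapy or pyroms. Assumes h is positive
    downward and mask_rho is 1 for water and 0 for land.
    """
    h = np.asarray(h, dtype=float)
    wet = np.asarray(mask_rho) > 0
    # Flag only water points; non-positive depths on land are ignored
    bad = wet & (h <= min_depth)
    return np.argwhere(bad)


if __name__ == "__main__":
    # Tiny made-up grid: one wet point has a negative depth
    h = [[10.0, -2.0], [0.0, 5.0]]
    mask_rho = [[1, 1], [0, 1]]
    print(find_bad_depths(h, mask_rho))
```

If the function returns any indices, the grid has water points at or above the surface under the stated sign convention, and those points would need to be fixed or masked.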