Open lgloege opened 5 years ago
At which step does it die? Does the rectilinear example work? Can you import ESMPy with import ESMF?
Yes, no issues with import ESMF
the kernel dies here:
regridder = xe.Regridder(ds, ds_out, 'bilinear')
regridder
I guess it is a system-specific ESMPy installation issue. Can you run the code on your own laptop, or on other machines?
Alternatively you can run the test suite to see which part dies:
pip install pytest
git clone https://github.com/JiaweiZhuang/xESMF.git
cd xESMF
pytest -v xesmf
You should see something like this if all tests succeed:
xesmf/tests/test_backend.py::test_flag PASSED [ 4%]
xesmf/tests/test_backend.py::test_warn_f_on_array PASSED [ 9%]
xesmf/tests/test_backend.py::test_warn_f_on_grid PASSED [ 14%]
xesmf/tests/test_backend.py::test_warn_lat_range PASSED [ 19%]
xesmf/tests/test_backend.py::test_esmf_grid_with_corner PASSED [ 23%]
xesmf/tests/test_backend.py::test_esmf_build_bilinear PASSED [ 28%]
xesmf/tests/test_backend.py::test_regrid PASSED [ 33%]
xesmf/tests/test_backend.py::test_regrid_periodic_wrong PASSED [ 38%]
xesmf/tests/test_backend.py::test_regrid_periodic_correct PASSED [ 42%]
xesmf/tests/test_frontend.py::test_as_2d_mesh PASSED [ 47%]
xesmf/tests/test_frontend.py::test_build_regridder PASSED [ 52%]
xesmf/tests/test_frontend.py::test_existing_weights PASSED [ 57%]
xesmf/tests/test_frontend.py::test_conservative_without_bounds PASSED [ 61%]
xesmf/tests/test_frontend.py::test_build_regridder_from_dict PASSED [ 66%]
xesmf/tests/test_frontend.py::test_regrid PASSED [ 71%]
xesmf/tests/test_frontend.py::test_regrid_periodic_wrong PASSED [ 76%]
xesmf/tests/test_frontend.py::test_regrid_periodic_correct PASSED [ 80%]
xesmf/tests/test_frontend.py::test_regrid_with_1d_grid PASSED [ 85%]
xesmf/tests/test_util.py::test_grid_global PASSED [ 90%]
xesmf/tests/test_util.py::test_grid_global_bad_resolution PASSED [ 95%]
xesmf/tests/test_util.py::test_cell_area PASSED [100%]
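If the full test suite is too coarse, a minimal sketch along these lines runs a single statement in a child Python process, so that a hard crash takes down the child rather than the notebook kernel. This is not part of xESMF: the helper names run_isolated and describe are made up for illustration, and the sketch assumes Python 3.7+ on a POSIX system, where a negative return code means the child was killed by a signal.

```python
import signal
import subprocess
import sys


def run_isolated(code, timeout=600):
    """Run `code` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). With faulthandler on, a crash in C code
    should leave a Python-level traceback in stderr.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", code],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def describe(returncode):
    """Turn a child return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # POSIX convention: -N means the process was killed by signal N.
        try:
            return "killed by " + signal.Signals(-returncode).name
        except ValueError:
            return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode
```

For instance, passing "import ESMF" to run_isolated and then the return code to describe should tell you whether the import alone survives; the same can be repeated with a short script that builds the regridder.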
I am having the same issue. I followed the instructions above and the test just 'dropped out' after the second...not sure how to diagnose this problem further.
platform darwin -- Python 3.6.7, pytest-4.0.1, py-1.7.0, pluggy-0.8.0 -- /Users/juliusbusecke/miniconda/envs/standard/bin/python
cachedir: .pytest_cache
rootdir: /Users/juliusbusecke/xESMF, inifile:
plugins: remotedata-0.3.1, openfiles-0.3.1, doctestplus-0.1.3, cov-2.6.0, arraydiff-0.2
collected 20 items
xesmf/tests/test_backend.py::test_flag PASSED [ 5%]
xesmf/tests/test_backend.py::test_warn_f_on_array PASSED [ 10%]
xesmf/tests/test_backend.py::test_warn_f_on_grid %
I tried this on my laptop, will check on one of our clusters.
@jbusecke I was not able to resolve the problem on NCAR's cheyenne or Columbia's habanero cluster. However, I was able to get it working on a local server. @raphaeldussin believes it's an ESMPy issue. Check to make sure you can import esmpy. @raphaeldussin, do you have any comments on this?
I just got it to run on Princeton's tiger. I was able to import esmpy on my local machine before.
The last time we pulled esmpy from conda on @lgloege's machine, it failed to run import ESMF properly.
I was under the impression that the conda recipe was broken. I am trying to reinstall now in a clean conda environment but it's not working at the moment. A workaround is to install ESMF/ESMPy from scratch. I've done it several times and my scripts are on gist.github
@lgloege update, this works:
conda create -n test_esmpy python=3.7
source activate test_esmpy
conda install esmpy
conda install xesmf
git clone https://github.com/JiaweiZhuang/xESMF.git xESMF
cd xESMF/
py.test -v xesmf
@jbusecke I had the same problem you described and narrowed it down to MPI not initializing correctly. It turns out I needed to add my computer's name to /etc/hosts; perhaps the mismatch between that file and what hostname was returning was the issue.
Adding the last line below worked for me:
##
# Host Database
#
# localhost is used to configure the loopback interface
# when the system is booting. Do not change this entry.
##
127.0.0.1 localhost
127.0.0.1 <computer's name>
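To check whether a machine is in the situation described above, a small sketch like the following asks the system resolver whether the local hostname maps to an address. Whether a failed lookup is really what breaks MPI initialization is only the guess made in this comment. The function name check_hostname and its resolver parameter are illustrative, not from any library.

```python
import socket


def check_hostname(name=None, resolver=socket.gethostbyname):
    """Return (name, address) if `name` resolves, else (name, None).

    `name` defaults to what the `hostname` command would report.
    `resolver` is injectable only so the function can be tested offline.
    """
    if name is None:
        name = socket.gethostname()
    try:
        return name, resolver(name)
    except OSError:
        # socket.gaierror and socket.herror are subclasses of OSError.
        return name, None


name, address = check_hostname()
if address is None:
    print("%s does not resolve; consider adding it to /etc/hosts" % name)
else:
    print("%s resolves to %s" % (name, address))
```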
Has someone resolved this for good?
This is what I get after running pytest as @raphaeldussin did. It seems to be an ESMF regrid-related issue.
xesmf/tests/test_backend.py::test_warn_f_on_array PASSED [ 4%]
xesmf/tests/test_backend.py::test_warn_f_on_grid PASSED [ 8%]
xesmf/tests/test_backend.py::test_warn_lat_range PASSED [ 12%]
xesmf/tests/test_backend.py::test_esmf_grid_with_corner PASSED [ 16%]
xesmf/tests/test_backend.py::test_esmf_build_bilinear PASSED [ 20%]
xesmf/tests/test_backend.py::test_regrid Fatal Python error: Segmentation fault
Current thread 0x00007f26cea4e740 (most recent call first):
File "/projects/home/iff/.local/lib/anaconda3/envs/test_esmpy/lib/python3.7/site-packages/ESMF/interface/cbindings.py", line 2145 in ESMP_FieldRegridStoreFile
@iacopoff What's the exact command you use to install the packages? What OS are you on? Could you reproduce the error with a Dockerfile?
@JiaweiZhuang, I will use Docker as soon as I learn how to use it!
I used these commands:
conda create -n test_esmpy python=3.7
source activate test_esmpy
conda install esmpy
conda install xesmf
git clone https://github.com/JiaweiZhuang/xESMF.git xESMF
cd xESMF/
py.test -v xesmf
and my OS and some other info:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 44
Model name: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz
Stepping: 2
CPU MHz: 2393.983
BogoMIPS: 4787.96
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 12288K
NUMA node0 CPU(s): 0-11
@lgloege Use the conda-forge channel:
conda install -c conda-forge xesmf
Hi, this works in a fresh environment:
conda install -c conda-forge esmpy
conda install -c conda-forge xesmf
It also works in an environment where xarray and dask are already installed, but make sure their versions match the requirements of esmpy and xesmf.
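To see which versions actually ended up in the environment, a short sketch like this one reads the installed package metadata. It assumes Python 3.8+ (for importlib.metadata) and that each package registers Python metadata under the name used here. The function name installed_versions is illustrative. ESMF itself is a compiled library rather than a Python package, so conda list remains the authoritative source for it and for build strings.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_versions(names):
    """Map each distribution name to its installed version, or None."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Package names assumed from this thread; adjust as needed.
for name, version in installed_versions(["esmpy", "xesmf", "xarray", "dask"]).items():
    print(name, version or "not found")
```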
Hi - has this been resolved?
I also get my kernel dying at the xe.Regridder() step. I have started a fresh environment using the esmpy and xesmf installations with the conda-forge channel, and the output of the pytest -v xesmf command is:
=================================================================== test session starts ===================================================================
platform darwin -- Python 3.7.6, pytest-5.4.2, py-1.8.1, pluggy-0.13.1 -- /Users/jessicaluo/miniconda3/envs/test_esmpy/bin/python
cachedir: .pytest_cache
rootdir: /Users/jessicaluo/Desktop/xESMF
collected 42 items
xesmf/tests/test_backend.py::test_warn_f_on_array PASSED [ 2%]
xesmf/tests/test_backend.py::test_warn_f_on_grid PASSED [ 4%]
xesmf/tests/test_backend.py::test_warn_lat_range PASSED [ 7%]
xesmf/tests/test_backend.py::test_esmf_grid_with_corner PASSED [ 9%]
xesmf/tests/test_backend.py::test_esmf_build_bilinear PASSED [ 11%]
xesmf/tests/test_backend.py::test_regrid Fatal Python error: Illegal instruction
Current thread 0x00007fff97846380 (most recent call first):
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/ESMF/interface/cbindings.py", line 2174 in ESMP_FieldRegridStoreFile
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/ESMF/util/decorators.py", line 52 in new_func
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/ESMF/api/regrid.py", line 151 in __init__
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/ESMF/util/decorators.py", line 64 in new_func
File "/Users/jessicaluo/Desktop/xESMF/xesmf/backend.py", line 280 in esmf_regrid_build
File "/Users/jessicaluo/Desktop/xESMF/xesmf/tests/test_backend.py", line 139 in test_regrid
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/python.py", line 182 in pytest_pyfunc_call
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/python.py", line 1477 in runtest
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 135 in pytest_runtest_call
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 217 in <lambda>
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 244 in from_call
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 217 in call_runtest_hook
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 186 in call_and_report
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 100 in runtestprotocol
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/runner.py", line 85 in pytest_runtest_protocol
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/main.py", line 272 in pytest_runtestloop
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/main.py", line 247 in _main
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/main.py", line 191 in wrap_session
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/main.py", line 240 in pytest_cmdline_main
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/callers.py", line 187 in _multicall
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 87 in <lambda>
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/manager.py", line 93 in _hookexec
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/pluggy/hooks.py", line 286 in __call__
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/lib/python3.7/site-packages/_pytest/config/__init__.py", line 125 in main
File "/Users/jessicaluo/miniconda3/envs/test_esmpy/bin/pytest", line 11 in <module>
Illegal instruction: 4
@jessluo can you post the versions of esmf, esmpy, and xesmf you're using?
Not sure if it's related but I had issues recently that I solved using the mpich build of esmf. Maybe try to install those specific builds:
- esmf=8.0.0=mpi_mpich_hd6ca8f3_103
- esmpy=8.0.0=mpi_mpich_py36ha9b28fa_101
I am having the same issue as @jessluo; using:
- esmf=8.0.0=mpi_mpich_h31c0ad6_107
- esmpy=8.0.0=mpi_mpich_py36ha9b28fa_101
- xesmf=0.3.0=py_0
Packages are built in a clean environment using conda-forge
@KellyNthn are you on mac or linux?
I am on a Mac; OS X 10.13
I've been experiencing some issues on my mac as well. Not sure what's the problem right now. My Linux seems to be working ok, just installed a brand new env and all tests worked
I'm having the same issues on a Mac. So far I have tried v8.1.0 (mpi and openmpi) without success.
Hi, I see this is an old issue, but it is not closed, so I will give it a try.
My kernel dies specifically when I use periodic=True. I have tried to run
regridder = xe.Regridder(ds1, ds2, method='bilinear', filename=weights, periodic=True)
with both ds1 and ds2 as dask arrays, and with ds1 and ds2 as loaded arrays (i.e. ds1.isel(time=0).load()). The kernel crashes in both cases.
With dask arrays I was expecting it to be a lazy operation, so I'm not sure why "something" is happening, and why there is such a huge difference when using periodic (I understand that some extra interpolation happens in the periodic case, but it is a lot):
%%time
test = xe.Regridder(ds1, ds2, method='bilinear', filename='testw.nc')
CPU times: user 31.4 s, sys: 239 ms, total: 31.7 s
Wall time: 31.9 s
vs
%%time
test = xe.Regridder(ds1, ds2, method='bilinear', filename='testw.nc', periodic=True)
CPU times: user 5min 19s, sys: 15.9 s, total: 5min 35s
Wall time: 5min 35s
I can "fix" this by using more memory (i.e. requesting the whole node), but is there something I am missing to make this a dask computation?
EDIT: Whoops! I realized this issue is about a tutorial example, and I'm not doing exactly that, but this is the only issue I found about kernel crashing on regridder. I can make it its own thing if needed.
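On the laziness question: as far as I understand, xe.Regridder computes the weights eagerly through ESMPy when it is constructed, and dask only comes into play later, when the regridder is applied to data. Under that assumption, the cost shown in the timings above is paid once per grid pair and can be avoided on later runs by reusing the weight file. A minimal sketch follows. The helper name make_regridder is made up, while filename and reuse_weights are the xe.Regridder keywords as of xESMF 0.3.

```python
import os


def make_regridder(ds_in, ds_out, weights_path, periodic=False):
    """Build a bilinear regridder, reusing `weights_path` if it exists."""
    import xesmf as xe  # imported here so the helper can be defined without xESMF

    # Only ask xESMF to reuse weights when the file is really there.
    reuse = os.path.exists(weights_path)
    return xe.Regridder(
        ds_in,
        ds_out,
        "bilinear",
        periodic=periodic,
        filename=weights_path,
        reuse_weights=reuse,
    )
```

The first call still needs enough memory for ESMF to build the weights, so this does not remove the crash itself. It only means the expensive step can be done once, for example on a large-memory node, and skipped afterwards.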
I am running the tutorial at https://xesmf.readthedocs.io/en/latest/Curvilinear_grid.html and the kernel dies. I updated all packages and still have issues. Does anybody have any idea where the incompatibility is coming from? Here is the environment I am running in:
pip packages