PCMDI / pcmdi_metrics

Open-source Python package for Systematic Evaluation of Climate and Earth System Models
http://pcmdi.github.io/pcmdi_metrics/
BSD 3-Clause "New" or "Revised" License

[Bug]: Assertion Error in create_land_sea_mask() #1154

Open acordonez opened 1 month ago

acordonez commented 1 month ago

What happened?

I called create_land_sea_mask() with method="pcmdi" and got an AssertionError. The error does not happen when the default method is used.

sft = create_land_sea_mask(ds,method="pcmdi")

What did you expect to happen? Are there any possible answers you came across?

I expected this to return a land/sea mask generated with the pcmdi method.

Minimal Complete Verifiable Example (MVCE)

import xcdat
from pcmdi_metrics.utils import create_land_sea_mask

f="/global/cfs/projectdirs/m3522/cmip6/LOCA2/GFDL-CM4/0p0625deg/r1i1p1f1/historical/tasmax/tasmax.GFDL-CM4.historical.r1i1p1f1.1950-2014.LOCA_16thdeg_v20220413.nc"

ds=xcdat.open_dataset(f).sel({"time":slice("1981-01-01","1981-01-31")}) 
# Selecting a smaller amount of data for example

sft = create_land_sea_mask(ds,method="pcmdi")

Relevant log output

---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
Cell In[4], line 5
      3 start_year = 1981
      4 end_year = 1982
----> 5 sft = create_land_sea_mask(ds,method="pcmdi")
      6 sftlf = ds.copy(data=None)
      7 sftlf["sftlf"] = sft

File ~/miniconda3/envs/pmp_drcdm/lib/python3.10/site-packages/pcmdi_metrics/utils/land_sea_mask.py:78, in create_land_sea_mask(obj, lon_key, lat_key, as_boolean, method)
     74         land_sea_mask = xr.where(land_sea_mask, 0, 1)
     76 elif method.lower() == "pcmdi":
     77     # Use the PCMDI method developed by Taylor and Doutriaux (2000)
---> 78     land_sea_mask = generate_land_sea_mask__pcmdi(obj)
     80     if as_boolean:
     81         # Convert the 1/0 land-sea mask to a boolean mask
     82         land_sea_mask = land_sea_mask.astype(bool)

File ~/miniconda3/envs/pmp_drcdm/lib/python3.10/site-packages/pcmdi_metrics/utils/land_sea_mask.py:360, in generate_land_sea_mask__pcmdi(target_grid, source, data_var, maskname, regridTool, threshold_1, threshold_2, debug)
    318 """Generates a best guess mask on any rectilinear grid, using the method described in `PCMDI's report #58`_
    319 
    320 Parameters
   (...)
    356 2023-06 The [original code](https://github.com/CDAT/cdutil/blob/master/cdutil/create_landsea_mask.py) was rewritten using xarray and xcdat by Jiwoo Lee
    357 """
    359 if source is None:
--> 360     egg_pth = resources.resource_path()
    361     source_path = os.path.join(egg_pth, "navy_land.nc")
    362     if not os.path.isfile(source_path):
    363         # pip install process places data files in different place, so checking here as well

File ~/miniconda3/envs/pmp_drcdm/lib/python3.10/site-packages/pcmdi_metrics/resources.py:24, in resource_path()
     21     res_path = os.path.join(os.getcwd(), "share", "pmp")
     23 # Should never fail this
---> 24 assert os.path.exists(res_path)
     26 return res_path

AssertionError:

Anything else we need to know?

No response

Environment

I am working in a Jupyter notebook on NERSC, using the kernel installed from my branch 1098_ao_drcdm.

lee1043 commented 1 month ago

It looks like the code is not finding the navy_land.nc file. That file originally lives in share/data and is listed in setup.py so that it is available after installation. Not sure why...
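
The bare assert in resource_path() fails without a message, so the traceback does not show which directory was checked. A minimal sketch of a more informative check follows. The helper name require_dir is hypothetical and this is not the package's actual code; it also tests for a directory with os.path.isdir rather than os.path.exists, which is an assumption about what the caller needs.

```python
import os


def require_dir(path):
    """Return path if it is an existing directory, otherwise raise an error
    that names the path that was checked.

    Hypothetical helper for illustration; not part of pcmdi_metrics.
    """
    if not os.path.isdir(path):
        raise FileNotFoundError(f"resource directory not found: {path!r}")
    return path


if __name__ == "__main__":
    # The current working directory always exists, so this call succeeds.
    print(require_dir(os.getcwd()))
```

With a check like this, a failure reports the missing directory instead of an empty AssertionError.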

acordonez commented 1 month ago

@lee1043 Just adding that I've tested this in a fresh environment, still installing my branch 1098_ao_drcdm. I'm seeing the same error.

acordonez commented 1 month ago

I looked around in my miniconda folder and confirmed that miniconda3/envs/pmp_test/share/pmp/navy_land.nc exists and is a valid netcdf file.
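
Since the file exists in the environment, one thing worth checking is whether the notebook kernel is actually looking under that same environment. The sketch below lists whether share/pmp/navy_land.nc exists under a few candidate prefixes. The helper locate_resource is hypothetical. The os.getcwd() candidate comes from the traceback above; using sys.prefix as the other candidate is an assumption about how resource_path() resolves the install location.

```python
import os
import sys


def locate_resource(filename, prefixes=None):
    """Return {candidate_path: exists} for share/pmp/<filename> under each prefix.

    Hypothetical diagnostic helper; not part of pcmdi_metrics.
    """
    if prefixes is None:
        # sys.prefix is the environment of the running interpreter (assumed
        # candidate); os.getcwd() is the fallback visible in the traceback.
        prefixes = [sys.prefix, os.getcwd()]
    results = {}
    for prefix in prefixes:
        candidate = os.path.join(prefix, "share", "pmp", filename)
        results[candidate] = os.path.isfile(candidate)
    return results


if __name__ == "__main__":
    # Shows which interpreter the kernel runs, then each candidate path.
    print("interpreter:", sys.executable)
    for path, found in locate_resource("navy_land.nc").items():
        print("found  " if found else "missing", path)
```

Running this inside the notebook shows which interpreter the kernel uses and whether that environment's share/pmp directory holds the file.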

lee1043 commented 1 month ago

I created a fresh env on my Mac but couldn't reproduce the error.

conda create -n pmp_v3.6.1 -c conda-forge pcmdi_metrics

from pcmdi_metrics.utils import create_target_grid, create_land_sea_mask
grid = create_target_grid(-90, 90, 0, 360, target_grid_resolution="5x5")
mask = create_land_sea_mask(grid)  
mask2 = create_land_sea_mask(grid, method='pcmdi')  

This sample code worked well.

I will do some more tests to narrow down.

@acordonez does the error occur only when you install from your branch, or also when you install from the main branch in your env?

lee1043 commented 1 month ago

The minimal code in the above comment works well in a fresh env on NERSC Perlmutter. However, the original example code from the first comment gives me the following error. I think that may be because method="pcmdi" was designed and tested for global data, while the data loaded by this code is LOCA2 regional data.

import xcdat
from pcmdi_metrics.utils import create_land_sea_mask

f="/global/cfs/projectdirs/m3522/cmip6/LOCA2/GFDL-CM4/0p0625deg/r1i1p1f1/historical/tasmax/tasmax.GFDL-CM4.historical.r1i1p1f1.1950-2014.LOCA_16thdeg_v20220413.nc"

ds=xcdat.open_dataset(f).sel({"time":slice("1981-01-01","1981-01-31")}) 
# Selecting a smaller amount of data for example

sft = create_land_sea_mask(ds,method="pcmdi")
>>> sft = create_land_sea_mask(ds,method="pcmdi")
Traceback (most recent call last):
  File "/global/homes/l/lee1043/.conda/envs/pmp_20241001_v3.6.1/lib/python3.10/site-packages/xcdat/regridder/regrid2.py", line 557, in _get_bounds_ensure_dtype
    name = ds.cf.bounds[axis][0]
KeyError: 'Y'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/global/homes/l/lee1043/.conda/envs/pmp_20241001_v3.6.1/lib/python3.10/site-packages/pcmdi_metrics/utils/land_sea_mask.py", line 78, in create_land_sea_mask
    land_sea_mask = generate_land_sea_mask__pcmdi(obj)
  File "/global/homes/l/lee1043/.conda/envs/pmp_20241001_v3.6.1/lib/python3.10/site-packages/pcmdi_metrics/utils/land_sea_mask.py", line 382, in generate_land_sea_mask__pcmdi
    ds_regrid = ds.regridder.horizontal(data_var, target_grid, tool=regridTool)
  File "/global/homes/l/lee1043/.conda/envs/pmp_20241001_v3.6.1/lib/python3.10/site-packages/xcdat/regridder/accessor.py", line 205, in horizontal
    output_ds = regridder.horizontal(data_var, self._ds)
  File "/global/homes/l/lee1043/.conda/envs/pmp_20241001_v3.6.1/lib/python3.10/site-packages/xcdat/regridder/regrid2.py", line 74, in horizontal
    dst_lat_bnds = _get_bounds_ensure_dtype(self._output_grid, "Y")
  File "/global/homes/l/lee1043/.conda/envs/pmp_20241001_v3.6.1/lib/python3.10/site-packages/xcdat/regridder/regrid2.py", line 559, in _get_bounds_ensure_dtype
    raise RuntimeError(f"Could not determine {axis!r} bounds")
RuntimeError: Could not determine 'Y' bounds

lee1043 commented 1 month ago

The code below works for me in a fresh PMP env on NERSC, although it still does not address the original issue.

import xcdat
from pcmdi_metrics.utils import create_land_sea_mask

f="/global/cfs/projectdirs/m3522/cmip6/LOCA2/GFDL-CM4/0p0625deg/r1i1p1f1/historical/tasmax/tasmax.GFDL-CM4.historical.r1i1p1f1.1950-2014.LOCA_16thdeg_v20220413.nc"

ds=xcdat.open_dataset(f).sel({"time":slice("1981-01-01","1981-01-31")}) 

ds.lat.attrs['axis'] = 'Y'
ds.lon.attrs['axis'] = 'X'

sft = create_land_sea_mask(ds,method="pcmdi")
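
The workaround above can be wrapped so that the axis attribute is only filled in when it is missing. This is a minimal sketch: ensure_axis_attrs is a hypothetical helper, and it assumes the coordinates are named lat and lon and expose a dict-like .attrs, as xarray coordinates do. Stand-in objects are used here so the example runs without the LOCA2 file.

```python
from types import SimpleNamespace


def ensure_axis_attrs(ds, lat_key="lat", lon_key="lon"):
    """Set axis='Y'/'X' on the lat/lon coordinates only if not already set.

    Hypothetical helper; returns the list of coordinate keys it modified.
    """
    changed = []
    for key, axis in ((lat_key, "Y"), (lon_key, "X")):
        attrs = ds[key].attrs
        if "axis" not in attrs:
            attrs["axis"] = axis
            changed.append(key)
    return changed


# Stand-in for a dataset: a mapping of coordinate name -> object with .attrs.
fake_ds = {
    "lat": SimpleNamespace(attrs={"units": "degrees_north"}),
    "lon": SimpleNamespace(attrs={"units": "degrees_east", "axis": "X"}),
}
print(ensure_axis_attrs(fake_ds))  # only "lat" lacks an axis attribute here
```

Returning the modified keys makes it easy to log which coordinates were missing the attribute before calling create_land_sea_mask().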

acordonez commented 1 month ago

@lee1043 Thanks for looking into this! I'll see what happens if I pip install from the main branch, and try the decision-relevant code in a Jupyter notebook.