geoschem / gcpy

Python toolkit for GEOS-Chem. Contains basic plotting scripts, plus the suite of GEOS-Chem benchmarking utilities.
https://gcpy.readthedocs.io
Other
51 stars 24 forks source link

GCPy not properly recognizing 0.125x0.15625 lat-lon grid #314

Open msulprizio opened 6 months ago

msulprizio commented 6 months ago

Name and Institution (Required)

Name: Melissa Sulprizio Institution: Harvard

Description of your issue or question

When trying to create plots comparing 12-km (0.125x0.15625 degree) lat-lon grid to other resolutions, I am getting a similar error as reported in #313.

ValueError: operands could not be broadcast together with shapes (402,447) (402,416)

I am able to plot the 0.125x0.15625 output against itself, but I noticed the resolution string is incorrect. For example:

Screenshot 2024-04-18 at 3 25 44 PM

I added debug statements in grid.py for the longitude: [-129.984 -129.8277 -129.6714... -60.5868 -60.4305 -60.2742]

which returns the longitude resolution of 0.15629999999998745

I suspect there's an issue with numerical roundoff happening and we haven't encountered it because previous resolution strings haven't extended beyond 4 decimal places.

yantosca commented 5 months ago

@msulprizio: I will look into this. It may be an easy fix like you said, due to roundoff.

yantosca commented 5 months ago

@msulprizio: I believe the issue is in the get_input_res function of gcpy/grid.py. This is used to get the lon & lat resolution from the input xr.Dataset or xr.DataArray to the plotting codes.

It could be that the roundoff is happening in the netCDF files themselves. I tested the algorithm in get_input_res with this script:

#!/usr/bin/env python3

import numpy as np
from gcpy.grid import make_grid_LL

grid_12km = make_grid_LL("0.125x0.15625")

grid_12km["lat"].sort()
grid_12km["lon"].sort()
lat_res = np.abs(grid_12km["lat"][2] - grid_12km["lat"][1])
lon_res = np.abs(grid_12km["lon"][2] - grid_12km["lon"][1])

res = str(lat_res) + "x" + str(lon_res)

print(lon_res)
print(lat_res)
print(res)

and I got this output

0.15625
0.125
0.125x0.15625

So I think the algorithm is OK. Maybe the the longitude and latitude values in the netCDF file have some roundoff (if they are REAL*4) and this could be causing it. We could put in a workaround to force the correct resolution string.

yantosca commented 5 months ago

@msulprizio: Has this been resolved?

msulprizio commented 4 months ago

I have not confirmed that this is resolved now. I will report back when I get the 0.125x0.15625 transport tracer runs past a wetdep error.

msulprizio commented 2 months ago

This is still an issue. I am comparing 0.125x0.15625 output over North America to 0.25x0.3125 using the compare_diags.py. script and this compare_diags.yml as input. I am still getting the error:

Using configuration file compare_diags.yml                                                                                                                                                                                    

Comparing variable names in compare_varnames                                                                                                                                                                                  
32 common variables                                                                                                                                                     
0 variables in ref only                                                                                                                                                                 
0 variables in dev only                                                                                                                                                                 
All variables have same dimensions in ref and dev                                                                                                                                                                             
... Creating single-level plots                                                                                                                                                                                         
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/externals/loky/process_executor.py", line 463, in _process_worker
    r = call_item()
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/externals/loky/process_executor.py", line 291, in __call__
    return self.fn(*self.args, **self.kwargs)
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 589, in __call__
    return [func(*args, **kwargs)
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 589, in <listcomp>
    return [func(*args, **kwargs)
  File "/n/home05/msulprizio/python/gcpy/gcpy/plot/compare_single_level.py", line 794, in createfig
    absdiff = np.array(ds_dev_cmp) - np.array(ds_ref_cmp)
ValueError: operands could not be broadcast together with shapes (402,448) (401,448) 
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/n/holylfs05/LABS/jacob_lab/Users/msulprizio/Runs/12km_validation/test_1day/./compare_diags.py", line 367, in <module>
    main(sys.argv)
  File "/n/holylfs05/LABS/jacob_lab/Users/msulprizio/Runs/12km_validation/test_1day/./compare_diags.py", line 362, in main
    compare_data(config, read_data(config))
  File "/n/holylfs05/LABS/jacob_lab/Users/msulprizio/Runs/12km_validation/test_1day/./compare_diags.py", line 292, in compare_data
    compare_single_level(
  File "/n/home05/msulprizio/python/gcpy/gcpy/plot/compare_single_level.py", line 1150, in compare_single_level
    results = Parallel(n_jobs=n_job)(
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 1952, in __call__
    return output if self.return_generator else list(output)
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 1595, in _get_outputs
    yield from self._retrieve()
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 1699, in _retrieve
    self._raise_error_fast()
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 1734, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 736, in get_result
    return self._return_or_raise()
  File "/n/home05/msulprizio/python/micromamba/envs/gcpy_env/lib/python3.9/site-packages/joblib/parallel.py", line 754, in _return_or_raise
    raise self._result
ValueError: operands could not be broadcast together with shapes (402,448) (401,448)
yantosca commented 2 months ago

@msulprizio: I wonder if this is a halfpolar grid thing. I think the nested-grid output doesn't use halfpolar=true (or something like that).

github-actions[bot] commented 3 weeks ago

This issue has been automatically marked as stale because it has not had recent activity. If there are no updates within 7 days it will be closed. You can add the "never stale" tag to prevent the issue from closing this issue.

msulprizio commented 3 weeks ago

This may be addressed by fixes currently in the branch bugfix/add-extent-to-compare-diags. Tagging @yantosca.

yantosca commented 3 weeks ago

Hi @msulprizio. It seems that the modifications in the bugfix/add-extent-to-compare-diags branch work OK for single panel plots but not for zonal mean plots. Using some test data I get:

(test_env) [holy8a24401 tt_run]$ python -m gcpy.examples.diagnostics.compare_diags tt.yml 
Using configuration file tt.yml
... Creating zonal mean plots
---------------------------------------
In six_plot.py
extent: [-180, 180, 9.6875, 59.875]
---------------------------------------
In single_panel.py
plot_vals shape  : (72, 201)
grid lat   shape : (201,)
grid lat_b shape : (202,)
extent           : [-180, 180, 9.6875, 59.875]
---------------------------------------
In six_plot.py
extent: [-180, 180, 9.6875, 59.875]
---------------------------------------
In single_panel.py
plot_vals shape  : (72, 402)
grid lat   shape : (401,)
grid lat_b shape : (402,)
extent           : [-180, 180, 9.6875, 59.875]

The problem is that the plot_vals (which is the array read from the file) has 402 latitudes. But the grid object (which is computed by gcpy.grid.get_grid_extents is only returning 401 latitudes. For pcolormesh you need to supply the edges of the grid points and thus that number has to be one less than the number of the points in the data array in each dimension. This is only a problem on the 0.125 x 0.15625 grid. Still digging further...

Also note the extent value is being passed from the configuration file.

yantosca commented 3 weeks ago

@msulprizio @lizziel: Upon further investigation I've traced this issue to these lines in create_regridders in gcpy/regrid.py:

    # ==================================================================
    # Make grids (ref, dev, and comparison)
    # ==================================================================
    [refgrid, _] = call_make_grid(
        refres, refgridtype, ref_extent, cmp_extent, sg_ref_params)

    [devgrid, _] = call_make_grid(
        devres, devgridtype, dev_extent, cmp_extent, sg_dev_params)

    [cmpgrid, _] = call_make_grid(
        cmpres, cmpgridtype, cmp_extent, cmp_extent, sg_cmp_params)

I printed out some outputs:

---------------------------------
In create_regridders
ref_extent : [-130.0, -60.0, 9.75, 59.75]  # 0.25 x 0.3125 nested
cmp_extent : [-130.0, -60.0, 9.75, 59.75]  # 0.5 x 0.5 comparison grid
dev_extent : [-130.0, -60.0, 9.75, 59.875] # 0.125 x 0.15625 nested
------------------------------------
In make_grid_ll
dlat          : 0.125
lat_b min/max : 9.6875 59.9375
lat   min/max : 9.75   59.875

In make_grid_ll, after trim bounds
lat_b min/max : 9.6875 59.8125
lat   min/max : 9.75   59.75

So before the end of routine make_grid_LL (which creates the grid metadata for lat-lon grids), the maximum longitude at 0.125 is truncated. It's because the max latitude of the 0.125 (the last value of dev_extent) is greater than the max latitude of the comparison grid (the last value of cmp_extent). This causes the array of latitude edges (lat_b) to be one less than what it should be. I will see if I can implement a fix.