UXARRAY / uxarray

Xarray extension for unstructured climate and global weather data analysis and visualization.
https://uxarray.readthedocs.io/
Apache License 2.0
151 stars 31 forks source link

Non-Conservative Zonal Mean #785

Open philipc2 opened 4 months ago

philipc2 commented 4 months ago

Closes #93

Overview

grid_path = "/path/to/grid.nc" data_path = "/path/to/data.nc"

uxds = ux.open_dataset(grid_path, data_path)

compute the zonal average for the default start_lat_deg=-90, end_lat_deg=90, step_deg=5

zonal_result = uxds['data_var'].zonal_mean()

Compute the zonal mean for a single latitude:

zonal_result = uxds["t2m"].zonal_mean(30.0)



## PR Checklist
<!-- Please mark any checkboxes that do not apply to this PR as [N/A]. If an entire section doesn't
apply to this PR, comment it out or delete it. -->

**General**
- [x] An issue is linked created and linked
- [x] Add appropriate labels
- [x] Filled out Overview and Expected Usage (if applicable) sections

**Testing**
- [x] Adequate tests are created if there is new functionality
- [x] Tests cover all possible logical paths in your function
- [x] Tests are not too basic (such as simply calling a function and nothing else)

**Documentation**
- [x] Docstrings have been added to all new functions
- [x] Docstrings have updated with any function changes
- [x] Internal functions have a preceding underscore (`_`) and have been added to `docs/internal_api/index.rst`
- [x] User functions have been added to `docs/user_api/index.rst`

**Examples**
- [x] Any new notebook examples added to `docs/examples/` folder
- [x] Clear the output of all cells before committing
- [x] New notebook files added to `docs/examples.rst` toctree
- [ ] New notebook files added to new entry in `docs/gallery.yml` with appropriate thumbnail photo in `docs/_static/thumbnails/`
philipc2 commented 4 months ago

Hi @hongyuchen1030

I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here.

One question though: will this implementation be the conservative or non-conservative one?

hongyuchen1030 commented 4 months ago

Hi @hongyuchen1030

I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here.

One question though: will this implementation be the conservative or non-conservative one?

Thank you very much for your great help! I plan to implement the non-conservative one.

And can you also fill up the expected usage for me please? so that I have a better idea to start? Thanks

philipc2 commented 4 months ago

Hi @hongyuchen1030 I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here. One question though: will this implementation be the conservative or non-conservative one?

Thank you very much for your great help! I plan to implement the non-conservative one.

And can you also fill up the expected usage for me please? so that I have a better idea to start? Thanks

Should be good now.

I wanted to ask and see how you think this type of functionality will work on data variables mapped to different elements. For example, we want to support data mapped to:

Does your algorithm work on all these of these?

hongyuchen1030 commented 4 months ago

Hi @hongyuchen1030 I've set up the initial boilerplate for your implementation. If you'd like to discuss anything about the implementation before getting started, we can chat about it here. One question though: will this implementation be the conservative or non-conservative one?

Thank you very much for your great help! I plan to implement the non-conservative one. And can you also fill up the expected usage for me please? so that I have a better idea to start? Thanks

Should be good now.

I wanted to ask and see how you think this type of functionality will work on data variables mapped to different elements. For example, we want to support data mapped to:

  • Nodes
  • Edges
  • Faces

Does your algorithm work on all these of these?

A non-conservative zonal average is an average value on a constant latitude. So I will say it might be mapped to the faces on that latitude.

In your example function call, where is the place that users input latitude for query? And how do users know it's a non-conservative zonal average?

Such query should looks like res = grid.nc_zonal_mean(60, "Temperature")#60 degree for temperature mean

philipc2 commented 4 months ago

I may have worded the first part of my question incorrectly. I was asking whether your zonal average algorithm can compute the zonal average of data variables that either node, edge, or face centered. Typically the data variables are mapped to the faces, but less commonly data can be stored on each node or edge. For this application thought, I think it should only apply to edge or face centered data, since computing the zonal average of data on each node doesn't make much spatial sense.

For the second part, I can update it to reflect that (I put a very bare-bone outline)

Below is a zonal-average figure, with the righthand plot being the zonal average across the latitudes.

image

I think our UxDataArray.zonal_average()should compute the zonal average across a discrete range of lattitudes, something like -90 to 90 with a step size of 5 for example. This would generate a plot like the one above.

The example you gave would be computing the zonal average at a specific latitude. This should be a helper function under uxarray.core.zonal and can be called:

# helper function
_zonal_average_at_lattitude(data: np.ndarray, lat: float, conservative: bool = False)

This would contain the bulk of the artihmatic.

This would allow the one implemented at for the UxDataArray to call the helper depending on the user's parameters

# default, returns the zonal average for latitudes in the range [-90, 90] with a step size of 5
uxds['t2m'].zonal_average()

# specify range and step size
uxds['t2m'].zonal_average(lat = (-45, 45, 5))

# compute the zonal average for a single longitude
uxds['t2m'].zonal_average(lat = 45)

# parameter for conservative
uxds['t2m'].zonal_average(conservative=False)

Just a thought, we may want some Grid helper functions for getting the indices of the edges or faces that intersect a latitude line.

philipc2 commented 4 months ago

Here's a reference from NCL's zonal average page (which is where I got the figure above from) https://www.ncl.ucar.edu/Applications/zonal.shtml

hongyuchen1030 commented 4 months ago

I may have worded the first part of my question incorrectly. I was asking whether your zonal average algorithm can compute the zonal average of data variables that either node, edge, or face centered. Typically the data variables are mapped to the faces, but less commonly data can be stored on each node or edge. For this application thought, I think it should only apply to edge or face centered data, since computing the zonal average of data on each node doesn't make much spatial sense.

For the second part, I can update it to reflect that (I put a very bare-bone outline)

Below is a zonal-average figure, with the righthand plot being the zonal average across the latitudes.

image

I think our UxDataArray.zonal_average()should compute the zonal average across a discrete range of lattitudes, something like -90 to 90 with a step size of 5 for example. This would generate a plot like the one above.

The example you gave would be computing the zonal average at a specific latitude. This should be a helper function under uxarray.core.zonal and can be called:

# helper function
_zonal_average_at_lattitude(lat: float, conservative: bool = False)

This would contain the bulk of the artihmatic.

This would allow the one implemented at for the UxDataArray to call the helper depending on the user's parameters

# default, returns the zonal average for latitudes in the range [-90, 90] with a step size of 5
uxds['t2m'].zonal_average()

# specify range and step size
uxds['t2m'].zonal_average(lat = (-45, 45, 5))

# compute the zonal average for a single longitude
uxds['t2m'].zonal_average(lat = 45)

# parameter for conservative
uxds['t2m'].zonal_average(conservative=False)

Just a thought, we may want some Grid helper functions for getting the indices of the edges or faces that intersect a latitude line.

Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.

So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.

philipc2 commented 4 months ago

Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.

Got it! Face-centered data is by far the most common scenario anyone will run into, so no worries.

So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.

Sounds good, I can handle the UxDataArray implementation in this PR once you finish up the helper function.

I'll get the boilerplate updated with what we just discussed.

hongyuchen1030 commented 4 months ago

Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.

Got it! Face-centered data is by far the most common scenario anyone will run into, so no worries.

So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.

Sounds good, I can handle the UxDataArray implementation in this PR once you finish up the helper function.

I'll get the boilerplate updated with what we just discussed.

Great! Thank you very much for your help!

philipc2 commented 4 months ago

Thank you for your information. I was told that this function will be averaging the data for each face. So I only implemented for the scenario where: a data is attached to each face.

Got it! Face-centered data is by far the most common scenario anyone will run into, so no worries.

So I expected this current branch to be for a single zonal_average only, the helper function you mentioned. Once I finished this helper function, you can feel free to implement the discrete values function by calling my function.

Sounds good, I can handle the UxDataArray implementation in this PR once you finish up the helper function. I'll get the boilerplate updated with what we just discussed.

Great! Thank you very much for your help!

Always happy to help!

review-notebook-app[bot] commented 4 months ago

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

philipc2 commented 4 months ago

@hongyuchen1030

As part of this PR, I'll be putting together a user-guide section too.

amberchen122 commented 4 months ago

Hi @philipc2,

I have a question regarding the expected usage of the zonal_mean in the PR description:

zonal_mean is a member function of UxDataArray class,

However, zonal_mean is accessed as UxDataset.zonal_mean in the PR description.

I checked api.py and noticed that it does not have a function similar to open_dataset that returns the UxDataArray class.

Could you please clarify if UxDataArray and UxDataset are both expected to be returned by open_dataset function in the future?

Thank you! --Amber

philipc2 commented 4 months ago

Hi @philipc2,

I have a question regarding the expected usage of the zonal_mean in the PR description:

zonal_mean is a member function of UxDataArray class,

However, zonal_mean is accessed as UxDataset.zonal_mean in the PR description.

I checked api.py and noticed that it does not have a function similar to open_dataset that returns the UxDataArray class.

Could you please clarify if UxDataArray and UxDataset are both expected to be returned by open_dataset function in the future?

Thank you! --Amber

Hi @amberchen122

uxds is an instance of a UxDataset, however once we access the data variable of interested, in our case uxds['data_var'], we are now working with aUxDataArray`.

You can think of a UxDataset as a collection of UxDataArrays

I'd suggest giving this section in our user guide a read, it should hopefully clarify it!

Let me know if you have any other questions!

amberchen122 commented 4 months ago

Hi team,

Thank you for your help! :) --Amber

philipc2 commented 4 months ago

I attempted to run this on one of our sample meshes, but received an error.

base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
uxds['psi'].zonal_mean()

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

philipc2 commented 4 months ago

Hi team,

  • I have completed the implementation of the zonal-mean function and its helper functions in zonal.py. I am now proceeding with testing and documentation. @philipc2, could you please provide any general guidelines for testing and documentation that I should follow?
  • I have assumed that the zonal-mean function handles only face-centered data. Could you confirm if this is correct? @philipc2 @hongyuchen1030 Link to code
  • When a constLat does not intersect any face, I have returned INT_FILL_VALUE as a placeholder value. If there is a preferred placeholder value, please let me know. Link to code
  • Additionally, regarding the zonal-mean API, should we allow users to specify the latitude range for which they want to compute the zonal mean?

Thank you for your help! :) --Amber

Thank you so much for putting this together and helping out Amber!

  1. You can refer to our contributors guide some details, however some of it might be a little outdated.
  2. Yes, having it work for face-centered data only right now is okay
  3. If there is no intersections found for a single lattitude zonal average, we should have it raise an exception, but if the user is performing the zonal average over a range, any lattitude that doesnt have any intersetions should have its value set to np.nan. For example, if we perform the zonal average of lattitudes of [-10, 0, 10] and 0 has no intersections, the result should look something like [0.13, np.nan, 0.33]
  4. See my comment above
philipc2 commented 4 months ago

For documentation, I have already set up a notebook for a user guide section that you can fill out.

uxarray/docs/user-guide/zonal-mean.ipynb

You should also include any functions in both our User and Internal API under uxarray/docs/internal_api.index.rst and uxarray/docs/user_api.index.rst

philipc2 commented 4 months ago

You can use one of my work in progress sections as a reference if you'd like.

https://uxarray--711.org.readthedocs.build/en/711/user-guide/topological-aggregations.html

philipc2 commented 4 months ago

I attempted to run this on one of our sample meshes, but received an error.

base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
uxds['psi'].zonal_mean()

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

There seems to be an error in the bounds computation.

Here is a single face from that mesh:

image

These are the bounds generated by uxgrid.bounds

image

Compare this to what is generated with GeoPandas:

image

Any Thoughts? It looks to be offset by 360, with everything else being correct.

Example to Reproduce

import uxarray as ux
import numpy as np
import holoviews as hv

base_path = "../uxarray/test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)

gdf = uxds.uxgrid.to_geodataframe()
ggdf = gdf.to_geopandas()

first_face_bound_ux = uxds.uxgrid.bounds.values[0]
np.rad2deg(first_face_bound_ux)

first_face_bound_gp = ggdf.bounds[0:1]

hv.Polygons(ggdf[0:1])
hongyuchen1030 commented 4 months ago

I attempted to run this on one of our sample meshes, but received an error.

base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
uxds['psi'].zonal_mean()

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

There seems to be an error in the bounds computation.

Here is a single face from that mesh: image

These are the bounds generated by uxgrid.bounds image

Compare this to what is generated with GeoPandas:

image

Any Thoughts? It looks to be offset by 360, with everything else being correct.

Example to Reproduce

import uxarray as ux
import numpy as np
import holoviews as hv

base_path = "../uxarray/test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)

gdf = uxds.uxgrid.to_geodataframe()
ggdf = gdf.to_geopandas()

first_face_bound_ux = uxds.uxgrid.bounds.values[0]
np.rad2deg(first_face_bound_ux)

first_face_bound_gp = ggdf.bounds[0:1]

hv.Polygons(ggdf[0:1])

Are these latlon faces? Latlon faces means all edges are either latitude line or longitude lines.

In the bound function, if you didn't pass grid._populate_bounds(is_latlon = True) it will treat every edge as a great circle arc. And great circle arcs will arc up: The max and min latitude is not at the endpoint

There are two test helpers you can utilize to see if that's the problem The following function is a little helper function: it uses an iterative method to find the maximum latitude as the ground true. https://github.com/UXARRAY/uxarray/blob/386e92df3e3a3d436d325c2b7a6ad5f3560c8662/test/test_geometry.py#L166

But to me, the only thing that is off is latitude max:-32.48416(our) and their results:-32.484165. Which seems a lot like a precision issue here.

hongyuchen1030 commented 4 months ago

There used to be several test cases in which I ran your mentioned mesh files, and it produced the same bounds as my iterative method(guaranteed to be correct). However, it seems they're removed now for some reason.

And for the longitude range. The inconsistency in the longitude expression might cause issues, as what happened in the previous coordinates conversion vectorization PR.

amberchen122 commented 3 months ago

Updated Documentation and Tests:

I am also encountering a ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results error in the tests. Due to this issue, I am currently bypassing the _get_zonal_faces_weight_at_constLat function, as it seems to be the source of the problem.

Should I go ahead and try to figure out why build_latlon_box is generating incorrect bounds?

Besides that, all the internal APIs seem to be working correctly; I will implement an integrating test after the bound error gets resolved.

hongyuchen1030 commented 3 months ago

Updated Documentation and Tests:

  • User and internal API documentation

    • uxarray/docs/internal_api.index.rst
    • uxarray/docs/user_api.index.rst
  • User guide

    • uxarray/docs/user-guide/zonal-mean.ipynb
  • Enable user-defined latitude range
  • Unit test coverage for the internal APIs, where the _get_zonal_faces_weight_at_constLatfunction is simulated in the test code.

I am also encountering a ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results error in the tests. Due to this issue, I am currently bypassing the _get_zonal_faces_weight_at_constLat function, as it seems to be the source of the problem.

Should I go ahead and try to figure out why build_latlon_box is generating incorrect bounds?

Besides that, all the internal APIs seem to be working correctly; I will implement an integrating test after the bound error gets resolved.

Thank you very much for your work! If you get the ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results after you update the edge_nodes_cart helper function, then the problem might be in the new edge_nodes_cart implementation.

amberchen122 commented 3 months ago
base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
uxds['psi'].zonal_mean()

I was trying to reproduce this, but I could not load the data files. Do you know why this is happening @philipc2 ? image

philipc2 commented 3 months ago
base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"
uxds = ux.open_dataset(grid_path, data_path)
uxds['psi'].zonal_mean()

I was trying to reproduce this, but I could not load the data files. Do you know why this is happening @philipc2 ? image

It's most likely due to the path being invalid. I'd double check the base path.

amberchen122 commented 2 months ago

@philipc2 test/test_api.py::TestAPI::test_open_mf_dataset is failing after I added test/meshfiles/ugrid/outCSne30/outCSne30_test2.nc

I think the reason is because https://github.com/UXARRAY/uxarray/blob/00a4988513ceb816baa3890449726ac4681db653/test/test_api.py#L27C5-L28C68 this hardcoded file path incorrectly matched outCSne30_test2.nc.

Can you please look into it? I opened an issue #860

erogluorhan commented 2 months ago

@philipc2 test/test_api.py::TestAPI::test_open_mf_dataset is failing after I added test/meshfiles/ugrid/outCSne30/outCSne30_test2.nc

I think the reason is because https://github.com/UXARRAY/uxarray/blob/00a4988513ceb816baa3890449726ac4681db653/test/test_api.py#L27C5-L28C68 this hardcoded file path incorrectly matched outCSne30_test2.nc.

Can you please look into it? I opened an issue #860

Please see our comments in that issue

philipc2 commented 2 months ago

In the process of review this now, I'll also add a simple ASV benchmark that we can run through this PR.

philipc2 commented 2 months ago

Here's an example I was able to generate

image

import matplotlib.pyplot as plt
from matplotlib import gridspec
import uxarray as ux

base_path = "../../test/meshfiles/ugrid/outCSne30/"
grid_path = base_path + "outCSne30.ug"
data_path = base_path + "outCSne30_vortex.nc"

uxds = ux.open_dataset(grid_path, data_path)

zonal_avg_psi = uxds['psi'].zonal_mean()

# create a figure
fig = plt.figure()

# to change size of subplot's
# set height of each subplot as 8
fig.set_figheight(5)

# set width of each subplot as 8
fig.set_figwidth(15)

# create grid for different subplots
spec = gridspec.GridSpec(ncols=2, nrows=1,
                         width_ratios=[5, 1], wspace=0.1)

# Global Plot
ax0 = fig.add_subplot(spec[0])
ax0.set_xlim((-180, 180))
ax0.set_ylim((-90, 90))
pc = uxds["psi"].to_polycollection(periodic_elements='exclude')
pc.set_cmap("plasma")
ax0.add_collection(pc)
cbar = fig.colorbar(pc, ax=ax0, orientation='vertical', location='left', shrink=0.8)
cbar.set_label('Colorbar Label')
ax0.set_title("PSI")

# Zonal Plot
ax1 = fig.add_subplot(spec[1])
ax1.set_ylim((-90, 90))
ax1.set_xlim((-1.5, 1.5))
ax1.grid()
ax1.set_title("PSI Zonal Mean")

ax1.plot(zonal_avg_psi.values, zonal_avg_psi.latitude_deg.values, lw=2)
philipc2 commented 2 months ago

The zonal average returns no intersections for this grid \uxarray\test\meshfiles\mpas\QU\oQU480.231010.nc, which is our MPAS Ocean mesh

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

hongyuchen1030 commented 2 months ago

The zonal average returns no intersections for this grid \uxarray\test\meshfiles\mpas\QU\oQU480.231010.nc, which is our MPAS Ocean mesh

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

what is the query latitude range for that? Can you provide me with your function call in details? Thanks

philipc2 commented 2 months ago

The zonal average returns no intersections for this grid \uxarray\test\meshfiles\mpas\QU\oQU480.231010.nc, which is our MPAS Ocean mesh

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

what is the query latitude range for that? Can you provide me with your function call in details? Thanks

I ran the default parameters for it.

philipc2 commented 2 months ago

The implementation also appears to fail for multi-dimensional data variables. For example,

image

Running zonal_average() produces the following error.

C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\grid.py:879: UserWarning: Constructing of `Grid.bounds` has not been optimized, which may lead to a long execution time.
  warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:61: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\arcs.py:336: RuntimeWarning: invalid value encountered in scalar divide
  d_a_max = (n1[2] * dot_n1_n2 - n2[2]) / denom
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:191: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\utils.py:122: UserWarning: The Jacobian matrix is singular.
  warnings.warn("The Jacobian matrix is singular.")
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:220: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:243: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 1
----> 1 v1_zonal_mean = v1.zonal_mean()

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\dataarray.py:1072, in UxDataArray.zonal_mean(self, lat_deg)
   1070     start_lat_deg, end_lat_deg, step_size_deg = lat_deg
   1071     # Call the function and get both latitudes and zonal means
-> 1072     latitudes, _zonal_avg_res = _non_conservative_zonal_mean_constant_latitudes(
   1073         face_edges_cart,
   1074         face_bounds,
   1075         data,
   1076         start_lat_deg,
   1077         end_lat_deg,
   1078         step_size_deg,
   1079         is_latlonface=is_latlonface,
   1080     )
   1081 # If lat is a single value, compute the zonal average for that latitude
   1082 else:
   1083     _zonal_avg_res = _non_conservative_zonal_mean_constant_one_latitude(
   1084         face_edges_cart, face_bounds, data, lat_deg, is_latlonface=is_latlonface
   1085     )

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\zonal.py:202, in _non_conservative_zonal_mean_constant_latitudes(face_edges_cart, face_bounds, face_data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface)
    199 # Reshape the zonal mean array to have the same leading dimensions as the input data
    200 # and an additional dimension for the latitudes
    201 expected_shape = face_data.shape[:-1] + (len(latitudes),)
--> 202 zonal_means = zonal_means.reshape(expected_shape)
    204 return latitudes, zonal_means

ValueError: cannot reshape array of size 37 into shape (1,20,37)
hongyuchen1030 commented 2 months ago

The implementation also appears to fail for multi-dimensional data variables. For example,

image

Running zonal_average() produces the following error.

C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\grid.py:879: UserWarning: Constructing of `Grid.bounds` has not been optimized, which may lead to a long execution time.
  warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:61: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\arcs.py:336: RuntimeWarning: invalid value encountered in scalar divide
  d_a_max = (n1[2] * dot_n1_n2 - n2[2]) / denom
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:191: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\utils.py:122: UserWarning: The Jacobian matrix is singular.
  warnings.warn("The Jacobian matrix is singular.")
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:220: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:243: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 1
----> 1 v1_zonal_mean = v1.zonal_mean()

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\dataarray.py:1072, in UxDataArray.zonal_mean(self, lat_deg)
   1070     start_lat_deg, end_lat_deg, step_size_deg = lat_deg
   1071     # Call the function and get both latitudes and zonal means
-> 1072     latitudes, _zonal_avg_res = _non_conservative_zonal_mean_constant_latitudes(
   1073         face_edges_cart,
   1074         face_bounds,
   1075         data,
   1076         start_lat_deg,
   1077         end_lat_deg,
   1078         step_size_deg,
   1079         is_latlonface=is_latlonface,
   1080     )
   1081 # If lat is a single value, compute the zonal average for that latitude
   1082 else:
   1083     _zonal_avg_res = _non_conservative_zonal_mean_constant_one_latitude(
   1084         face_edges_cart, face_bounds, data, lat_deg, is_latlonface=is_latlonface
   1085     )

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\zonal.py:202, in _non_conservative_zonal_mean_constant_latitudes(face_edges_cart, face_bounds, face_data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface)
    199 # Reshape the zonal mean array to have the same leading dimensions as the input data
    200 # and an additional dimension for the latitudes
    201 expected_shape = face_data.shape[:-1] + (len(latitudes),)
--> 202 zonal_means = zonal_means.reshape(expected_shape)
    204 return latitudes, zonal_means

ValueError: cannot reshape array of size 37 into shape (1,20,37)

This look like the array dimension issue to me. What does the multi-dimensional data variables mean here?The zonal average van only run query for one variable for a grid. For example, it can only obtain the average Temperature or average PSI each time, instead for averaging Temperature and PSI

hongyuchen1030 commented 2 months ago

The zonal average returns no intersections for this grid \uxarray\test\meshfiles\mpas\QU\oQU480.231010.nc, which is our MPAS Ocean mesh

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

what is the query latitude range for that? Can you provide me with your function call in details? Thanks

I ran the default parameters for it.

So this is the grid that has a hole inside it, rite?

philipc2 commented 2 months ago

The implementation also appears to fail for multi-dimensional data variables. For example, image Running zonal_average() produces the following error.

C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\grid.py:879: UserWarning: Constructing of `Grid.bounds` has not been optimized, which may lead to a long execution time.
  warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:61: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\arcs.py:336: RuntimeWarning: invalid value encountered in scalar divide
  d_a_max = (n1[2] * dot_n1_n2 - n2[2]) / denom
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:191: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\utils.py:122: UserWarning: The Jacobian matrix is singular.
  warnings.warn("The Jacobian matrix is singular.")
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:220: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:243: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 1
----> 1 v1_zonal_mean = v1.zonal_mean()

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\dataarray.py:1072, in UxDataArray.zonal_mean(self, lat_deg)
   1070     start_lat_deg, end_lat_deg, step_size_deg = lat_deg
   1071     # Call the function and get both latitudes and zonal means
-> 1072     latitudes, _zonal_avg_res = _non_conservative_zonal_mean_constant_latitudes(
   1073         face_edges_cart,
   1074         face_bounds,
   1075         data,
   1076         start_lat_deg,
   1077         end_lat_deg,
   1078         step_size_deg,
   1079         is_latlonface=is_latlonface,
   1080     )
   1081 # If lat is a single value, compute the zonal average for that latitude
   1082 else:
   1083     _zonal_avg_res = _non_conservative_zonal_mean_constant_one_latitude(
   1084         face_edges_cart, face_bounds, data, lat_deg, is_latlonface=is_latlonface
   1085     )

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\zonal.py:202, in _non_conservative_zonal_mean_constant_latitudes(face_edges_cart, face_bounds, face_data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface)
    199 # Reshape the zonal mean array to have the same leading dimensions as the input data
    200 # and an additional dimension for the latitudes
    201 expected_shape = face_data.shape[:-1] + (len(latitudes),)
--> 202 zonal_means = zonal_means.reshape(expected_shape)
    204 return latitudes, zonal_means

ValueError: cannot reshape array of size 37 into shape (1,20,37)

This look like the array dimension issue to me. What does the multi-dimensional data variables mean here?The zonal average van only run query for one variable for a grid. For example, it can only obtain the average Temperature or average PSI each time, instead for averaging Temperature and PSI

Let's say we have a data variable with time, lev and n_face dims. Running zonal_mean() should return the result as time, lev, n_lat, where n_lat is the number of latitude lines we are using for the computation.

Right now it looks like zonal_average only operates on 1-D slices of data.

philipc2 commented 2 months ago

Another example plot I was able to generate:

image

philipc2 commented 2 months ago

The zonal average returns no intersections for this grid \uxarray\test\meshfiles\mpas\QU\oQU480.231010.nc, which is our MPAS Ocean mesh

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

what is the query latitude range for that? Can you provide me with your function call in details? Thanks

I ran the default parameters for it.

So this is the grid that has a hole inside it, rite?

Holes as in that there is no grid cells over continents, yes, but all the geometries are valid (i.e. have no holes within them)

hongyuchen1030 commented 2 months ago

The implementation also appears to fail for multi-dimensional data variables. For example, image Running zonal_average() produces the following error.

C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\grid.py:879: UserWarning: Constructing of `Grid.bounds` has not been optimized, which may lead to a long execution time.
  warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:61: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\arcs.py:336: RuntimeWarning: invalid value encountered in scalar divide
  d_a_max = (n1[2] * dot_n1_n2 - n2[2]) / denom
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:191: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\utils.py:122: UserWarning: The Jacobian matrix is singular.
  warnings.warn("The Jacobian matrix is singular.")
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:220: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:243: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 1
----> 1 v1_zonal_mean = v1.zonal_mean()

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\dataarray.py:1072, in UxDataArray.zonal_mean(self, lat_deg)
   1070     start_lat_deg, end_lat_deg, step_size_deg = lat_deg
   1071     # Call the function and get both latitudes and zonal means
-> 1072     latitudes, _zonal_avg_res = _non_conservative_zonal_mean_constant_latitudes(
   1073         face_edges_cart,
   1074         face_bounds,
   1075         data,
   1076         start_lat_deg,
   1077         end_lat_deg,
   1078         step_size_deg,
   1079         is_latlonface=is_latlonface,
   1080     )
   1081 # If lat is a single value, compute the zonal average for that latitude
   1082 else:
   1083     _zonal_avg_res = _non_conservative_zonal_mean_constant_one_latitude(
   1084         face_edges_cart, face_bounds, data, lat_deg, is_latlonface=is_latlonface
   1085     )

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\zonal.py:202, in _non_conservative_zonal_mean_constant_latitudes(face_edges_cart, face_bounds, face_data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface)
    199 # Reshape the zonal mean array to have the same leading dimensions as the input data
    200 # and an additional dimension for the latitudes
    201 expected_shape = face_data.shape[:-1] + (len(latitudes),)
--> 202 zonal_means = zonal_means.reshape(expected_shape)
    204 return latitudes, zonal_means

ValueError: cannot reshape array of size 37 into shape (1,20,37)

This look like the array dimension issue to me. What does the multi-dimensional data variables mean here?The zonal average van only run query for one variable for a grid. For example, it can only obtain the average Temperature or average PSI each time, instead for averaging Temperature and PSI

Let's say we have a data variable with time, lev and n_face dims. Running zonal_mean() should return the result as time, lev, n_lat, where n_lat is the number of latitude lines we are using for the computation.

Right now it looks like zonal_average only operates on 1-D slices of data.

The current implementation of zonal_mean() is just the wrap-around of the core function _non_conservative_zonal_mean_constant_latitudes( face_edges_cart, face_bounds, data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface=is_latlonface, )

This is a function used to treat 1-D array. Can you help with the zonal_mean() API implementation for the multi-dimensional data you were talking about? Since probably you have a better idea about what the API should look like. Thanks

philipc2 commented 2 months ago

This grid also produces some interesting results: every value is nan

import geocat.datafiles as geodf
import uxarray as ux 
datafiles = (
    geodf.get(
        "netcdf_files/MPAS/FalkoJudt/dyamond_1/30km/diag.2016-08-20_00.00.00_subset.nc"
    ),
    geodf.get("netcdf_files/MPAS/FalkoJudt/dyamond_1/30km/x1.655362.grid_subset.nc"),
)
uxds = ux.open_dataset(datafiles[1], datafiles[0])
uxds

relhum_zonal_avg = uxds['relhum_200hPa'][0].zonal_mean()

image

This is the global plot for reference

image

philipc2 commented 2 months ago

The implementation also appears to fail for multi-dimensional data variables. For example, image Running zonal_average() produces the following error.

C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\grid.py:879: UserWarning: Constructing of `Grid.bounds` has not been optimized, which may lead to a long execution time.
  warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:61: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\arcs.py:336: RuntimeWarning: invalid value encountered in scalar divide
  d_a_max = (n1[2] * dot_n1_n2 - n2[2]) / denom
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:191: UserWarning: The C/C++ implementation of FMA in MS Windows is reportedly broken. Use with care. (bug report: https://bugs.python.org/msg312480)The single rounding cannot be guaranteed, hence the relative error bound of 3u cannot be guaranteed.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\utils.py:122: UserWarning: The Jacobian matrix is singular.
  warnings.warn("The Jacobian matrix is singular.")
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:220: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
C:\Users\chmie\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\grid\intersections.py:243: UserWarning: The intersection point cannot be converged using the Newton-Raphson method. The initial guess intersection point is used instead, procced with caution.
  warnings.warn(
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[28], line 1
----> 1 v1_zonal_mean = v1.zonal_mean()

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\dataarray.py:1072, in UxDataArray.zonal_mean(self, lat_deg)
   1070     start_lat_deg, end_lat_deg, step_size_deg = lat_deg
   1071     # Call the function and get both latitudes and zonal means
-> 1072     latitudes, _zonal_avg_res = _non_conservative_zonal_mean_constant_latitudes(
   1073         face_edges_cart,
   1074         face_bounds,
   1075         data,
   1076         start_lat_deg,
   1077         end_lat_deg,
   1078         step_size_deg,
   1079         is_latlonface=is_latlonface,
   1080     )
   1081 # If lat is a single value, compute the zonal average for that latitude
   1082 else:
   1083     _zonal_avg_res = _non_conservative_zonal_mean_constant_one_latitude(
   1084         face_edges_cart, face_bounds, data, lat_deg, is_latlonface=is_latlonface
   1085     )

File ~\anaconda3\envs\ncar-uxarray\Lib\site-packages\uxarray\core\zonal.py:202, in _non_conservative_zonal_mean_constant_latitudes(face_edges_cart, face_bounds, face_data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface)
    199 # Reshape the zonal mean array to have the same leading dimensions as the input data
    200 # and an additional dimension for the latitudes
    201 expected_shape = face_data.shape[:-1] + (len(latitudes),)
--> 202 zonal_means = zonal_means.reshape(expected_shape)
    204 return latitudes, zonal_means

ValueError: cannot reshape array of size 37 into shape (1,20,37)

This look like the array dimension issue to me. What does the multi-dimensional data variables mean here?The zonal average van only run query for one variable for a grid. For example, it can only obtain the average Temperature or average PSI each time, instead for averaging Temperature and PSI

Let's say we have a data variable with time, lev and n_face dims. Running zonal_mean() should return the result as time, lev, n_lat, where n_lat is the number of latitude lines we are using for the computation. Right now it looks like zonal_average only operates on 1-D slices of data.

The current implementation of zonal_mean() is just the wrap-around of the core function _non_conservative_zonal_mean_constant_latitudes( face_edges_cart, face_bounds, data, start_lat_deg, end_lat_deg, step_size_deg, is_latlonface=is_latlonface, )

This is a function used to treat 1-D array. Can you help with the zonal_mean() API implementation for the multi-dimensional data you were talking about? Since probably you have a better idea about what the API should look like. Thanks

Sure! I can handle getting it to work for n-dimensions.

hongyuchen1030 commented 2 months ago

The zonal average returns no intersections for this grid \uxarray\test\meshfiles\mpas\QU\oQU480.231010.nc, which is our MPAS Ocean mesh

ValueError: No intersections are found for the face, please make sure the build_latlon_box generates the correct results

what is the query latitude range for that? Can you provide me with your function call in details? Thanks

I ran the default parameters for it.

So this is the grid that has a hole inside it, rite?

Holes as in that there is no grid cells over continents, yes, but all the geometries are valid (i.e. have no holes within them)

Thanks for the information about the failing files cases. It looks like the pre-treatment process wasn't performed correctly so far. Me and @amberchen122 will look into that this week

hongyuchen1030 commented 2 months ago

\meshfiles\mpas\QU\oQU480.231010.nc

Hi @philipc2, I wonder how you call the zonal mean for the MPAS data file?

So for the zonal mean, it requires one grid file that provides the connectivity, and a data file provides the data for each grid. And I assume oQU480.231010.nc is the data file? Please provide me with your function call so I can understand the situation better, thanks

hongyuchen1030 commented 2 months ago
import geocat.datafiles as geodf

And also please send me the files you use in this test case so I can look into it. Thanks

hongyuchen1030 commented 2 months ago

Hi @philipc2,

I was examining the MPAS data file and noticed that the node_x, node_y, and node_z variables are all non-normalized. This non-normalization conflicts with our package's definition of a "unit sphere," which is essential for our implementation.

For our structure grid, the node_x, node_y, and node_z variables must be normalized. This uniformity is the foundation of our implementation. I understand that MPAS might want to retain the original data, but it would be beneficial to store it elsewhere.

One solution could be to create additional variables named normalized_node_x, normalized_node_y, and normalized_node_z.

philipc2 commented 2 months ago

Hi @philipc2,

I was examining the MPAS data file and noticed that the node_x, node_y, and node_z variables are all non-normalized. This non-normalization conflicts with our package's definition of a "unit sphere," which is essential for our implementation.

For our structure grid, the node_x, node_y, and node_z variables must be normalized. This uniformity is the foundation of our implementation. I understand that MPAS might want to retain the original data, but it would be beneficial to store it elsewhere.

One solution could be to create additional variables named normalized_node_x, normalized_node_y, and normalized_node_z.

This makes a lot of sense, and explains why it's behaving that way.

I wouldn't add new variables to our Grid class, instead having a method that can normalize the coordinates in place.

# normalizes face/node/edge x/y/z in place
uxgrid.normalize_cartesiain_coordinates()

Is there a quick way to check if coordinates are normalized? We can add this check that raises an exception upon calling .zonal_mean()

hongyuchen1030 commented 2 months ago

Hi @philipc2, I was examining the MPAS data file and noticed that the node_x, node_y, and node_z variables are all non-normalized. This non-normalization conflicts with our package's definition of a "unit sphere," which is essential for our implementation. For our structure grid, the node_x, node_y, and node_z variables must be normalized. This uniformity is the foundation of our implementation. I understand that MPAS might want to retain the original data, but it would be beneficial to store it elsewhere. One solution could be to create additional variables named normalized_node_x, normalized_node_y, and normalized_node_z.

This makes a lot of sense, and explains why it's behaving that way.

I wouldn't add new variables to our Grid class, instead having a method that can normalize the coordinates in place.

# normalizes face/node/edge x/y/z in place
uxgrid.normalize_cartesiain_coordinates()

Is there a quick way to check if coordinates are normalized? We can add this check that raises an exception upon calling .zonal_mean()

Hi @philipc2, I was examining the MPAS data file and noticed that the node_x, node_y, and node_z variables are all non-normalized. This non-normalization conflicts with our package's definition of a "unit sphere," which is essential for our implementation. For our structure grid, the node_x, node_y, and node_z variables must be normalized. This uniformity is the foundation of our implementation. I understand that MPAS might want to retain the original data, but it would be beneficial to store it elsewhere. One solution could be to create additional variables named normalized_node_x, normalized_node_y, and normalized_node_z.

This makes a lot of sense, and explains why it's behaving that way.

I wouldn't add new variables to our Grid class, instead having a method that can normalize the coordinates in place.

# normalizes face/node/edge x/y/z in place
uxgrid.normalize_cartesiain_coordinates()

Is there a quick way to check if coordinates are normalized? We can add this check that raises an exception upon calling .zonal_mean()

Thank you for your response. I appreciate your insights. Here are my thoughts:

  1. It's crucial for our project to maintain a consistent definition for the node_x, node_y, and node_z variables. Having these variables sometimes normalized and sometimes not can lead to significant issues and inconsistencies.
  2. Normalizing the coordinates in place will alter the node_x, node_y, and node_z variables in the Grid class, leading to potential inconsistency.
  3. There isn't a robust and quick way to ensure that all coordinates are normalized. The quickest method is to check if any dimension's absolute value exceeds 1, but this isn't entirely reliable.
  4. Ensuring uniform behavior is essential for the robustness of our project. I'll create an issue for this #872 so we can discuss it further.

Thank you very much

hongyuchen1030 commented 2 months ago

Hi @philipc2 @erogluorhan,

I have re-examined the zonal_mean algorithms and confirmed that they successfully process the uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc dataset with use_dual = False and the GEOCAT data.

However, the current CI failure is due to faces 25 and 26 in the MPAS dataset having only one valid node when use_dual is set to True. This is illustrated in the attached image showing the face_nodes_connectivity information for uxgrid = ux.Grid.from_dataset(xrds, use_dual=True).

Could you please provide your thoughts on ensuring data input validity to handle such cases appropriately?

Face Nodes Connectivity

philipc2 commented 2 months ago

Hi @philipc2 @erogluorhan,

I have re-examined the zonal_mean algorithms and confirmed that they successfully process the uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc dataset with use_dual = False and the GEOCAT data.

However, the current CI failure is due to faces 25 and 26 in the MPAS dataset having only one valid node when use_dual is set to True. This is illustrated in the attached image showing the face_nodes_connectivity information for uxgrid = ux.Grid.from_dataset(xrds, use_dual=True).

Could you please provide your thoughts on ensuring data input validity to handle such cases appropriately?

Face Nodes Connectivity

We encountered this in #359 a while back and haven't come up with a good solution for it yet. I would suggest not using the dual grid of that specific file. The primal mesh is okay to test on though.