Generalize vorticity (and related functions) to allow for map projections

Unidata / MetPy

MetPy is a collection of tools in Python for reading, visualizing and performing calculations with weather data.

https://unidata.github.io/MetPy/

BSD 3-Clause "New" or "Revised" License

1.26k stars 415 forks source link

Generalize vorticity (and related functions) to allow for map projections #893

Closed sgdecker closed 1 year ago

sgdecker commented 6 years ago

Some MetPy functions, such as gradient and advection, allow the user to supply arrays of deltas rather than a single value, but vorticity (and its relatives) does not. Vorticity is more complicated since derivatives of the "map factors" are also required, but it should be doable.

Here is the necessary computation according to the GEMPAK documentation (Appendix B2): $\zeta = m_x (\frac{\partial v}{\partial x})-m_y(\frac{\partial u}{\partial y})-v\frac{m_x}{m_y}(\frac{\partial m_y}{\partial x})+u\frac{m_y}{m_x}(\frac{\partial m_x}{\partial y})$

Since the map factors (m_x and m_y) are equivalent to the "nominal" grid spacing (\partial x and \partial y in the above equation) divided by the actual delta, an additional "nominal_delta" argument to vorticity would allow for the map factors to be computed. Since the WRF model provides the map factors in its output, allowing the map factors (and the nominal grid spacing) as optional arguments to vorticity in lieu of deltas would be useful as well.

dopplershift commented 6 years ago

So while the docs for vorticity currently say that dx/dy have to be float (oops, that needs fixing), with the new finite differencing in 0.7, we can now handle vorticity for arbitrarily-spaced points. If you've tried to pass an array in for dx/dy, and it failed, that's a bug.

As far as the mapping/projection issues are concerned, that's still an open issue (covered in #174). So I'm a little unclear--is the map factor there to handle irregularly-spaced points or to handle issues from projection? or both?

sgdecker commented 6 years ago

OK, I see that vorticity does compute something when I provide arrays for dx/dy.

Regarding the map factors, they are fundamentally there to handle issues from projection. One of those issues is that the data array is irregularly spaced, but another issue is that computing vorticity (more generally, the curl of a vector) in a quantitatively correct manner requires the map factors.

The current implementation of vorticity is correct if the underlying data is truly in Cartesian coordinates (even if unequally spaced). I suppose output from a cloud model would fit this description.

However, when the data is not truly Cartesian (e.g., NWP output on some map projection), the third and fourth terms above are necessary for a quantitatively correct calculation. In many (most?) instances, the third and fourth terms will be quite small, certainly not noticeable qualitatively (say, when looking at contours), but one case where the difference is noticeable is when vorticity is computed from data on a lat/lon grid and examined in polar regions (where, due to convergence of the meridians, the dmx/dy term gets large).

For instance, compare the GEMPAK and MetPy-derived plots below (Python code is at https://gist.github.com/sgdecker/496f7ea7edd98b428ac3adab0b5e0842; GEMPAK is at https://gist.github.com/sgdecker/1a0bbf2e18dd0a2f9b4b02c83e07509b): np_vort gfs_vort Look closely and you'll see, for instance, that the 4 contour to the left (south, technically) of the vort max in the Arctic Ocean differs noticeably.

A mathematical way to see this is to note that vorticity in spherical coordinates is given by (with theta as colatitude): $\frac{1}{r \sin \theta}(\frac{\partial}{\partial\theta}(u\sin\theta)-\frac{\partial v}{\partial \psi})$ which expands to: $\frac{1}{r}\frac{\partial u}{\partial \theta}-\frac{1}{r \sin \theta}\frac{\partial v}{\partial \psi}+\frac{u}{r}\cot\theta$ The final term (which corresponds to the dmx/dy term in GEMPAK's general expression) blows up at the pole, but if we are 1 degree away, the cotangent is order 10^2, so using U ~ 10 m/s, and r ~ 10^7 m, the term is on the order of 10^-4 s^(-1).

The divergence and laplacian operators will have the same issue.

dopplershift commented 6 years ago

Thanks for the detailed analysis. It's what I figured, and definitely on our roadmap. Due to the nicely detailed analysis, let's leave this open for addressing the map factors, and we can close out #174.

(Putting the link to the NOAA writeup about map projections here: https://www.arl.noaa.gov/hysplit/cmapf-mapping-routines/)

deeplycloudy commented 6 years ago

Nice resource in the ARL guide. I do want to suggest caution in retaining 3D coordinates as much as possible, as I advocated in the Cartopy PlateCaree saga - while the ARL guide has nice routines for 2D, and that's fine for maps, it can make forward/inverse and mixed-coordinate plotting difficult.

Anyway, proj.4 has a 3D-aware forward / inverse, and a quick search turns up partial derivative support through the proj_factors function - maybe that's useful in ensuring we account for what @sgdecker points out above. This issue on OSGeo also seems related.

sgdecker commented 6 years ago

Since I mentioned divergence and the Laplacian operator, I went ahead and made plots of those as well to show the difference between GEMPAK and MetPy. Here is 500-mb divergence: div500 gfs_div Here is the Laplacian of 500-mb height: lap500 gfs_lap

Perusing the GEMPAK documentation, I found other calculations that also require taking derivatives of map factors to get absolutely correct results:

Horizontal partial derivatives of vectors (but not scalars), which in turn affects:
- Q-vectors
Deformation (total, stretching, and shearing), which in turn affects:
- Frontogenesis (also depends on divergence)
Inertial advection, which in tern affects:
- Rossby number
Divergence-dependent calculations, such as:
- Flux divergence
- Layer-average mass divergence
Vorticity-dependent calculations, such as:
- Absolute vorticity
- Potential vorticity
Poisson equation solver

kgoebber commented 5 years ago

Okay, so I've been looking into this and have the following to report on absolute vorticity...

Here is the difference between GEMPAK and MetPy v0.10; note the very small range of color values GEMPAK_difference_ABSVORT_v0 10

The average difference is 0.0051 /s. If the following term is added + (uwnd / (6375000 * units.meter)) * np.tan(np.deg2rad(lats[:, None])) the difference is reduced by an order of magnitude to essentially zero. GEMPAK_difference_ABSVORT_spherical

The average difference is -0.00042. Note: The single strip of color at the poles is because MetPy's derivative does not assess vorticity at the top and bottom of the array. Excluding those rows does not appreciably affect the mean difference calculations.

The additive factor depends on the u-component of the wind and latitude and it is not a universal or 3D solution to the issue as it will has the potential to be slightly different for some of the calculations (e.g., laplacian). Since the list is "relatively" small, this can be done for the various cases and added, however with our current implementation it would require the addition of potentially two arguments. One argument signaling to use (or not use) the spherical calculation and another that contains the latitude array. This would not be a problem for absolute_vorticity since it already requires the latitudes for the calculation of coriolis_parameter, therefore we would only need to add the spherical designation. I would additionally promote the default use of spherical to beTrue since that would be the much large use case for our current user base.

sgdecker commented 5 years ago

I haven't tried implementing this yet, but I think the general solution would be to accept the data CRS object as an optional argument, rather than more specific things such as latitude. If that argument is not provided, assume the grid is Cartesian, and set the map factors to one. Otherwise, use the CRS object to obtain the map factors via proj4 (as discussed by @deeplycloudy). In either case, use the formula in my original comment as the basis for the computation. The part that is unclear to me without actually trying this yet is how difficult obtaining the map factors from the CRS is.

jthielen commented 5 years ago

This question is more related to the information needed for projection-aware deriviative-based calculations, rather than the implementation, but I still wanted to ask, since it may end up being useful for the implementation.

I've recently been planning out a collection of "dataset helpers" to flesh out any missing CF metadata in xarray Datasets, so that all of the MetPy-xarray integration functionality (projections, coordinates, units, and hopefully soon variable identification and the field solver) works as smoothly as possible. WRF output has been the motivating example (see https://github.com/Unidata/MetPy/issues/1004).

With that in mind, what would be the best set of information to standardize having in a DataArray and/or Dataset to ensure that these projection-aware calculations can be carried out? Is it sufficient to have

projection x and y coordinates with a crs?
latitude and longitude dimension coordinates (if plate carrée/equirectangular) or auxiliary coordinates (if not)?

or am I missing something else?

The goal, I would think, would be to be able to calculate something like vorticity with a call such as

vorticity = mpcalc.vorticity(data['u_wind'], data['v_wind'])

and have everything else be obtained automatically from the DataArrays. This, if combined with some way of annotating the function and its arguments, would also really set the stage for the field solver.

dopplershift commented 5 years ago

Correct, my ideal is to have the CRS information, as well as coordinates like x or lon, attached to the data arrays. If we can't get map factors from the CRS...that would be bad.

jthielen commented 4 years ago

While I had hoped that #1260 (which allows for easy calculation of both nominal and actual grid deltas from a parsed DataArray) would be enough to make this feasible for quick implementation for v1.0, #1275 raised the point about grid-relative vs. earth-relative winds that I had not fully considered with respect to this issue. So,

What is the correct behavior for these kinematic calculations with respect to earth-relative vs. grid-relative winds?
When working with geostrophic wind, how do we know when the result should be grid-relative (like it is now) or earth-relative (like would be needed for ageostrophic wind when the observed wind is earth-relative)?
Do we have any test data with expected results for these calculations that we can rely on here?

With all this extra complexity, though, I unfortunately don't see a way that this can feasibly be resolved with a stable API for v1.0 in the immediate future. So, will this issue need to be punted from v1.0, and instead, for now, just use a simple xarray input/output wrapper that fills in the default grid spacing (dx and dy) and coordinate-dependent parameters (f and latitude), and calculates these kinematic functions as they are now?

sgdecker commented 4 years ago

The equation way back in the first comment is valid for grid-relative u and v. If you used earth-relative winds on, say, a polar stereographic projection where North America is right-side up but Asia is upside down (so that grid-relative and earth-relative winds are nearly equal and opposite), I'd suspect adding the map factor terms would actually degrade the computation.

dopplershift commented 4 years ago

@jthielen I agree about punting from 1.0. I'll do some milestone work later, but it's pretty clear we're not going to get the 1.0 we want, but the one we deserve. :wink: We can talk more when you get into AMS. Let me know.

jthielen commented 4 years ago

@jthielen I agree about punting from 1.0. I'll do some milestone work later, but it's pretty clear we're not going to get the 1.0 we want, but the one we deserve. We can talk more when you get into AMS. Let me know.

Sounds good. I'll be getting in tomorrow afternoon. We can chat via email if you want to find a more specific time, otherwise I'll plan on stopping by the Unidata table at the career fair that evening.

dopplershift commented 3 years ago

This notebook from @deeplycloudy may help shed some light on using PyProj for some complicated reprojections.

dcamron commented 3 years ago

@sgdecker would you be available to join our biweekly collaborators call (#1684) for some live exploration of potential solutions we have been discussing lately surrounding this issue?

dcamron commented 3 years ago

@kgoebber shared this notebook with us. We'll be using these two notebooks as a starting point for this discussion at our next dev call!

sgdecker commented 3 years ago

@dcamron Yes, I can join in on July 8.

deeplycloudy commented 3 years ago

For reference in advance of our discussion, I worked a bit more with @kgoebber's notebook and added a bunch of notes while trying to reconcile the two approaches to calculating curvilinear gradients. I wasn't entirely successful but hopefully my prose can help someone spot the way forward. I've forked and updated the notebook uploaded by @dcamron.

sgdecker commented 3 years ago

Using GEMPAK's value for the radius of the Earth marginally improves (by 0.02% or so, consistent with the radius differing in the fifth significant digit) the correspondence between MetPy and GEMPAK values in method one. I though switching to 32-bit arithmetic might further increase the match, but it does not. I am guessing the remaining difference comes from the loss of precision that occurred when generating the .out files.

sgdecker commented 3 years ago

Calculating the map factors as

m_x = 1 + (factors.parallel_scale - 1) / mpconst.earth_avg_radius.m
m_y = 1 + (factors.meridional_scale - 1) / mpconst.earth_avg_radius.m

brings things closer to GEMPAK in method two (RMS reduced from 0.03303 to 0.00108), but not nearly as close as method one (0.00002). This was a hunch on my part, but not having a good feel for what the pyproj meridional_scale and parallel_scale actually are yet, this transformation could be off base.

dopplershift commented 3 years ago

So based on today's dev call, here's the plan of action:

First implement the additional terms in gradient, divergence, vorticity (vertical component of curl), and scalar laplacian to account for operating on a spheroid--target 1.1 for this functionality
- This will put the corrections in the functions
- Add optional function arguments for latitude and CRS (to give us radius--defaults to our constant for radius) on those functions (Use the presence of latitude to trigger the correction)
- Add handling for CRS to add_grid_arguments_from_xarray
- This allows us to get accurate results by going from projected coordinates to lat/lon and doing that calculations in lat/lon space
In the future use PROJ's map factors to implement the calculations directly in projected coordinates more efficiently (provided we get everything right using factors)
- This might also allow us to properly work with grid-relative winds (#1047)--if the corrections are necessary with grid-relative winds (depends on whether the grid is Cartesian or not?)

(cc @jthielen @deeplycloudy @kgoebber )

kgoebber commented 3 years ago

Okay, so I've created an initial PR, please feel free to take it from there.

As I was working on this I delved into some NAM data and the vorticity calculation. Interestingly, the calculation matches what GEMPAK produces when no correction is made. The NAM data that I am using to test has grid-relative winds and does need to be converted to plot correctly, however, if you do the conversion for the calculation, it degrades the calculation. I haven't been able to unlock any combination of correction that improves over the base calculation that is already in MetPy. I haven't had time yet to create a refined notebook - I'll do that in the next day or two and update with the NAM example for others to peruse.

kgoebber commented 3 years ago

Here is a cleaned up notebook that I created to demonstrate what I have tried: https://gist.github.com/kgoebber/a5482e8e8c51396e80ffe3293696f57a

sgdecker commented 3 years ago

Makes sense that the spherical correction is no good for NAM 218, as that isn't a spherical grid. The fact that the last approach based on the map factors doesn't work so well is the mystery. I will be investigating further, but there is a nice big picture overview at https://gis.stackexchange.com/questions/122703/influence-of-the-scale-factor-on-the-projection (first answer) that I think has important info, but it is a matter of translating it into proj4 language to determine what is missing. My current hunch is that something involving the "nominal scale" needs to be accounted for.

sgdecker commented 3 years ago

My investigations are cataloged here: https://github.com/sgdecker/testing/blob/188ed819b4b464d5a3c1b83034435f0e6b9d800c/MapProjectionExploration.ipynb

sgdecker commented 3 years ago

I can now reproduce the GEMPAK calculation on Grid 218 using the proj4 map factors. I am not sure what I am doing differently from @kgoebber actually, but it works! Details are in this notebook, and the data files are in the repository.

sgdecker commented 3 years ago

Summary notebook demonstrating a working vorticity function for Grid 218 is here. Still need to revisit GFS global data and look at a stereographic grid.

jthielen commented 3 years ago

Just a side question, since such good progress is being made on the implementation side: for my own understanding's sake (and probably also documentation in the eventual implementation), is there a good reference for how these map-factor-based formulations come about? The GEMPAK appendix's "left as an exercise to the reader" is not very satisfactory, especially since it requires a working knowledge of differential geometry.

sgdecker commented 3 years ago

It is at heart differential geometry, but you can see the general expression in the table here (good luck going through the derivations to get to that table!), where h1 is mx, h2 is my, h3 is 1 (no transformation in the vertical), v1 is u, v2 is v, and we only care about the vertical component (i = 3). The expression for vorticity that the GEMPAK documentation shows should emerge from the expression in the table corresponding to the curl of a vector field with the aforementioned substitutions.

Edit: Actually, we have orthogonal coordinates, so the info above, while correct, is more general than it needs to be. This page is better.

kgoebber commented 3 years ago

Thanks for this additional investigation @sgdecker! I was able to match your calculation by using just the parallel_scale and meridional_scale for the map factors. I was originally using the modification that you had explored in a previous notebook.

sgdecker commented 3 years ago

OK, good to hear. So now it is a matter of determining why my modifications are necessary for the GFS grid...

kgoebber commented 3 years ago

On a hunch I tried the following for the GFS output and got a great match with the GEMPAK calculation

xx, yy = np.meshgrid(gfs_data.lon.values, gfs_data.lat.values)
factors = Proj(data_var_u.metpy.pyproj_crs).get_factors(xx, yy)

m_x = factors.parallel_scale
m_y = factors.meridional_scale

dx, dy = mpcalc.lat_lon_grid_deltas(gfs_data.lon, gfs_data.lat)
dy_grid = dy[0, 0]
dx_grid = -dy_grid

dudy = mpcalc.first_derivative(data_var_u, delta=dy_grid, axis=-2)
dvdx = mpcalc.first_derivative(data_var_v, delta=dx_grid, axis=-1)

The key was in setting the "grid spacing" as the value at the equator (which I just based off of the latitude difference, which doesn't change, and then setting the dx value based on that). See the notebook below...so I think we might have a single way to deal with this now!

https://gist.github.com/kgoebber/8f855516115fa8cad74e957d2c440e92

sgdecker commented 3 years ago

Ahh, that makes sense, and I suppose using the GEMPAK radius would add a few more 9s to the correlation. So now the question is whether this will work just as seamlessly with a stereographic grid, and I think we'll have all the common cases covered.

sgdecker commented 3 years ago

This notebook has been expanded to include a polar stereographic case (Grid 104). New content begins roughly halfway through. Unfortunately, I can't make sense out of the proj4 map factors in this case, and they don't come close to reproducing the GEMPAK calculation. Interestingly, both the native GEMPAK output and the MetPy-based calculations break down in the upper-left and upper-right corners of this grid, so at least that part is being reproduced!

Please take a look and see if I am missing something. I intend to try a netCDF file next (as @kgoebber did for Grid 218).

kgoebber commented 3 years ago

I'm capturing the equations needed for the map factor calculations. These are from the GEMPAK appendix B2. The only one that is not directly in the appendix is the Q-vector, which I computed from the partial x and y vectors. May want to have someone check my "derivation".

Partial x-vector $DVDX( \vec{v} ) = \bigg( m_x\frac{\partial u}{\partial x} - v\frac{m_y}{m_x}(\frac{\partial}{\partial y}m_x) \bigg)\hat{i} + \bigg(m_x \large(\frac{\partial v}{\partial x}) + u\frac{m_y}{m_x}(\frac{\partial}{\partial y}m_x)\bigg)\hat{j}$

Partial y-vector $DVDY( \vec{v} ) = \bigg( m_y\frac{\partial u}{\partial y} + v\frac{m_x}{m_y}(\frac{\partial}{\partial x}m_y) \bigg)\hat{i} + \bigg(m_y \large(\frac{\partial v}{\partial y}) - u\frac{m_x}{m_y}(\frac{\partial}{\partial x}m_y)\bigg)\hat{j}$

Divergence

$DIV( \vec{v} ) = \nabla \vec{v} = m_x\large(\frac{\partial u}{\partial x}) + m_y \large(\frac{\partial v}{\partial y}) - u\frac{m_x}{m_y}(\frac{\partial}{\partial x}m_y) - v\frac{m_y}{m_x}(\frac{\partial}{\partial y}m_x)$

Inertial Advection $INAD( \vec{v_1}, \vec{v_2} ) = (\vec{v_1} \cdot \nabla)\vec{v_2} = \bigg[ \vec{v_1} \cdot \nabla u_2 + v_1v_2 \frac{m_x}{m_y}\bigg(\frac{\partial}{\partial x}m_y \bigg) - u_1v_2 \frac{m_y}{m_x}\bigg(\frac{\partial}{\partial y}m_x \bigg) \bigg]\hat{i} + \bigg[ \vec{v_1} \cdot \nabla v_2 + u_1u_2 \frac{m_y}{m_x}\bigg(\frac{\partial}{\partial y}m_x \bigg) - v_1u_2 \frac{m_x}{m_y}\bigg(\frac{\partial}{\partial x}m_y \bigg) \bigg]\hat{j}$

Shearing Deformation $SHR( \vec{v} ) = m_x\frac{\partial v}{\partial x} + m_y\frac{\partial u}{\partial y} + v\frac{m_x}{m_y}\bigg(\frac{\partial}{\partial x}m_y \bigg) + u\frac{m_y}{m_x}\bigg(\frac{\partial}{\partial y}m_x \bigg)$

Stretching Deformation $STR( \vec{v} ) = m_x\frac{\partial u}{\partial x} - m_y\frac{\partial v}{\partial y} + u\frac{m_x}{m_y}\bigg(\frac{\partial}{\partial x}m_y \bigg) - v\frac{m_y}{m_x}\bigg(\frac{\partial}{\partial y}m_x \bigg)$

Vorticity $VOR( \vec{v} ) = \hat{k} \cdot \nabla \times \vec{v} = m_x\frac{\partial v}{\partial x} - m_y\frac{\partial u}{\partial y} - v\frac{m_x}{m_y}\bigg(\frac{\partial}{\partial x}m_y \bigg) + u\frac{m_y}{m_x}\bigg(\frac{\partial}{\partial y}m_x \bigg)$

Q-Vector $QVEC( s, \vec{v} ) = - \bigg(\frac{\partial \vec{v}}{\partial x} \cdot \nabla s \bigg)\hat{i} - \bigg(\frac{\partial \vec{v}}{\partial y} \cdot \nabla s \bigg)\hat{j} = -\bigg[\bigg(m_x\frac{\partial u}{\partial x} - v\frac{m_y}{m_x}\frac{\partial m_x}{\partial y} \bigg)\frac{\partial s}{\partial x} + \bigg(m_x\frac{\partial v}{\partial x} + u\frac{m_y}{m_x}\frac{\partial m_x}{\partial y} \bigg)\frac{\partial s}{\partial y} \bigg]\hat{i} -\bigg[\bigg(m_y\frac{\partial u}{\partial y} + v\frac{m_x}{m_y}\frac{\partial m_y}{\partial x} \bigg)\frac{\partial s}{\partial x} + \bigg(m_y\frac{\partial v}{\partial y} - u\frac{m_x}{m_y}\frac{\partial m_y}{\partial x} \bigg)\frac{\partial s}{\partial y} \bigg]\hat{j}$

kgoebber commented 3 years ago

Okay...had a few cycles this weekend and I think I have the spherical corrections working for GFS, NAM, NAM Stereographic, and GFS Stereographic projections. All but the global GFS work with the same code.

Process:

Read in data
Assign lat/lons to Xarray Dataset
Use lat/lon to create Proj object to get map factors
Define grid spacing using native x/y values
Compute spherical version of equation

# Read NAM Data
nam_data = xr.open_dataset('namanl_218_20180308_0000_000.nc').metpy.parse_cf()
nam_data = nam_data.metpy.assign_latitude_longitude()

# Get u/v component data
data_var_u = nam_data['u-component_of_wind_isobaric'].metpy.sel(time=datetime(2018, 3, 8, 0),
                                                                vertical=500*units.hPa).metpy.quantify()
data_var_v = nam_data['v-component_of_wind_isobaric'].metpy.sel(time=datetime(2018, 3, 8, 0),
                                                                vertical=500*units.hPa).metpy.quantify()

# Start Spherical Correction for NAM Vorticity
# Get lat/lon values from dataset
xx, yy = (nam_data.longitude.values, nam_data.latitude.values)

# Use Proj  object to get map factors
factors = Proj(data_var_u.metpy.pyproj_crs).get_factors(xx, yy)

# Isolate needed map factors
m_x = factors.parallel_scale
m_y = factors.meridional_scale

# Determine dx and dy grid spacing from x/y values
dx_grid = (data_var_u.x[1].values - data_var_u.x[0].values) * units.meters
dy_grid = (data_var_u.y[1].values - data_var_u.y[0].values) * units.meters
print(dy_grid)

# Compute main function derivatives
dudy = mpcalc.first_derivative(data_var_u, delta=dy_grid, axis=-2)
dvdx = mpcalc.first_derivative(data_var_v, delta=dx_grid, axis=-1)

# Use main function derivatives to calculate using Map Factors
zeta = (m_x * dvdx - m_y * dudy
        - data_var_v * (m_x/m_y) * mpcalc.first_derivative(m_y, delta=dx_grid, axis=-1)
        + data_var_u * (m_y/m_x) * mpcalc.first_derivative(m_x, delta=dy_grid, axis=-2))

Unknown

To see full details and some figures, take a look at the following gist notebook: https://gist.github.com/kgoebber/f479ee45ae83c42456721169804c82ec

kgoebber commented 3 years ago

In looking at @sgdecker previous notebook, I think the error that he was seeing occurs in the cell that was run 37th (e.g., In[37]) as there was a call to an lcc_proj variable where that should have been a stereographic projection. Looking at the Map Factors graphic then makes a little sense as it appears to make a cone with the highest map factors at the center top position.

Anyway, I think we have it solved (for vorticity at the least), now we have to decide how this would be best implemented within MetPy. Currently, we do assign lat/lon values to all data arrays before calculation, but then we also add the dx/dy based on the lat_lon_grid_deltas, which we don't want to do in this case. There is also the difference between calculating the grid dx (dx_grid) between the GFS, that natively brings lat/lon and everything else since it doesn't have projection x and y coordinates, and all of the others. As I get a few more cycles I will attempt to get versions of all of the equations affected (listed in a previous comment) working with map factors, unless someone else wants to take the baton from here.

The PR that I had previously started on this topic, #1963, should be closed and a new one completed as we'll need to go in a completely different direction.

sgdecker commented 3 years ago

In looking at @sgdecker previous notebook, I think the error that he was seeing occurs in the cell that was run 37th (e.g., In[37]) as there was a call to an lcc_proj variable where that should have been a stereographic projection. Looking at the Map Factors graphic then makes a little sense as it appears to make a cone with the highest map factors at the center top position.

Thanks, @kgoebber ! That was a bug for sure. Indeed, when I put grid104 there instead (which should have been the case all along), all the discrepancies in my notebook disappear.

I guess these functions will need conditionals that determine when to apply the map factors and when not to. Or, perhaps they set the map factors to 1 in the cases where they aren't needed so that the actual calculation is the same in all cases.

kgoebber commented 2 years ago

Okay, so I have compiled the vast majority of calculations affected by the spherical calculations. What this really all comes down to is whether we are doing a derivative on a vector quantity...so I have created a notebook that contains a vector_gradient function that computes the appropriate vector derivatives and then I use that throughout the notebook to calculate the various quantities.

I've primarily focused on the GFS calculations since I had them all done for the comparisons to GEMPAK calculations. I have also included the vorticity calculation for the NAM projection, NAM Stereographic projection, and GFS stereographic projection. There is a slight nuance to calculating the grid dx and dy between the lat/lon grid of the GFS and the other projected grids, which is captured in the vector_gradient function.

We can discuss potential implementation options on the next dev call.

Notebook is hosted at: https://gist.github.com/kgoebber/b3546977e55fef96d4b36f4ef573a788

jthielen commented 2 years ago

@kgoebber Question on the most recent notebook: In the dx and dy calculation for the latitude_longitude grid, you have

earth_radius = 6371200 * units.meter
dx = 2*np.pi*earth_radius / u.lon.size
dy = -dx

However, this looks to only be valid when assuming a spherical earth with radius 6371.2 km, longitude with global extent, and equal spacing in latitude and longitude. Would it be reasonable to generalize this to the following?

geod = u.metpy.pyproj_crs.get_geod()
dx = units.Quantity(
    geod.a * np.diff(u.metpy.longitude.metpy.unit_array.m_as('radian')),
    'meter'
)
lat = u.metpy.latitude.metpy.unit_array.m_as('radian')
lon_meridian_diff = np.zeros(len(lat) - 1)
forward_az, _, dy = geod.inv(
    lon_meridian_diff, lat[:-1], lon_meridian_diff, lat[1:], radians=True
)
dy[(forward_az < -90.) | (forward_az > 90.)] *= -1
dy = units.Quantity(dy, 'meter')

Essentially, no matter what subset or spacing of longitudes and latitudes, this calculates the dx from longitude on the equator of the ellipsoid and dy from latitude on the 0 degree meridian of the ellipsoid.

kgoebber commented 2 years ago

Ah yes, I have only been working with the global GFS, so the logic would break down for smaller subsets.

A quick test of your generalization was good except for the dy were not negative as they needed to be. All forward azimuths were 3.1415926 (Pi because of 1 degree spacing?), so don't know if there is some different logic with the forward_az or just multiplying dy by -1 is best.

I don't match the GEMPAK values as well, but that would be expected because we are not using the radius that they define. Despite that, it is different by only a very very small amount (what would be expected by using a different radius value.

jthielen commented 2 years ago

A quick test of your generalization was good except for the dy were not negative as they needed to be. All forward azimuths were 3.1415926 (Pi because of 1 degree spacing?), so don't know if there is some different logic with the forward_az or just multiplying dy by -1 is best.

Yeah, not sure where that sign/direction error would be coming from. I'd be tempted to just reverse the directionality in the inv call (swap lat[:-1] and lat[1:]) to take the latitude differences backwards, but that feels wrong without understanding why the forward differences are giving the wrong result.

I don't match the GEMPAK values as well, but that would be expected because we are not using the radius that they define. Despite that, it is different by only a very very small amount (what would be expected by using a different radius value.

Does specifying the matching ellipsoid fix it? I.e., replace

gfs_data = xr.open_dataset('gfs_test_data.nc').metpy.parse_cf()

with

gfs_data = xr.open_dataset('gfs_test_data.nc').metpy.assign_crs({
    'grid_mapping_name': 'latitude_longitude',
    'earth_radius': 6371200
})

Also, I hacked away at a possible implementation: https://github.com/jthielen/MetPy/commit/aac4e8e0454f229a4c1dcd4cc186057447d0b8fe. Right now,

vorticity is the only modified calculation,
the grid argument decorator has blown up in complexity,
I think will not work right on functions where latitude is only required for map factor calculations,
and there are no tests whatsoever.

But, it's at least getting the ideas I had down in code, so if anyone wanted to take advantage of it as a head start towards an actual PR, go ahead! If not, I can try further iterating on it, but not sure when I'd be able to do so next.

kgoebber commented 2 years ago

All checks out with the specific radius. Actually improves it marginally!

sgdecker commented 2 years ago

@kgoebber Looks great! Thanks for putting all this together. It looks like there is some difference between the GEMPAK and MetPy PV calculations that isn't attributable to the map projection. Is that a known discrepancy (however minor it may be)?

dcamron commented 2 years ago

@dopplershift and I chat about this earlier today; I'm going to spin off @jthielen's commit above into a draft PR this week and begin the process of implementing it across functions, examining existing tests, and adding new tests. If you are already working on a PR, let me know!

jthielen commented 2 years ago

Not sure if this was the impression of other folks or not, but I previously conceptualized these map factor corrections as only occurring in derivatives on vector quantities. https://github.com/Unidata/MetPy/issues/2356#issuecomment-1048324949 made me realize this does not seem to be the case; map factors need to be considered in any vector operation (so, also gradients of scalar fields, dot products, cross products, etc.). However, these corrections are much simpler (just coefficients on terms and they sometimes cancel out).

jthielen commented 2 years ago

I keep on getting nerd sniped with this topic of "vector calculus on orthogonal coordinates," and while I don't have much new to show for it yet, one particular concern came up:

Do we need to delineate between vector component arguments that are scaled according to the covariant basis versus the normalized basis? (See https://en.wikipedia.org/wiki/Orthogonal_coordinates#Covariant_basis for discussion.) Is it safe to assume we will always have things in the normalized basis?

sgdecker commented 2 years ago

Do we need to delineate between vector component arguments that are scaled according to the covariant basis versus the normalized basis? (See https://en.wikipedia.org/wiki/Orthogonal_coordinates#Covariant_basis for discussion.) Is it safe to assume we will always have things in the normalized basis?

For what it's worth, the Pielke textbook says (p. 129 of the second edition) "By convention, the equations [the set of equations solved in an NWP model] are written in the contravariant form, using the covariant differentiation operation..." but that is in the context of a whole bunch of derivations. I think in practice all of the vector quantities MetPy would be dealing with are normalized (e.g., if the u-component of the wind is given as 20 m/s, it really is 20 m/s at that point).

jthielen commented 2 years ago

On the topic of frequently having this problem in the back of my mind, I ended up writing up a rough proposal for addressing this "grid-correct vector calculus" problem on the data model level: https://github.com/pydata/xarray/discussions/6331. I doubt that all would be practical on the timeline we want to have this functionality implemented in MetPy, but, it could be a cool future goal?

dopplershift commented 2 years ago

I doubt that all would be practical on the timeline we want to have this functionality implemented in MetPy, but, it could be a cool future goal?

👍 to all of that