MITgcm / xmitgcm

Read MITgcm mds binary files into xarray
http://xmitgcm.readthedocs.io
MIT License
57 stars 66 forks source link

U,V are not always the zonal and meridional components of velocity #204

Open antonimmo opened 4 years ago

antonimmo commented 4 years ago

I understand this might be inherited from LLC faces, but I didn't expect it as the description of the variables is a bit misleading.

This works well on faces 0-6, but for faces 6-12 it seems U is actually the "local X" component of the velocity, while V is the "local Y" component (local = tile axis). So it means that V is the zonal component (positive = eastward), while -U is the meridional component (positive = northward).

For instance, for face 10:

import matplotlib.colors as colors
from xmitgcm import llcreader

model = llcreader.ECCOPortalLLC4320Model()
tst = model.get_dataset(varnames=['U','V'], k_chunksize=1, type="llc")
unorm = colors.DivergingNorm(vmin=-2, vcenter=0, vmax=2)

plt.figure(figsize=(12,10))
ds.U.isel(k=0,face=10,time=200*24).plot(norm=unorm)
plt.show() ## Figure 1

plt.figure(figsize=(12,10))
V10model.plot(norm=unorm)
plt.show() ## Figure 2

Figure 1 (U) image

Figure 2 (V) image

As the colorbars take the description of the variables (long_name), one would think (as I did) that U and V are the zonal and meridional components of the velocity, respectively; but by simple inspection of the loop current, it is clear that this is not the case,

I found it by calculating both the vorticity and divergence for regions within faces 1, 2, 10 and 11. For 1 and 2, it worked as expected; whereas, for faces 10 and 11, divergence looked like vorticity and vice-versa; so I made the transformation I already mentioned, and I managed to fix it.

I assume this comes from the ECCO data portal itself, in which case renaming the variable descriptions would suffice. Otherwise, is there a way to fix it, or am I missing something?

Thanks!

timothyas commented 4 years ago

Hey @antonimmo! Welcome to the wonderful world of the LLC grid! Yes, it is a bit nonintuitive at first, hopefully I can help. The LLC grid "faces" or "tiles" are oriented as shown in the first diagram on this page. (Note that the ECCO and LLC community uses the term "tile" for the same thing face means in xmitgcm). Basically, the U and V components on the model's native LLC grid point in the x and y directions local to the face they are on. If you look at face 10 as you have plotted, it is rotated in such a way the U (or UVEL, UVELMASS, or really any "eastward" model component) is pointing Southward, and V (or VVEL, VVELMASS, etc) points Eastward (as you have pointed out).

For certain latitudes, the grid is a regular Lat/Lon projection, and so the uvel and vvel components can be obtained by simply rotating the vectors accordingly. I think that this can be accomplished by grabbing the 'latlon' version with llcreader, using the type="latlon" option with get_dataset.

However, if you want anything that goes poleward of 57N or 70S then the orientation of each grid cell is not trivial, and requires the AngleCS (or just CS) and AngleSN (SN) fields to determine the true orientation. We're still working on making these fields accessible to the user through the ECCO Data Portal in #197, so stay tuned... But, once you have these, computing the true U and V components can be accomplished with the ecco_v4_py package, specifically the vector_calc.UEVNfromUXVY function. You can see an example of this with the LLC90 grid in a not-super-pretty example here.

Finally, if you are interested in simply viewing the velocity fields, that is probably enough. But, if you are interested in computing transport quantities, such as MHT, then you will not want to use that function (i.e. interpolate) first because of numerical errors. You can see a little tutorial for using the package to compute Atlantic MHT here, as well as some examples in the previous link.

Ok as if this wasn't long enough... I have to make the disclaimer that the package I'm talking about has been designed and mostly used by folks working with LLC90. There is no reason why it shouldn't work with LLC4320 (although it will be much slower!), but if you do use it and run into problems I'm happy to help, just raise an issue over there. Finally, the pip version of that package is ... out of date ... so please clone it from github for now (sorry)... and if you want to read more about the LLC grid see Section 2 of https://doi.org/10.5194/gmd-8-3071-2015.

antonimmo commented 4 years ago

Thanks, @timothyas for your thorough answer!

Yes, I've also learnt about LLC tiles from that exact diagram. I had already rotated the data with numpy.rot90, but I wasn't aware that it also applied for the components themselves.

Also, it would be great if llcreader could read the grid as well, so that lon, lat and vertical levels could be promoted to Dataset coordinates. I'm currently addressing that issue my own way, by downloading the grid (except for hFac_.data files since they are too big), loading it with xmitgcm.open_mdsdataset(gridPath, read_grid=True, iters=None, default_dtype=np.float32, geometry="llc"), finding the i,j indices manually and rotating accordingly. By the way, when using that method I don't need to make additional considerations for XG, YG, YC, YG positions as I have to do for U,V components, perhaps because xmitgcm.open_mdsdataset works with a different logic, who knows.

So, just to close this... Should the description of the variables (long_name) be changed so it is clear for non-LLC experts (like myself) that these fields are not the meridional/zonal components, but from LLC's native grid?

Also, does the same apply for other vector fields such as oceTAUX and oceTAUY?

antonimmo commented 4 years ago

Oh, I almost missed it...

It seems to me that using type="latlon" could lead to more memory/network usage, in particular when one just needs a small area. That's why we're using faces in the first place. Does this make sense, or does llcreader downloads the full fields and then split them into faces when required?

Also, does the same apply for k levels? I mean, does it make sense to just subset a particular vertical level to save memory and network usage? Did it misinterpret the documentation in this regard? Now I'm unsure since I don't see any k or face index in the compressed files from ECCO portal, as it happens for iter in {varname}.{iter}.data.shrunk. Maybe I should ask this in a different issue.

Thanks again!

timothyas commented 4 years ago

Hey @antonimmo, I'll try to respond where I can...

Also, it would be great if llcreader could read the grid as well, so that lon, lat and vertical levels could be promoted to Dataset coordinates.

I could not agree more... That's what we're working on with #197, which should be available soon. We're simply waiting to see if we can make more grid variables available on the ECCO Portal, hopefully a permanent place.

By the way, when using that method I don't need to make additional considerations for XG, YG, YC, YG positions as I have to do for U,V components, perhaps because xmitgcm.open_mdsdataset works with a different logic, who knows.

Perhaps I'm misinterpreting this, so let me know if so. If you're referring to that fact that you don't have to perform any rotations for those fields, it is because they are scalar quantities at a single point rather than a vector field. Because U and V have a direction associated with them, the values themselves require any rotation associated with that face or tile. However, something like temperature or depth is defined by a quantity at a given location, so no rotation is needed. See for instance the SST and depth plots shown here in the xgcm docs. With these fields, once the "view" of the data is rotated (i.e. by rotating the tile), nothing else needs to change. This isn't the case with vector valued quantities (or even the grid spacings, e.g. dxC, dyC). Hopefully that helps...

Should the description of the variables (long_name) be changed so it is clear for non-LLC experts (like myself) that these fields are not the meridional/zonal components, but from LLC's native grid?

Yeah I think this could be more clear, but I have to say I'm not sure of the best way to broadcast that info. Maybe it could be appended by llcreader itself to say something like "Velocity in local X direction" for U, or something like that?

Does this make sense, or does llcreader downloads the full fields and then split them into faces when required?

Actually, I'm not sure if it matters whether you call for "faces" and then subset the data with .sel(face=10) or call for "latlon" and then subset with .sel(i=slice( ... ) ) (noting that in the latter case, all of the rotations for the latlon portion of the grid are being added to the task graph as well). But I would be interested to find out!

I mean, does it make sense to just subset a particular vertical level to save memory and network usage?

Yes it does, if you only need a single or a subset of the vertical levels then definitely specify that in your call to get_dataset. My understanding is that even though there are not separate files for depth, by subsetting in this way you are making your "view" of the data smaller, and creating fewer tasks later down the line. Perhaps someone more knowledgable can give a better understanding though...

antonimmo commented 4 years ago

I've learnt a lot in the last 12 hours, so thanks, @timothyas!

Although I have many more questions, they are unrelated to this issue, so maybe later. But one thing bugs me...

noting that in the latter case, all of the rotations for the latlon portion of the grid are being added to the task graph as well

Is there a way to get a graphic representation of such graph? I mean, like in Tensorflow. Perhaps it could get some users a better way to understand, and thus optimize, the graph for what they need.

Now, regarding this issue:

Perhaps I'm misinterpreting this, so let me know if so. If you're referring to that fact that you don't have to perform any rotations for those fields, it is because they are scalar quantities at a single point rather than a vector field (...)

To be clear, there are 2 kind of rotations that could be performed. The first one that applies to all fields (including positions and grid spacings dx, dy, etc.), in my case I use numpy.rot90 to rotate the matrix itself, and comes from the nature of the LLC faces. The second one is what brought me here, and arises from the ECCO-v4 documentation

The Cartesian (x,y) coordinates of llc tiles do not coorespond to longitude and latitude.

which implies that XG/XC are not the longitude, nor YG/YC the latitude, from which I infer that also DXc/DXg are not the grid spacings in the X (zonal) direction, and similarly DYc/DYg are not the grid spaings in the Y (meridional) direction. But I'm reading the grid with xmitgcm.open_mdsdataset(..., geometry="llc") (that I assume is equivalent to type="faces" in get_dataset, also considering that the open_mdsdataset sets swap_dims to False in this case), and XG/XC and YG/YC values don't need to be interchanged. I mean, if for face=10 I read the (min,max) values of XG, I get (-128.0,-38.00146) which is exactly the longitude range for that face, so in this case the XG variable is the physical X coordinate, not the local/LLC X position as the documentation says (to my understanding). However, as I said in my previous comment, it might be because open_mdsdataset works differently, but this contributed to my initial confusion.

Yeah I think this could be more clear, but I have to say I'm not sure of the best way to broadcast that info. Maybe it could be appended by llcreader itself to say something like "Velocity in local X direction" for U, or something like that?

Well, that's what I was suggesting at the beginning, because saying "meridional" and "zonal" isn't accurate ~50% of the globe when we call for faces. Now I'd go a bit further, first by raising a warning when reading vector quantities with type="faces", or when there's potential "confusion" since llcreader can be used also by folks who are probably exploring LLC data and thus not familiar with it; also by changing the description, maybe "Velocity in local X direction", "Velocity in LLC local X direction", or "Velocity in model local X direction" might work, I'm unsure about the right wording.

rabernat commented 4 years ago

The central design principle in xmitgcm is to pass the MITgcm output through to the user in a way that is faithful to MITgcm's internals, as simply as possible. We don't want to introduce new assumptions or conventions, since these then become our responsibility to maintain. They also become an additional source of confusion and error.

The long_name attributes of UVEL and VVEL, for example, were taken directly from the MITgcm source code:

https://github.com/MITgcm/MITgcm/blob/04904a066e85341574a85aef72438b5a212dc4f3/pkg/diagnostics/diagnostics_main_init.F#L201-L215

These are the official descriptions assigned to those diagnostics by MITgcm. xmitgcm does not want to get into the business of editing those descriptions. Your points about these names being confusing for curvilinear grids (and especially for the LLC grid) are quite valid. However, they need to be taken up in the main MITgcm issue tracker. Once the names are changed there, we would be happy to change them here.

Many of your other questions about geometry can be resolved by carefully reading the MITgcm documentation on the horizontal grid discretization: https://mitgcm.readthedocs.io/en/latest/algorithm/horiz-grid.html#curvilinear-coordinates

Or, when that fails, the source code itself: https://github.com/MITgcm/MITgcm/blob/04904a066e85341574a85aef72438b5a212dc4f3/model/inc/GRID.h

Is there a way to get a graphic representation of such graph?

https://docs.dask.org/en/latest/graphviz.html

edoddridge commented 4 years ago

Just jumping in here to say that we have an issue open at the moment about this - https://github.com/MITgcm/MITgcm/issues/248 It would be great if you jumped over there and left a comment.

timothyas commented 4 years ago

I'm glad this is helpful @antonimmo! I can tell you this issue gave myself and many others confusion - you're not alone! I just want to add one more thing to your quote of the ECCOv4 documentation that you mentioned:

The Cartesian (x,y) coordinates of llc tiles do not coorespond to longitude and latitude.

This wording in the documentation is perhaps confusing. The "local x and y direction" is referring to the x and y axis on the plot shown below, not XC/YC or XG/YG. In xmitgcm terms, these "local x/y directions" are the i and j dimensions (for grid cell center locations and i_g,j_g for grid cell edges). For curvilinear or spherical polar grids in the MITgcm (and maybe others, but not for cartesian grids), XC/YC (or XG/YG ... etc) still refer to longitude/latitude, even with the seemingly strange LLC tiling. This is just as you have inferred. When you load these variables using open_mdsdataset, they are functions of i and j, which is the same if variables are accessed via llcreader too.

Note for instance how latitude (YC) changes with the i and j variables differently on different tiles...

Finally, note that the distances dxC/G and dyC/G are different in this regard, and are the distances between the i and j dimension cell centers/edges.

(and thanks @rabernat & @edoddridge!)

antonimmo commented 4 years ago

@rabernat I will take a look at the documentation links you provided. Also the graphviz thing is exactly how I imagined it. There's a lot more to learn, so thanks!

@edoddridge @timothyas So it seems I'm not alone in this!

So, summing up, we could close this issue when MITgcm/MITgcm#248 is resolved.