ioos / APIRUS

API for Regular, Unstructured and Staggered model output (or API R US)
Creative Commons Zero v1.0 Universal
2 stars 1 forks source link

Let's get rid of lon/lat pairing #16

Open rsignell-usgs opened 8 years ago

rsignell-usgs commented 8 years ago

As discussed here, let's get rid of lon,lat pairing -- the practice of stuffing both lon,lat into a single variable. There doesn't seem to be any advantage to doing this, and it always worries me having to know the order. Is it (lon,lat), or (lat,lon)? The implicit order of lon,lat in bounding box requests has caused endless headaches for OGC. Let's avoid that here.

ocefpaf commented 8 years ago

@ayan-usgs is OK with that in pysgrid.

@ChrisBarker-NOAA what do you have to say about pyugrid?

@kwilcox how much code would that change break in your life today?

ChrisBarker-NOAA commented 8 years ago

ouch!

Maybe it's just my personal taste, but I really think they belong together -- there is a performance advantage -- you almost always need the two together, and this keeps them close to each-other in memory. It's often also helpful for computing transformations, etc. Someone posted the example:

lon_data = grid_cell_centers[..., 0][sg_lon.center_slicing] lat_data = grid_cell_centers[..., 1][sg_lat.center_slicing]

which does kind of suck -- though I think it's the plotting API that's broken here :-)

But the reality is that that's how the common plotting APIs do it, and CF breaks them apart as well, so for consistency's sake, it may make sense to do it here as well.

I'll take a look at the code and see how much of a pain it would be to re-factor that, though I can say I do like that the nodes, are, well, u_grid.nodes, rather than, what?

u_grid.nodes_lon
u_grid.nodes_lat

??

Given that numpy slices are references, I suppose we could have both :-)

ChrisBarker-NOAA commented 8 years ago

Thinking a bit more, maybe we can have both nice names access and a tight, single array. I thought about structured dtypes and.or recarrays, but then you get thed named attributes:

nodes.lat
nodes.lon

But then when you ask for one: nodes[n] you get a tuple, not a proper array, and you can't really do the full array-broadcasting math on it. so another option is a little ndarray subclass that adds attributes. then you can have plain old: nodes as a Nx2 array, and also get node.lat and nodes.lon.

I whipped up a little test of this here:

https://github.com/pyugrid/pyugrid/blob/master/test_bed/lat_lon_class.py

This is probably too much magic, and too non-standard, but I kind of like it :-)

-CHB

rsignell-usgs commented 8 years ago

Another nail in the coffin, IMHO: https://github.com/sgrid/pysgrid/issues/69#issuecomment-152496050

hetland commented 8 years ago

I don't think this is too much magic -- in fact I think this is the way python is supposed to work. @property decorators are kind of great.

The main issue, that we have not yet resolved I think, is whether we are thinking about the grid belonging to a variable (e.g., salt.lat) or existing independently. I often like to work with a grid independent of variables, and I think it is sort of silly to load in a variable just to get at the grid. That said, having the grid attached to a variable makes a ton of sense when working with those variables.

I would vote to have cake AND eat cake. That we should be able to get at grids both ways.. Chris' fancy fix is perfect for independent grids, and I like the interface. Also, in general we don't want to deal with stuff like salt.nodes[:, 1], so that sort of logic should carry over to the variables.

On Fri, Oct 30, 2015 at 6:11 AM, Rich Signell notifications@github.com wrote:

Another nail in the coffin: sgrid/pysgrid#69 (comment) https://github.com/sgrid/pysgrid/issues/69#issuecomment-152496050

— Reply to this email directly or view it on GitHub https://github.com/ioos/APIRUS/issues/16#issuecomment-152497867.

Prof. Rob Hetland Texas A&M Univ. – Dept. of Oceanography http://pong.tamu.edu/~rob

ChrisBarker-NOAA commented 8 years ago

@hetland: the trick is that if we want to be able to work with both lat-lon pairs as a single object AND as separate arrays, we can only do that one-way:

My test used a Nx2 array to store the lon-lat pairs, and then gave you access to the individual 1-d arrays.

But if we kept the arrays as individual arrays, there is no way to make them act like a 2-d array without copying.

Unless maybe Biggus or Dask has a way to do that -- @ocefpaf: do you know if that's an option?

ChrisBarker-NOAA commented 8 years ago

As for the other eating cake option:

The main issue, that we have not yet resolved I think, is whether we are thinking about the grid belonging to a variable (e.g., salt.lat) or existing independently. I often like to work with a grid independent of variables, and I think it is sort of silly to load in a variable just to get at the grid.

absolutely -- and I don't think there is a conflict here -- the grid object should certainly be a independent thing one can work with directly.

That said, having the grid attached to a variable makes a ton of sense when working with those variables.

exactly -- so we probably do want an API where a user can load up a variable, and get a grid attached to it, without needing to work with the grid API at all -- but the variable object will be working with it.

Also, in general we don't want to deal with stuff like salt.nodes[:, 1], so that sort of logic should carry over to the variables.

I'm thinking that we don't want to deal with salt.nodes (or slat.lat and salt.lon) at all. rather the variabel interface lets you work in lon-lat space directly -- without directly referencing anything about the underlying grid.

i.e you'd never want to know what the coordintes of the 10th node are when working with salt -- you'd want to know what the salinity is at some lon-lat location -- and you may want to specify the interpolation method, etc.

you may also want to plot up the data on teh native grid, but again, not by grabbing the lot-lat of the nodes, and then plotting....