ssec / polar2grid

Tools for reading, remapping, and writing satellite instrument data.
http://www.ssec.wisc.edu/software/polar2grid/
GNU General Public License v3.0
72 stars 34 forks source link

Add option for downsampling to geo2grid #187

Open djhoese opened 5 years ago

djhoese commented 5 years ago

Many times users want the full disk imagery from GOES or Himawari, but they don't want the full resolution. Currently the only way to do this is with nearest neighbor resampling which is time consuming. We should be able to provide an aggregate flag and use the Scene's aggregate method that was added to Satpy.

djhoese commented 4 years ago

Talking with @kathys, she would prefer striding only. I think this could be done as two options given what Satpy supports. Satpy allows you to aggregate with a particular function (min/max/mean/others), but I could add another option to Geo2Grid specifically for "stride".

djhoese commented 4 years ago

I could add a --aggregate <method> <row_pixels> <col_pixels> flag which would allow for: ‘mean’, ‘sum’, ‘min’, ‘max’, ‘median’, ‘argmin’, ‘argmax’, ‘prod’, ‘std’, or ‘var’ based on the satpy function it would use underneath. I can add an additional stride parameter too which would use the slicing/striding syntax provided by Satpy as well.

The row_pixels and col_pixels are how many pixels to aggregate over for each of those dimensions. One problem is that I think this would have to be done after resampling to do what people really want but it depends what @graemely and @kathys think. For example, say I had the default products loaded (C01-C16, true_color, natural_color). We have three different resolutions involved (500m, 1km, 2km). If I say "take the mean of every 2x2 set of pixels", I then end up with 1km, 2km, and 4km products. Does that make more sense than upsampling everything to the highest resolution (500m) and then aggregating every 2x2 pixel block resulting in all products at 1km? Maybe I should ask Jim Nelson what he would expect since he's brought up this feature request before.

djhoese commented 4 years ago

An example:

--aggregate mean 4 4

djhoese commented 4 years ago

New suggestion based on current discussion:

--downsample mean 2000 2000

djhoese commented 4 years ago

Need to also consider future upsampling requests in addition to the existing native resampling functionality.

jpnIII commented 4 months ago

Hi Dave et al. -- This issue came up again today between Tim Schmit and myself. I looked again at the Geo2Grid online documentation, and the capability of (at least) what Tim and I are after is at least partially available already, via the -g MIN and -g MAX commandline options:

6.1. Native Resampling Native resampling is a special type of resampling that keeps input data in its original projection, but replicates or averages data when necessary to make other processing in Geo2Grid easier. Native resampling is the default for all data that is already gridded (ABI, AHI, etc) or when a native grid is specified by the user on the command line (-g MIN). It can also be specified on the command line by using --method native. See the Command Line Arguments section below for more details and the options available.

In general, at least for GOES, we are always dealing with 0.5km, 1.0km and 2.0km imagery in the 16 various bands. If we run geo2grid.sh -g MIN -p cimss_true_color, I believe that ABI bands 1,2,3 get used. The minimum resolution bands (both 1 and 3) have a 1.0km resolution, so I assume that band2 (0.5km) gets a pixel striding factor of 2 applied in both X and Y directions, and you end up with all 3 bands at 1.0km resolution (10848 X 10848). Conversely, if you run -g MAX, you have to make all 3 bands have 0.5km resolution. So band2 is already good to go, and bands 1 and 3 will be converted to 0.5km resolution by applying a pixel duplication factor of 2 in both the X and Y directions. Extending the idea -- If we wanted to have native projection cimss_true_color imagery at 4km resolution, we would have to apply a pixel striding factor of 4 in both the X and Y directions for bands 1 and 3 (1km nominal resolution), and a striding factor of 8 in both directions for band2 (0.5km nominal resolution). If you wanted to have native projection cimss_true_color imagery at 16km resolution, you would need to apply a pixel striding factor of 16 to the input band 1 and 3 L1b imagery, and a striding factor of 32 to the input band 2 L1b imagery, to get all of the input imagery to have 16km resolution. I hope I have described these examples properly -- I find it really easy to get things backward (duplication versus striding, etc.) when working with imagery resolution.

Finally, I would suggest that the MUG Team would be able to greatly assist in this work, as they have a great deal of experience dealing with this type of thing.

Thank you again, Geo2Grid Team!

Sincerely,

Jim

djhoese commented 4 months ago

@jpnIII Sorry, but I'm not sure if there is a specific question or request in your latest comment beyond of course "this feature would be nice". If there is, could you rephrase it for me? Or if you are addressing my request earlier in this comment thread about what you (Jim) would expect in an interface providing this functionality then maybe you could rephrase your comment with the exact command line flag and arguments you'd expect to provide (ex. --downsample <output x resolution> <output y resolution>).

Overall the functionality internally isn't difficult to implement. It exists in Satpy in one or more ways depending on what we or the user wants. In fact an inconvenient way of doing it already exists by defining your own custom grids (areas) with your desired resolution and then specifying --method native -g my_abi_8km_grid --grid-configs my_grids.conf. I think when this issue was originally created the ability for native resolution to be used for this was not possible but that was updated and should work in at least the newest Geo2Grid release if not earlier releases.

The hardest part is defining an interface on the command line that is flexible and understandable (not confusing) to most users at first glance. As you said, it is easy to get things mixed up about what way things need to be modified to get a certain resolution (strided versus replicated).

jpnIII commented 4 months ago

Thank you, Dave. My apologies for not being more clear in trying to express my thoughts.

Tim suggested that I give your idea a go. But I am still somewhat confused about what you are doing when you specify:

 --method native -g my_abi_8km_grid --grid-configs my_grids.conf

1.) Can I specify a resolution other than 1, 4, 8 or 10km? Looking at grids.yaml, it appears that these are the only resolutions currently allowed for grids that are based on the GOES fixed grid projection. I looked at the help for p2g_grid_helper.sh, and I see where I can use the -p option to specify different PROJ.4 map projections. 2.) Are we only allowed to use RadF (i.e., full disk) L1b input files? I think so, for these various Full Disk projections. 3.) What does "ellps: GRS80" mean? 4.) I think that I need to create a yaml projection file before I run geo2grid.sh. However, I am having problems running p2g_grid_helper.sh to generate this file for the goes_east_8km GOES fixed grid projection. For example, the following command fails (I am not sure how to run p2g_grid_helper.sh for this case):

p2g_grid_helper.sh abcd 0 -75 8000 8000 1356 1356 -p "+proj geos +ellps GRS80"

5.) I may have asked you this before, but did you mean to call the projection "geos" rather than "goes"?

As always, Dave -- Thank you for your help!

Sincerely,

Jim

djhoese commented 4 months ago
  1. I meant to suggest that you make your own custom grids YAML file. By creating your own custom grid definitions you could choose whatever size/resolution you wanted to. If Satpy/G2G detects that your grids definition is a whole factor resolution from the data resolutions then native resampling should work. For example, a 4km grid is 8x 500m resolution, 4x 1km resolution, and 2x the 2km resolution bands. Doing a grid that is say 3km per pixel wouldn't work for native resampling of a 2km band as you can't naively replicate the 2km pixels to make 3km.
  2. The grids provided in Geo2Grid are mostly just examples, but yes they are full disk only. But when you customize your own grids you can do whatever you want. The hard part is knowing the extents of the data without analyzing the x/y coordinates in the NetCDF file yourself and doing the necessary math. That's why I said this suggestion/solution is "inconvenient".
  3. "ellps" is short for ellipsoid and GRS80 is a standard ellipsoid definition of the Earth. You don't need to worry about that or change that (see below). The ABI fixed grid is defined on this ellipsoid which is why it is used.
  4. Don't use the helper script for this if you can help it. For full disk data, copy the existing grids and do the necessary math on the number of pixels. So here's the 8km grid: https://github.com/ssec/polar2grid/blob/01eaab8b61e2d0d67e195907c9587643a7ed12e6/polar2grid/grids/grids.yaml#L452-L466. If you copied and pasted that and wanted to make a 16km grid, you would change the name to goes_east_16km and divide the number of pixels under shape to be half the 8km grids size (678x678). And of course if you wanted some other resolution then starting from the 2km grid size and adjusting for the size you want would make the math easier for 4km, 6km, 8km, 10km, 12km, etc. If you never need to resample a 2km band then you could use the 1km size as a starting point and get 3km, 5km, 7km, 9km grids too.
  5. No. The projection is the Geostationary projection, a defined set of operations to project lon/lat degrees to a flat surface. In this case it is referencing a name in the PROJ library so it must be "geos". The "goes" in the grid name is referencing the GOES-R satellite series and the defined projected region for each satellite position (EAST, WEST, TEST, STORE). Granted I'm not up to date on the current plans for GOES-19 and where it will be positioned, but I can only predict the future so much.
djhoese commented 4 months ago

I just realized while re-reading this that in my previous comments I said --method native -g my_abi_8km_grid --grid-configs my_grids.conf, but that config file should be the new YAML based format so something like my_grids.yaml. My point is don't try to copy the YAML grid definitions I've referenced into an old .conf file.

jpnIII commented 4 months ago

Thank you, Dave – Working on this now…

Jim

From: David Hoese @.> Sent: Wednesday, June 26, 2024 2:28 PM To: ssec/polar2grid @.> Cc: James P. Nelson @.>; Mention @.> Subject: Re: [ssec/polar2grid] Add option for downsampling to geo2grid (#187)

I just realized while re-reading this that in my previous comments I said --method native -g my_abi_8km_grid --grid-configs my_grids.conf, but that config file should be the new YAML based format so something like my_grids.yaml. My point is don't try to copy the YAML grid definitions I've referenced into an old .conf file.

— Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https:/github.com/ssec/polar2grid/issues/187*issuecomment-2192477369__;Iw!!Mak6IKo!PbYTn1ITwM2yEfzIHePjQIM8Pu-8XixcrgasalkwXt6bOPe9iF8snokkafOqLDKf3wElStE6n_3s0dYjwfAN6FYyzizJdA$, or unsubscribehttps://urldefense.com/v3/__https:/github.com/notifications/unsubscribe-auth/AARELF54AEZDA7JTVLFD3X3ZJMI5ZAVCNFSM6AAAAABJHTSKKGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCOJSGQ3TOMZWHE__;!!Mak6IKo!PbYTn1ITwM2yEfzIHePjQIM8Pu-8XixcrgasalkwXt6bOPe9iF8snokkafOqLDKf3wElStE6n_3s0dYjwfAN6FawLjrADA$. You are receiving this because you were mentioned.Message ID: @.**@.>>

jpnIII commented 4 months ago

Hi again, Dave -- Tim and I just had a discussion. We'd like to throw out the following idea and see what you think:

Currently, a Geo2Grid user can run the following command and get a nice .yaml or .conf file:

 p2g_grid_helper.sh eastern_us         -90        40          2000        -2000        1000        1000   > geo2grid_config.yaml

West- North+ m(int) m(int) int int

The p2g_grid_helper.sh command takes the center latitude into account, and based on that determines what kind of map projection to create for the user (Mercator, Lambert-Conformal, Polar-Stereographic). Well, what if the user could add another argument such as "native" after the last 1000 value? Then, the helper script would still generate a yaml file -- but now the yaml file would define a native (i.e., Fixed Grid) projection grid. It would be fine if p2g_grid_helper.sh restricted the user to define the resolution such that only integer types of pixel subsampling or pixel replication were performed, depending on what grid magnification values were specified. Adding this functionality would essentially mimic what McIDAS-X does now, and in our opinion would be a nice enhancement to Geo2Grid.

Opinion(s)???

Thank you, Dave!

djhoese commented 4 months ago

Jim, I'm glad you guys are brainstorming and hopefully we can come up with something. I have some concerns:

This puts the solution in the "wrong" part of Geo2Grid. Defining a new grid for every version of a reduced resolution product the user wants is a workaround that I'm hoping isn't the long term solution. I'd really like a command line option on the main geo2grid.sh. Especially because I'd like the user to control how the pixels are aggregated together (average, min, max, etc), but maybe that's something that native resampling needs to allow in the future. Native resampling currently only does a mean.

Where and when does the "native" option get resolved? You said it would be in the YAML which I assume means something like "native" would show up for the projection in the YAML. Code-wise this is getting a little too deep outside of Geo2Grid's typical control. Those YAML files are a "standard" scheme by the pyresample library which is used by satpy. Pyresample is also maintained by pytroll meaning I have permissions to change it, but still. A big part of these YAML files is that they define fully defined areas. There is the concept of "dynamic" areas where one of resolution, size, or extents of the area are not known until later, but making this work for the projection is a little harder. That said, I think this "native" magic wouldn't be needed if the user had the option to refer to preset projections (ex. "goes_east_abi") known to the helper script. Maybe this is what you were actually suggesting in the first place. Anyway, yes, the user would have to have one grid for east and one for west or whatever other satellite, but this fits in with the existing optional ability to specify a projection to the helper script. The user would then be skipping the "find the existing GOES grids in the builtin YAML, copy/paste, modify" workflow I described to you earlier.

On the other hand, going further into the magic "native" behavior, the helper script would only be using the overall size and resolution specified by the user. The projection is figured out later and the center point is irrelevant if the data's existing (native) projection and location information is used as a starting point. So, why not cut out the middle man/script and just specify --downsample 2000m 2000m for "make the pixels 2km x 2km" or --downsample 8 8 for "combine every 8x8 pixels into one output pixel"? That's basically what my comment from 4 years ago was suggestion: https://github.com/ssec/polar2grid/issues/187#issuecomment-611587809

djhoese commented 4 months ago

Thinking about this more, there are two almost conflicting features here. The first is wanting to make an area (a.k.a. grid) that is the same projection, same extents, but different resolution to the original input data. The second is the ability to downsample input data to a smaller resolution, also known as aggregating. The first is more about the area definition and could involve any pixel resolution you want and you would use various resampling techniques to resample the input data to that area. For any resolution not a factor of the native input data resolution this means more expensive resampling methods. So no "native" resampling, but nearest neighbor instead. The second in my mind when thinking about implementation, requires the integer factor property of the result (ex. 4x reduction of input resolution). I think this because I'd likely be using Satpy's "aggregate" functionality which uses Xarray's "coarsen" (see https://docs.xarray.dev/en/stable/generated/xarray.DataArray.coarsen.html for low-level API docs).

The aggregating through Satpy's aggregate function doesn't have to replace resampling, but can be thought of as an optional extra or alternative step to resampling. It could be done after resampling (or maybe before too). This would look like:

read data -> resample to any grid (not necessarily native) -> aggregate pixels to smaller resolution -> save to output image/file

This would be a little confusing from a user point of view if they requested a grid to be resampled to and then after aggregation the output is no longer that original grids resolution. The major benefit to aggregation is the ability to apply specific aggregating functions like mean, min, max, or any other function we want to code into Geo2Grid (or Satpy). The big question here is: is anyone actually going to use this aggregation functionality? If the answer is no, then maybe the simple interface mentioned above of --downsample <xres> <yres> with optional handling for meters versus degrees versus overall size is good enough as a shorthand for producing a coarser resolution of the native input data's grid.

</brainstorm>

djhoese commented 3 months ago

In an external email thread with Jim and Tim, I've gotten some additional details about what kind of usage they have from McX now. I'll summarize it here for my own future reference and for others who might be reading this. The end result of their workflow is a series of images of one product (ex. C02), one image per time step, at a very reduced resolution and cropped to a specific region of the input data. The way something like this is specified is a magnification level (negative gets you a downsampled image) and a center point and a size of the final image in pixels.

This could likely be implemented in Geo2Grid as a --crop operation followed by a downsample operation, but is not as easy to describe as a center point and size would be. The next most obvious implementation would be functionality in the grid helper script to say "GOES East ABI, this center point, this final size". I'm realizing now that this magnification parameter is an alternative to specifying the extents/crop boundary of the final result, not something new. So in this specific case implementing this functionality makes more sense in the grid helper script. In the case I had in mind, you'd always be preserving the original extents of the data, but just wanted it at a lower resolution to save space and processing time. More stuff for me to think about.

jpnIII commented 3 months ago

Sounds reasonable, Dave – Thank you again for your extensive thoughts on this issue!

Sincerely,

Jim From: David Hoese @.> Sent: Wednesday, July 10, 2024 11:23 AM To: ssec/polar2grid @.> Cc: James P. Nelson @.>; Mention @.> Subject: Re: [ssec/polar2grid] Add option for downsampling to geo2grid (#187)

In an external email thread with Jim and Tim, I've gotten some additional details about what kind of usage they have from McX now. I'll summarize it here for my own future reference and for others who might be reading this. The end result of their workflow is a series of images of one product (ex. C02), one image per time step, at a very reduced resolution and cropped to a specific region of the input data. The way something like this is specified is a magnification level (negative gets you a downsampled image) and a center point and a size of the final image in pixels.

This could likely be implemented in Geo2Grid as a --crop operation followed by a downsample operation, but is not as easy to describe as a center point and size would be. The next most obvious implementation would be functionality in the grid helper script to say "GOES East ABI, this center point, this final size". I'm realizing now that this magnification parameter is an alternative to specifying the extents/crop boundary of the final result, not something new. So in this specific case implementing this functionality makes more sense in the grid helper script. In the case I had in mind, you'd always be preserving the original extents of the data, but just wanted it at a lower resolution to save space and processing time. More stuff for me to think about.

— Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https:/github.com/ssec/polar2grid/issues/187*issuecomment-2220959086__;Iw!!Mak6IKo!KcaihM4NCXuIiYPgIZnXZmjZpDyFbLwvwHTRVzhH3oI8K__J86alKKFVHnVh3rf-ntaRQf50mY2ZGN44pHbDzxkWpoLxNg$, or unsubscribehttps://urldefense.com/v3/__https:/github.com/notifications/unsubscribe-auth/AARELF5Q2IWAMVESUWQJ7OLZLVNVTAVCNFSM6AAAAABJHTSKKGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMRQHE2TSMBYGY__;!!Mak6IKo!KcaihM4NCXuIiYPgIZnXZmjZpDyFbLwvwHTRVzhH3oI8K__J86alKKFVHnVh3rf-ntaRQf50mY2ZGN44pHbDzxndSaJLCQ$. You are receiving this because you were mentioned.Message ID: @.**@.>>

djhoese commented 3 months ago

I was talking with Kathy about this at our last meeting and I think I'd like to do at least these things:

  1. Document ABI projections as PROJ.4 strings for GOES-EAST, WEST, STORE, and TEST (and whereever G19 is going).
  2. Add a --downsample option to the command line to support easily downsampling the data. I'm not exactly sure if this will work how I intend, but I'd like to try to implement it. The main complication is that something easy for the user to specify is not something easy for the program to understand or doesn't apply to all products because of differing resolutions. For example, specifying pixel resolution for the downsampled result might be a nice round "8000m", but that isn't what ABI's pixel resolution actual is if you magnify 4x times (or whatever number of times). The 1km ABI data is actually closer to 1015.9 meters per pixel.
  3. Create an example of a custom grid in YAML (no helper script) on the ABI geostationary projection at a lower resolution and size and resample multiple time steps to it and optionally produce an animation.
jpnIII commented 3 months ago

Thank you, Dave and Kathy – Sounds good!

My condolences on the difficulties of implementing this (e.g., 1015.9 m!). We’ve only had to deal with straight integer math in the McIDAS-X world. I guess the underlying M-X software has saved us from needing to deal with anything more complex than that.

Take care.

Sincerely,

Jim From: David Hoese @.> Sent: Thursday, July 18, 2024 3:30 PM To: ssec/polar2grid @.> Cc: James P. Nelson @.>; Mention @.> Subject: Re: [ssec/polar2grid] Add option for downsampling to geo2grid (#187)

I was talking with Kathy about this at our last meeting and I think I'd like to do at least these things:

  1. Document ABI projections as PROJ.4 strings for GOES-EAST, WEST, STORE, and TEST (and whereever G19 is going).
  2. Add a --downsample option to the command line to support easily downsampling the data. I'm not exactly sure if this will work how I intend, but I'd like to try to implement it. The main complication is that something easy for the user to specify is not something easy for the program to understand or doesn't apply to all products because of differing resolutions. For example, specifying pixel resolution for the downsampled result might be a nice round "8000m", but that isn't what ABI's pixel resolution actual is if you magnify 4x times (or whatever number of times). The 1km ABI data is actually closer to 1015.9 meters per pixel.
  3. Create an example of a custom grid in YAML (no helper script) on the ABI geostationary projection at a lower resolution and size and resample multiple time steps to it and optionally produce an animation.

— Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https:/github.com/ssec/polar2grid/issues/187*issuecomment-2237507018__;Iw!!Mak6IKo!IO0prwLwZGGw_GD_sL5caoEqirj8gZa2OOME7eQLE9rlDUHl5vTRIbixdBGlgb-odU-13K_DopjYQ4f8OfTajg2c6zNUew$, or unsubscribehttps://urldefense.com/v3/__https:/github.com/notifications/unsubscribe-auth/AARELF5CGRVEO3C6C3EF3WLZNAQTRAVCNFSM6AAAAABJHTSKKGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDEMZXGUYDOMBRHA__;!!Mak6IKo!IO0prwLwZGGw_GD_sL5caoEqirj8gZa2OOME7eQLE9rlDUHl5vTRIbixdBGlgb-odU-13K_DopjYQ4f8OfTajg2vHzwznQ$. You are receiving this because you were mentioned.Message ID: @.**@.>>