pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0

Geostationary padding results in wrong area definition for AHI mesoscale sectors. #1030

Closed simonrp84 closed 4 years ago

simonrp84 commented 4 years ago

Describe the bug

The geostationary padding code that was added in a recent satpy release works well for missing segments of data, but has an undesirable effect with Himawari/AHI rapid scan sectors. I tested this with the Meso sectors (which have filenames containing _R30x), but I suspect the problem also occurs with the Japan regional sectors (filenames containing _JP0x). The Meso sector is a 1000x1000 km scan sector that is produced every 2.5 minutes (in addition to the 10-minute full disk scan). Padding this sector to match the full disk is undesirable, as no segments are missing. Can we add some code that intelligently checks the scan region and pads or doesn't pad as appropriate?

To Reproduce

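from glob import glob
from satpy import Scene

flist = glob('/my/data/dir/*R303*.DAT')
outf = '/my/output/dir/AHI-MESO.jpg'
cache_dir = '/my/cache/dir'  # placeholder; not defined in the original snippet
scn = Scene(filenames=flist, reader='ahi_hsd')
scn.load(['true_color'])
# Resample everything onto the B01 area so the composite bands share one grid
scn2 = scn.resample(scn['B01'].area,
                    radius_of_influence=15000,
                    cache_dir=cache_dir)
scn2.save_dataset('true_color', outf, fill_value=0)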

Expected behavior

An image whose area definition is a 1000x1000 pixel box, such as:

Area ID: R303
Description: AHI R303 area
Projection ID: geosh8
Projection: {'a': '6378137', 'b': '6356752.3', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 1000
Number of rows: 1000
Area extent: (-2490000.0161, -2150000.0139, -1490000.0096, -1150000.0074)

Actual results

An area definition that is a 1000x10000 pixel box, such as:

20200105_1900 R303
Area ID: R303
Description: AHI R303 area
Projection ID: geosh8
Projection: {'a': '6378137', 'b': '6356752.3', 'h': '35785863', 'lon_0': '140.7', 'no_defs': 'None', 'proj': 'geos', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'}
Number of columns: 1000
Number of rows: 10000
Area extent: (-2490000.0161, -11150000.0721, -1490000.0096, -1150000.0074)
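The two area definitions above differ only in the row count and the lower y bound, which is consistent with the single Meso segment being treated as the first of ten. A rough check of that arithmetic follows. The helper name `pad_extent_downward` and the factor of 10 are assumptions inferred from the tenfold row count; this is not satpy's padding code.

```python
def pad_extent_downward(extent, rows, expected_segments):
    """Grow an area extent downward as if `expected_segments` equal-height
    segments were stacked below the top edge (hypothetical helper)."""
    x_min, y_min, x_max, y_max = extent
    pixel_height = (y_max - y_min) / rows      # metres per row
    padded_rows = rows * expected_segments
    padded_y_min = y_max - pixel_height * padded_rows
    return padded_rows, (x_min, padded_y_min, x_max, y_max)


# Extent and row count copied from the expected area definition above.
expected_extent = (-2490000.0161, -2150000.0139, -1490000.0096, -1150000.0074)
padded_rows, padded_extent = pad_extent_downward(expected_extent, 1000, 10)
print(padded_rows, padded_extent)
```

The computed lower bound lands within a millimetre or so of the reported -11150000.0721; the small difference comes from the four-decimal rounding of the printed extents.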

Adding pad_data=False to the scn.load call fixes the issue. Ideally the reader should detect the scan type and pad as appropriate: padding enabled for full disk and disabled for the Meso sectors (unless explicitly enabled by the user).
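Until the reader does this itself, a user-side stopgap could pick the pad_data value from the filename. The sketch below assumes only what is stated above, namely that rapid scan files contain _R30x or _JP0x. The helper `should_pad` and the example filenames are illustrative and not part of satpy.

```python
import os
import re

# Sector codes that mark rapid scan files, per the issue description.
_RAPID_SCAN = re.compile(r"_(R30\d|JP0\d)_")


def should_pad(filename):
    """Return False for Meso/Japan rapid scan files, True otherwise
    (hypothetical helper, based only on the filename)."""
    return _RAPID_SCAN.search(os.path.basename(filename)) is None


# Illustrative filenames only.
print(should_pad("HS_H08_20200105_1900_B01_R303_R10_S0101.DAT"))
print(should_pad("HS_H08_20200105_1900_B01_FLDK_R10_S0110.DAT"))

# Possible use with the reproduction script above:
#   scn.load(['true_color'], pad_data=should_pad(flist[0]))
```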

tsukada-cs commented 4 years ago

I ran into the same problem with R30x region data. Setting pad_data=False worked for me, but I would welcome a smarter implementation.

djhoese commented 4 years ago

I'm working on a fix for this and I had an idea I wanted to run by @pnuu and @simonrp84:

What if expected_segments in the YAML file (for each file type) could be either an int (the number of segments we expect) or a string (the name of the filename parameter that specifies the number of segments)? In the AHI HSD reader case this would be "total_segments". I've got this fixed locally, but I'm not sure it is the least ugly solution.
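A minimal sketch of how that int-or-string lookup might work, assuming the file type configuration and the parsed filename fields are both available as plain dicts. The function name `resolve_expected_segments` and the dict layout are hypothetical, and this is not the fix mentioned above.

```python
def resolve_expected_segments(filetype_info, filename_info, default=1):
    """Return the expected segment count for one file (hypothetical helper).

    `expected_segments` may be an int, used as-is, or a string naming a
    field parsed from the filename, such as "total_segments".
    """
    expected = filetype_info.get("expected_segments", default)
    if isinstance(expected, str):
        # Look the count up in the fields parsed from the filename.
        expected = filename_info[expected]
    return int(expected)


# A fixed count, and a count taken from the filename (values are examples).
print(resolve_expected_segments({"expected_segments": 10}, {}))
print(resolve_expected_segments({"expected_segments": "total_segments"},
                                {"segment": 1, "total_segments": 1}))
```

With the string form, a Meso file that declares one total segment would need no padding, while a full disk file that declares ten would still be padded when segments are missing.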

djhoese commented 4 years ago

Another idea is to allow the file type to specify which sectors to ignore for segmenting, but this seems almost uglier.
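For comparison, the sector-list alternative might look roughly like this. The configuration key `unsegmented_sectors` and the filename field `sector` are invented for illustration and do not exist in satpy.

```python
def segments_for_file(filetype_info, filename_info):
    """Return 1 for sectors listed as unsegmented, otherwise the configured
    expected segment count (hypothetical helper and keys)."""
    ignored = filetype_info.get("unsegmented_sectors", ())
    if filename_info.get("sector") in ignored:
        return 1
    return filetype_info.get("expected_segments", 1)


# Example configuration; the sector codes follow the filenames in this issue.
config = {"expected_segments": 10, "unsegmented_sectors": ("R301", "R302", "R303")}
print(segments_for_file(config, {"sector": "R303"}))
print(segments_for_file(config, {"sector": "FLDK"}))
```

The drawback is visible in the example configuration: every sector code has to be listed by hand, whereas the filename-parameter approach needs no such list.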