pytroll / tutorial-satpy-half-day

Jupyter Notebook-based tutorial providing an overview of Satpy's features lasting about 4 hours
Other
55 stars 20 forks source link

Dataset has no geotransform set - viirs #5

Closed acgeospatial closed 3 years ago

acgeospatial commented 3 years ago

Hello,

Thanks so much for the awesome tutorial and video on youtube. I've been trying to replicate 03_writing.ipynb with some VIIRS data. The direct links are below for reference.

ftp://ftp-npp.bou.class.noaa.gov/20200923/VIIRS-SDR/VIIRS-Moderate-Bands-SDR-Terrain-Corrected-Geo/NPP/VIIRS-SDR_VIIRS-Moderate-Bands-SDR-Terrain-Corrected-Geo_20200923_60305.tar

ftp://ftp-npp.bou.class.noaa.gov/20200923/VIIRS-SDR/VIIRS-Moderate-Resolution-Band-10-SDR/NPP/VIIRS-SDR_VIIRS-Moderate-Resolution-Band-10-SDR_20200923_60315.tar

extracting to: GMODO_npp_d20200923_t1222430_e1228234_b46154_c20200923162825111020_noac_ops.h5 SVM10_npp_d20200923_t1222430_e1228234_b46154_c20200923162824980857_noac_ops.h5

I've built a venv using conda create -y -n satpy_tutorial python=3.7 satpy rasterio imageio imageio-ffmpeg cartopy geoviews notebook requests tqdm

as per https://github.com/pytroll/tutorial-satpy-half-day/issues/1 and its all good.

I am trying to write out a geotiff from this data.

from satpy import Scene from glob import glob

sdrfiles = glob(...*.h5') scn = Scene(filenames=sdrfiles, reader='viirs_sdr') scn.load(['M10']) scn.save_datasets()

and this will write me out a tiff file but I get this message:

...\site-packages\rasterio__init__.py:229: NotGeoreferencedWarning: Dataset has no geotransform set. The identity matrix may be returned. **kwargs)

I was wondering if you have you got any experience with the VIIRS data with no geotransform set? Have I missed a file? I've tried GMTCO as well as GMODO. Apologies if this isn't the correct place to ask this question

Thanks so much Andrew

djhoese commented 3 years ago

Hi! The problem isn't your data (or at least not entirely), but rather a limitation of the geotiff format. VIIRS data is made up of a swath of pixels. For geolocation we have a "SwathDefinition" object made up of lon/lat coordinates for each pixel. The Geotiff format really only understands gridded data (in Satpy-land this would be represented by an AreaDefinition). The warning you are seeing is happening because Satpy says "I know this data has geolocation information but it can't be represented as a GeoTIFF". Does that make sense?

If you want a georeferenced geotiff with VIIRS data you will need to resample it (see the resampling lesson in this tutorial).

acgeospatial commented 3 years ago

Hi thanks for such a fast response. I think I see what you mean. I've had a look at the resampling but not quite getting there if I am honest. The line for me scn['M10'].attrs['area']

doesn't produce the same output structure as the tutorial (scn['C05'].attrs['area']), ie not getting projection infomation. Instead I get:

Shape: (3072, 3200) Lons: <xarray.DataArray 'array-3c0a9e6f242872d38b901856d62881a2' (y: 3072, x: 3200)> dask.array<where, shape=(3072, 3200), dtype=float32, chunksize=(768, 3200), chunktype=numpy.ndarray> Dimensions without coordinates: y, x Attributes: start_orbit: 46154 end_orbit: 46154 name: m_longitude resolution: 742 file_type: generic_file dataset_groups: ['GMTCO'] file_key: All_Data/{dataset_group}_All/Longitude file_units: degrees_east standard_name: longitude coordinates: ('m_longitude', 'm_latitude') modifiers: () dataset_group: GMTCO units: degrees_east platform_name: Suomi-NPP sensor: viirs rows_per_scan: 16 start_time: 2020-09-23 12:22:43.070705 end_time: 2020-09-23 12:28:23.461279 _satpy_id: DataID(name='m_longitude', resolution=742, modifier... long_name: longitude ancillary_variables: []

and similar for lats

so when I call new_scn = scn.resample(resampler='native')

and then try to write

I get the transformation error again.

I suspect I am doing something obviously wrong, but can't see it.

djhoese commented 3 years ago

Depends what you mean by "output structure". Changes in Satpy or xarray could produce slightly different results and if you are using data from a different source then you might also see different metadata or different amounts of data (shapes of the array).

Double check the resample command you are using. For VIIRS data you don't normally use the native resampler as that will keep the data as a swath (the "native" projection). In the case of ABI data which is already gridded, the native resampler is a good idea and can save on time and processing. I believe the second half of the resampling lesson discusses resampling to a specific AreaDefinition.

acgeospatial commented 3 years ago

sorry I mean I was expecting to see something similar ("output structure") Area ID: GOES-East Description: 2km at nadir Projection ID: abi_fixed_grid Projection: {'ellps': 'GRS80', 'h': '35786023', 'lon_0': '-75', 'no_defs': 'None', 'proj': 'geos', 'sweep': 'x', 'type': 'crs', 'units': 'm', 'x_0': '0', 'y_0': '0'} Number of columns: 2500 Number of rows: 1500 Area extent: (-3627271.2913, 1583173.6575, 1382771.9287, 4589199.5895)

etc

but I see <pyresample.geometry.SwathDefinition at 0x1ef27cdfb48> does that make sense.

I was hoping to not create a specific AreaDefinition, unless I was able to extract the values straight from the data, ie not hard code them in.

the binder.pangeo.io is really great to check, thanks for making that available

djhoese commented 3 years ago

The first output with "Area ID" is the AreaDefinition I was talking about. The second output you showed is the SwathDefinition I'm talking about. The data, by its nature, is not gridded and can't be represented (some people might say there are ways, I'm not one of those people) in a geotiff. The only way to go from SwathDefinition (non-uniform lon/lat coordinates) to an AreaDefinition is by resampling to an AreaDefinition. The "native" resampler is a special case where it does not actually do "real" resampling but instead keeps the data in the same "projection". In this case, that means it will still keep it as a SwathDefinition.

Hopefully that clears something up...maybe. There is the idea of a DynamicAreaDefinition where you provide some of the parameters of the AreaDefinition but the rest get filled in on-the-fly during resampling. This is documented in the pyresample documentation and may be more what you want for the "hoping to not create a specific AreaDefinition" part.

acgeospatial commented 3 years ago

ok great thank you. Do you reckon working with viirs data in satpy, perhaps calculating and creating data similar https://github.com/pytroll/tutorial-satpy-half-day/blob/master/notebooks/02_reading.ipynb is a reasonable approach.

djhoese commented 3 years ago

There is nothing wrong with working with VIIRS data in that way except for when working with the geolocation and what you should expect. The data can be worked with in the same way, but when it comes to plotting or saving to an image you'll need to make sure to use the lon/lat arrays (unless you resample) or just expect that the data won't be nice and straight like it would be on a map. For example,

lons, lats = scn['I04'].attrs['area'].get_lonlats()

will get you the lon/lat arrays if you need them for something like cartopy (note: this is probably a lot of data for cartopy to handle and may make it slow).