Closed m-mohr closed 2 years ago
I've implemented a solution for this issue (including smaller file size), but I'm not 100% sure whether this is a good solution.
I'm basically running this now instead of cogify
from stactools:
call([
"gdal_calc.py",
"-A", input_path,
"--outfile", output_path,
"--calc", "maximum(A, -1)",
"--NoDataValue", "-1",
"--format", "GTIFF", # COG
"--co", "TILED=YES",
"--co", "COPY_SRC_OVERVIEWS=YES",
"--co", "COMPRESS=LZW"
# "--co", f"TARGET_SRS={reproject_to}"
])
Valid values are >= 0 and -1 and -3 are no-data values in the GRIB files. So what I do is simply limit the value range to >= -1 and set the no-data value to -1 with gdal_calc and them generate the COG using gdal_calc, too. I can't use the COG driver for an unknown reason (see below). Is it okay to handle no-data this way?
The other thing is, right now the process is:
I'm wondering whether accessing the files twice is an issue? And/Or Shall I use the cogify function in stactools?
In theory I should also be able do 2+3 in one step by using the COG format and the TARGET_SRS option in gdal_calc, but then I'm getting
ERROR 6: GDALDriver::Create() ... no create method implemented for this format. 'NoneType' object has no attribute 'SetGeoTransform'
which is weird because COG should be able to create files.
Is it okay to handle no-data this way?
This is a fine approach, but after discussion I think we'd like to split the nodata values into their own "QA/mask" COG as well. So your current approach is fine (though, for float data, can/should we use nan
instead of -1
for nodata?), but we should also create a second COG alongside the first, with the -1
and -3
values where present, and nodata (0
or nan
probably) elsewhere.
In theory I should also be able do 2+3 in one step by using the COG format and the TARGET_SRS option in gdal_calc, but then I'm getting
ERROR 6: GDALDriver::Create() ... no create method implemented for this format. 'NoneType' object has no attribute 'SetGeoTransform'
which is weird because COG should be able to create files.
Agreed, this is strange, but we've seen the COG driver fall down in other areas before. I'll see if I can take a look.
@m-mohr I just looked into this a bit more, and I'm not seeing the same issues. Steps:
gdalinfo -stats MRMS_MultiSensor_QPE_01H_Pass1_00.00_20220607-150000.grib2
Output:
Driver: GRIB/GRIdded Binary (.grb, .grb2)
Files: MRMS_MultiSensor_QPE_01H_Pass1_00.00_20220607-150000.grib2
MRMS_MultiSensor_QPE_01H_Pass1_00.00_20220607-150000.grib2.aux.xml
Size is 7000, 3500
Coordinate System is:
GEOGCRS["Coordinate System imported from GRIB file",
DATUM["unnamed",
ELLIPSOID["Spheroid imported from GRIB file",6378160,298.253916296469,
LENGTHUNIT["metre",1,
ID["EPSG",9001]]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433,
ID["EPSG",9122]]],
CS[ellipsoidal,2],
AXIS["latitude",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433,
ID["EPSG",9122]]],
AXIS["longitude",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433,
ID["EPSG",9122]]]]
Data axis to CRS axis mapping: 2,1
Origin = (-129.999999999857124,54.999999999857103)
Pixel Size = (0.009999999714245,-0.009999999714204)
Corner Coordinates:
Upper Left (-130.0000000, 55.0000000) (130d 0' 0.00"W, 55d 0' 0.00"N)
Lower Left (-130.0000000, 20.0000010) (130d 0' 0.00"W, 20d 0' 0.00"N)
Upper Right ( -60.0000020, 55.0000000) ( 60d 0' 0.01"W, 55d 0' 0.00"N)
Lower Right ( -60.0000020, 20.0000010) ( 60d 0' 0.01"W, 20d 0' 0.00"N)
Center ( -95.0000010, 37.5000005) ( 95d 0' 0.00"W, 37d30' 0.00"N)
Band 1 Block=7000x1 Type=Float64, ColorInterp=Undefined
Description = 0[m] GPML="Specific altitude above mean sea level"
Min=0.000 Max=74.700
Minimum=0.000, Maximum=74.700, Mean=0.081, StdDev=0.816
Metadata:
GRIB_COMMENT=Multi-sensor accumulation 1-hour (1-hour latency) [mm]
GRIB_DISCIPLINE=209
GRIB_ELEMENT=MultiSensor_QPE_01H_Pass1
GRIB_FORECAST_SECONDS=0
GRIB_IDS=CENTER=161(US-OAR) SUBCENTER=0 MASTER_TABLE=255 LOCAL_TABLE=1 SIGNF_REF_TIME=3(Observation_time) REF_TIME=2022-06-07T15:00:00Z PROD_STATUS=2(Research) TYPE=7(Processed_radar_observations)
GRIB_PDS_PDTN=0
GRIB_PDS_TEMPLATE_ASSEMBLED_VALUES=6 30 8 0 97 0 0 0 0 102 0 0 255 1 0
GRIB_PDS_TEMPLATE_NUMBERS=6 30 8 0 97 0 0 0 0 0 0 0 0 102 0 0 0 0 0 255 1 0 0 0 0
GRIB_REF_TIME=1654614000
GRIB_SHORT_NAME=0-GPML
GRIB_UNIT=[mm]
GRIB_VALID_TIME=1654614000
STATISTICS_MAXIMUM=74.700004577637
STATISTICS_MEAN=0.081450768779566
STATISTICS_MINIMUM=0
STATISTICS_STDDEV=0.81550677276342
STATISTICS_VALID_PERCENT=100
There's no nodata
value, but there also isn't anything below 0
. I wonder if we're looking at the same data?
@gadomski Yes, for your example (CONUS) there's no nodata. GUAM has -3 (no coverage) nodata values, for example. The covered area around GUAM has a somewhat circle-ish shape and everything around it is set to -3.
Also, the files never have an actual no-data value reported in the metadata. You can only find these values online in the manual.
GUAM has -3 (no coverage) nodata values, for example. The covered area around GUAM has a somewhat circle-ish shape and everything around it is set to -3.
Hm, I'm not seeing any 2D QPE data for Guam at all, e.g. https://mrms.ncep.noaa.gov/data/2D/GUAM/MultiSensor_QPE_24H_Pass2/. From this chart (provenance unknown), I'm wondering if we can only expect Alaska, Hawaii, and CONUS:
Also, the files never have an actual no-data value reported in the metadata. You can only find these values online in the manual.
Based on communications with NOAA folks, these nodata values may be for the "radar only" products, not the Q3MS (which I think we're using). I'm not entirely sure, their naming schemes seem inconsistent.
@gadomski Weird, I have been downloading data from there some days ago. I was able to download example data for all 5 areas, there is an example for each area in the test data files. Here's the example for GUAM I downloaded recently: https://github.com/stactools-packages/noaa-mrms-qpe/blob/main/tests/GUAM/MRMS_MultiSensor_QPE_01H_Pass1_00.00_20220601-120000.grib2.gz
Also, it doesn't specify a no-data value limitation/constraint in the online docs so I couldn't know. Thus, tha package is written in a way that it expects these -1 and -3 nodata values generally in any product. If there are limitations we can surely adopt the package accordingly, but then this should be documented by them somewhere (publicly).
Yup, I can confirm the presence of -3
values in that file (not -1
but maybe there's just none of those). I think I need to do some more reading to figure out the processing streams for these files. Very strange that there wasn't anything in https://mrms.ncep.noaa.gov/data/2D/GUAM/MultiSensor_QPE_24H_Pass2/ ... if you do find data in that or a similar directory, can you let me know? Thanks.
@gadomski The CARIB files also have no-data values due to being single-sensor. There I also only find -3 right now, but I don't know for which cases -1 can occur.
I'd assume there's some kind of issue with GUAM right now and they will apprear again eventually, they have been there for two weeks (end of May / beginning of June) when I was working on this package.
CARIB
What's this?
Caribbean Islands, in the folder CARIB (instead of GUAM)
Thanks, missed that.
Ok, I've looked into this is a bit more, and unsurprisingly the story is a little mixed. I checked on the latest
data for all of the regions we're interested in for all the intervals and pass numbers, looks like only CARIB
and GUAM
are using negative values for NODATA
. Those values change when you get to the 24 hour product (-3
for the 1 hour, -999
for the 24 hour).
I think the next steps are:
Full table below, and I've got the notebook in a branch here.
Name | Interval | Pass number | Location | Min | Max | Unique negative values | Nodata values | |
---|---|---|---|---|---|---|---|---|
0 | MultiSensor_QPE_01H_Pass1 | 01H | 1 | CONUS | 0 | 95.6 | [] | (None,) |
1 | MultiSensor_QPE_01H_Pass1 | 01H | 1 | ALASKA | 0 | 7.8 | [] | (None,) |
2 | MultiSensor_QPE_01H_Pass1 | 01H | 1 | CARIB | 0 | 18.7 | ['-3.0'] | (None,) |
3 | MultiSensor_QPE_01H_Pass1 | 01H | 1 | GUAM | 0 | 4 | ['-3.0'] | (None,) |
4 | MultiSensor_QPE_01H_Pass1 | 01H | 1 | HAWAII | 0 | 5 | [] | (None,) |
5 | MultiSensor_QPE_01H_Pass2 | 01H | 2 | CONUS | 0 | 44.4 | [] | (None,) |
6 | MultiSensor_QPE_01H_Pass2 | 01H | 2 | ALASKA | 0 | 7.8 | [] | (None,) |
7 | MultiSensor_QPE_01H_Pass2 | 01H | 2 | CARIB | 0 | 18.7 | ['-3.0'] | (None,) |
8 | MultiSensor_QPE_01H_Pass2 | 01H | 2 | GUAM | 0 | 6.6 | ['-3.0'] | (None,) |
9 | MultiSensor_QPE_01H_Pass2 | 01H | 2 | HAWAII | 0 | 5.2 | [] | (None,) |
10 | MultiSensor_QPE_24H_Pass2 | 24H | 2 | CONUS | 0 | 120 | [] | (None,) |
11 | MultiSensor_QPE_24H_Pass2 | 24H | 2 | ALASKA | 0 | 125.7 | [] | (None,) |
12 | MultiSensor_QPE_24H_Pass2 | 24H | 2 | CARIB | 0 | 92.5 | ['-999.0'] | (None,) |
13 | MultiSensor_QPE_24H_Pass2 | 24H | 2 | GUAM | 0 | 22 | ['-999.0'] | (None,) |
14 | MultiSensor_QPE_24H_Pass2 | 24H | 2 | HAWAII | 0 | 40.6 | [] | (None,) |
Per the MRMS table, all data products in the summary table above should be -1 = Missing and -3 = No Coverage. The -999 values do not agree with the reference. Perhaps Tom Augspurger can get some information on this from a NOAA contact? He has mentioned this a few times for VIIRS data.
MS's preference has been to create new COGs with the original nodata values preserved when we encounter datasets with multiple nodata values, so I'd lean in that direction. If we can't get an answer on the -999 value, I would preserve it in the nodata COG and use a class description to note the uncertainty on the meaning.
It's likely a black hole, but perhaps a ping to the email contact on this page?
My 2 cents. 🙂
Interesting, I had not seen the -999 values before. I guess we clarify this through the e-mail conversation you cc'ed me in? @gadomski
Yes, NOAA is looking into the discrepancies from their side.
Decision for now:
Expose no-data values for GRIB through the classification extension. For COGs use the raster extension and normalize no-data values (i.e. all negative values) to -3 (or -1). Don't expose no-data values for the real multi-sensor data, only add no-data values for GUAM and CARIB to the metadata and document clearly that they are not really no-data values.
The last commits have added the following:
See the examples folder for details
Describe the bug When I activate the COG conversion, the no-data values from the GRIB files (-1, -3) are not reflected in the COGs anymore.
Missing value: -1 No coverage: -3 Valid values: >= 0
Expected behavior The main issues here are:
Possible solutions:
I experimented a bit with gdal_edit.py and gdal_calc.py, but did not find a good solution yet.
Example:
gdal_calc.py -A MRMS_MultiSensor_QPE_24H_Pass2_00.00_20220530-120000.grib2 --outfile=cog.tif --calc="A*(A>0)" --NoDataValue=0 --format=COG
This somewhat works, but also eliminates zeros...cc @gadomski