HydrologicEngineeringCenter / Vortex

data processing utilities
MIT License

Vortex is importing some precip grids but not others #104

Open cen4abc opened 8 months ago

cen4abc commented 8 months ago

Hello,

I have been successfully using vortex (version 0.11.10-rc.1) to import ECMWF ERA5 reanalysis precip data (in both netcdf and grib format) and applying it in HECRAS rain-on-grid models with some nice results (big thanks to the vortex developers). I have also successfully used MRMS QPE data. However, vortex does not seem to want to import some other precip datasets that I would like to use in HECRAS, namely:

Vortex is either creating an empty output file or gets stuck during the import process. I am quite new to gridded climate datasets and cannot see anything that might be causing vortex an issue. Do you have any ideas as to why vortex is not handling these two datasets, and how you might process them to make them compatible? Below are screenshots of the parameters that are visible when first loading the datasets into vortex.

[Screenshot: ECMWF HRES variables in vortex]

[Screenshot: NCAS variables in vortex]

Many thanks, Andy

ps. for info, this is the ERA5 reanalysis data that vortex will work with:

Dimensions:    (longitude: 5, latitude: 5, time: 35064)
Coordinates:
  * longitude  (longitude) float32 38.5 38.75 39.0 39.25 39.5
  * latitude   (latitude) float32 9.5 9.25 9.0 8.75 8.5
  * time       (time) datetime64[ns] 2017-01-01 ... 2020-12-31T23:00:00
Data variables:
    e          (time, latitude, longitude) float32 ...
    tp         (time, latitude, longitude) float32 ...
Attributes:
    Conventions:  CF-1.6
    history:      2022-01-21 20:00:46 GMT by grib_to_netcdf-2.23.0: /opt/ecmw...

tombrauer commented 7 months ago

@cen4abc if you post sample files I can take a look. Despite communal attempts at conventionality (the Climate and Forecast metadata conventions), no two data distributors ever format their files in exactly the same way, which means every file from a new source is an adventure!

cen4abc commented 7 months ago

@tombrauer that would be great, thank you. I have placed some sample files here: https://leeds365-my.sharepoint.com/:f:/g/personal/cen4abc_leeds_ac_uk/EhvnQ56Y3vdLtKOTR64rbwQB3ujM44kP799qrQyCnXsFGg?e=1sch7V

It might be better to start with the ECMWF files as these are more widely used / available than the NCAS data.

Thanks again!

tombrauer commented 6 months ago

I investigated the ECMWF file and it is hitting UnsupportedOperationException: Unsupported DRS type = 42. This looks like a known issue in the netcdf-java library that vortex depends on. See discussion here: https://github.com/Unidata/netcdf-java/issues/753. So, this won't be fixed in vortex until the netcdf group implements a solution.
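For readers who want to check whether their own GRIB2 files use the same data representation template, here is a minimal sketch that scans a file's bytes and reports the template number stored in each Section 5. It is not part of Vortex or netcdf-java; the function name `drs_template_numbers` and the small name table are illustrative, and the byte offsets follow the GRIB edition 2 layout (template number in octets 10-11 of Section 5). In the WMO tables, template 42 corresponds to CCSDS lossless compression.

```python
import struct

# A few GRIB2 data representation templates (illustrative subset, not exhaustive).
DRS_TEMPLATES = {
    0: "simple packing",
    2: "complex packing",
    3: "complex packing with spatial differencing",
    40: "JPEG 2000",
    41: "PNG",
    42: "CCSDS lossless compression",
}


def drs_template_numbers(data: bytes) -> list[int]:
    """Return the Section 5 template number of every data field found in GRIB2 bytes."""
    found = []
    pos = data.find(b"GRIB")
    while pos != -1:
        if data[pos + 7] != 2:  # octet 8 of Section 0 holds the GRIB edition
            raise ValueError("only GRIB edition 2 is handled by this sketch")
        total_len = struct.unpack(">Q", data[pos + 8:pos + 16])[0]
        if total_len < 16:
            raise ValueError("corrupt message length")
        end = pos + total_len
        cur = pos + 16  # first section after the 16-byte indicator section
        while cur + 5 <= end and data[cur:cur + 4] != b"7777":
            sec_len, sec_num = struct.unpack(">IB", data[cur:cur + 5])
            if sec_len < 5:
                raise ValueError("corrupt section length")
            if sec_num == 5:
                # Octets 10-11 of Section 5: data representation template number.
                found.append(struct.unpack(">H", data[cur + 9:cur + 11])[0])
            cur += sec_len
        pos = data.find(b"GRIB", end)
    return found
```

Calling `drs_template_numbers(open(path, "rb").read())` on a file and looking the results up in `DRS_TEMPLATES` should show whether template 42 is present.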

cen4abc commented 6 months ago

Thank you for getting to the bottom of this. If you could look at the NCAS dataset at some point as well, that would be much appreciated.

tombrauer commented 6 months ago

@cen4abc I'm looking at the NCAS dataset and it does not appear to follow any convention. Vortex is expecting CF. Passing this dataset to Vortex is kind of like passing a Canadian bill to an ATM programmed for U.S. legal tender. Does the originator of the dataset have an explanation for what the following variables mean? [image]

cen4abc commented 6 months ago

@tombrauer let me talk to NCAS and get back to you. I can guess but not with certainty.

cen4abc commented 6 months ago

Hello again,

All variables are defined as follows (crr intensity is the one we are interested in):

crr - Convective Rainfall Rate Class. Values from 0 to 11 relating to rain rate bands: "00_02_mm_h 02_1_mm_h 1_2_mm_h 2_3_mm_h 3_5_mm_h 5_7_mm_h 7_10_mm_h 10_15_mm_h 15_20_mm_h 20_30_mm_h 30_50_mm_h 50____mm_h"

crr_accum - Convective Hourly Rainfall Accumulation. Units = "mm", valid range 0-50 mm.

crr_conditions - Common geophysical and processing conditions. 22 categories: "space night day twilight sunglint land sea coast not_used not_used all_satellite_channels_available useful_satellite_channels_missing mandatory_satellite_channels_missing all_NWP_fields_available useful_NWP_fields_missing mandatory_NWP_fields_missing all_product_data_available useful_product_data_missing mandatory_product_data_missing all_auxiliary_data_available useful_auxiliary_data_missing mandatory_auxiliary_data_missing"

crr_intensity - Convective precipitation rate. Units = "mm/h", valid range 0-50 mm/h.

crr_intensity_pal - RGB palette for crr_intensity. The standard colour palette for plotting the data.

crr_pal - RGB palette for crr. The standard colour palette for plotting the data.

crr_quality - Common Quality Indicators. 7 possible values relating to the following categories: "nodata internal_consistency temporal_consistency good questionable bad interpolated"

crr_status_flag - Information on specific NWC GEO CRR processing. 14 possible values relating to the following categories: "humidity_correction_applied evolution_correction_applied gradient_correction_applied parallax_correction_applied orographic_correction_applied solar_data_used lightning_data_used data_filtered data_hole_filled all_bands_for_accumulation_available one_band_for_accumulation_missing several_no_consecutive_bands_for_accumulation_missing several_consecutive_bands_for_accumulation_missing accumulation_quality_flag"
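
As a rough illustration of how the definitions above could be used when inspecting the data, the sketch below maps a crr class value to its rain rate band label and checks crr_intensity values against the stated valid range. It assumes that class values index the band list in the order given (0 for the first band, 11 for the last), which the definitions suggest but do not state outright. The function names are hypothetical.

```python
# Band labels copied verbatim from the crr definition above.
CRR_CLASS_LABELS = (
    "00_02_mm_h 02_1_mm_h 1_2_mm_h 2_3_mm_h 3_5_mm_h 5_7_mm_h "
    "7_10_mm_h 10_15_mm_h 15_20_mm_h 20_30_mm_h 30_50_mm_h 50____mm_h"
).split()

# Valid range stated for crr_intensity, in mm/h.
CRR_INTENSITY_RANGE = (0.0, 50.0)


def crr_class_label(value: int) -> str:
    """Return the band label for a crr class value (assumed zero-based, in list order)."""
    if not 0 <= value < len(CRR_CLASS_LABELS):
        raise ValueError(f"crr class {value} is outside 0-{len(CRR_CLASS_LABELS) - 1}")
    return CRR_CLASS_LABELS[value]


def intensity_is_valid(rate_mm_per_h: float) -> bool:
    """Check a crr_intensity value against the stated valid range."""
    low, high = CRR_INTENSITY_RANGE
    return low <= rate_mm_per_h <= high
```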

tombrauer commented 5 months ago

@cen4abc I just published a build that should read in the NCAS dataset: https://github.com/HydrologicEngineeringCenter/Vortex/releases/download/v0.11.11-rc.2/vortex-0.11.11-rc.2-win-x64.zip

The originators of this dataset didn't do anyone any favors by following no particular convention. The dataset uses an irregular grid (kind of like a spider web) that must be re-indexed into a Cartesian grid. It takes some time to re-index the 1.8 million grid cells, but the good news is that if you are importing multiple grids at a time, the re-indexing only has to be done once for all of them.
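
To illustrate why the re-indexing cost is paid only once, here is a minimal sketch, not Vortex's actual implementation. It assumes a simple nearest-neighbour lookup: for each cell centre of the target Cartesian grid, find the closest source cell, store that index, and then reuse the stored index for every grid that shares the same geometry. The function names are hypothetical, and the brute-force search is only suitable for tiny examples; a real implementation for 1.8 million cells would need a spatial index.

```python
import math


def build_nearest_index(src_points, target_points):
    """For each target cell centre, return the position of the closest source point."""
    index = []
    for tx, ty in target_points:
        best = min(
            range(len(src_points)),
            key=lambda i: math.hypot(src_points[i][0] - tx, src_points[i][1] - ty),
        )
        index.append(best)
    return index


def apply_index(index, values):
    """Re-order one grid's values onto the target grid using a precomputed index."""
    return [values[i] for i in index]


# The expensive step runs once...
src = [(1.1, 1.0), (0.0, 0.9), (1.2, 0.1), (0.1, 0.0)]      # irregular source cells
target = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # regular target cells
index = build_nearest_index(src, target)

# ...and the cheap step is repeated for every grid with the same geometry.
grid_a = apply_index(index, [40, 30, 20, 10])
grid_b = apply_index(index, [4, 3, 2, 1])
```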

The dataset does not include a time dimension so there is no conventional way to attribute time. I found a global attribute "nominal_product_time" that gives some notion of time. Inferring time from attributes isn't conventional but I added handling logic to read the attribute if it exists and apply it as both start and end time for the grid.
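
The fallback described above can be sketched roughly as follows. This is not the Vortex code: the function name is hypothetical, the global attributes are represented as a plain dictionary, and the timestamp format (ISO 8601 with a trailing "Z") is an assumption, since the thread does not show the attribute's actual value.

```python
from datetime import datetime, timezone

# Assumed timestamp format; the real attribute value is not shown in this thread.
ASSUMED_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def grid_times(global_attrs, attr_name="nominal_product_time"):
    """Return (start, end) taken from a single time attribute, or None if it is absent."""
    raw = global_attrs.get(attr_name)
    if raw is None:
        return None
    instant = datetime.strptime(raw, ASSUMED_FORMAT).replace(tzinfo=timezone.utc)
    # A single instant is used as both the start and the end of the grid.
    return instant, instant
```

A grid with identical start and end times describes an instant, not an accumulation period, which is why a later shift of one of the two times is needed for period-cumulative precipitation.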

The variable names "crr_accum" and "NWC GEO CRR Convective Hourly Rainfall Accumulation" are non-standard but I added logic to recognize them as precipitation. Standard name table: https://cfconventions.org/Data/cf-standard-names/current/build/cf-standard-name-table.html

Because there is only a single time attribute to each grid record, you'll have to use the time-shifter to shift either the start date or the end date of the data to create period cumulative precipitation data. Based on the 3 grids provided I'm guessing you'll want to shift the start time by -15 minutes.
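
The arithmetic behind that shift is simple. The sketch below is not the Vortex time-shifter itself, only an illustration with a hypothetical function name: moving the start time back by 15 minutes while leaving the end time alone turns an instantaneous record into a 15-minute accumulation period.

```python
from datetime import datetime, timedelta


def shift_start(start, end, shift=timedelta(minutes=-15)):
    """Shift only the start time, returning the resulting (start, end) period."""
    new_start = start + shift
    if new_start > end:
        raise ValueError("shifted start time falls after the end time")
    return new_start, end


# Example with a made-up timestamp: start and end initially coincide.
t = datetime(2021, 6, 1, 12, 0)
period = shift_start(t, t)
```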

Feel free to provide this feedback to the originators of the data. I will quickly run out of time if I add handling logic like this for every non-conventional dataset that's out there. The burden should really be on the originator to clean up their data and make it more transferable to others.

cen4abc commented 5 months ago

@tombrauer - very appreciative of this, thanks. I will raise this with the dataset provider!

cen4abc commented 5 months ago

@tombrauer I have just sent you a PM about this.