USDA-ARS-NWRC / weather_forecast_retrieval

Weather forecast retrieval gathers relevant gridded weather forecasts to ingest into physically based models for water supply forecasts

Make a conversion script for GRIB2 to netcdf #5

Closed micahjohnson150 closed 5 years ago

micahjohnson150 commented 5 years ago

We are moving things over to cloud computing platforms. To do this we need to put the grib files into object storage. Unfortunately, grib files are harder to serve from object storage, so conversion to netcdf is preferable. Even if the conversion were one-to-one in size, the amount of data would still be too large, so some data reduction is necessary.

@scotthavens feel free to add or alter info from this.

Things we should do to reduce data:

micahjohnson150 commented 5 years ago

Here are the results from differencing all the variables that were cast to different data types.

Name        Mean      STD       Max       Min

x           0.00000   0.00000   0.00000    0.00000
y           0.00000   0.00000   0.00000    0.00000
time        0.00000   0.00000   0.00000    0.00000
latitude    0.00000   0.00000   0.00000   -0.00000
longitude   0.00000   0.00001   0.00002   -0.00002
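
For context, a table like this can be produced by differencing each variable before and after the cast. The sketch below is not the script from the branch: it uses only the standard library, a made-up set of longitude values, and a hypothetical helper `diff_stats`, and it assumes the cast was from 64-bit to 32-bit floats.

```python
import statistics
import struct


def to_float32(value):
    """Round a Python float (64-bit) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]


def diff_stats(values):
    """Return mean, std, max and min of (original - float32 copy)."""
    diffs = [v - to_float32(v) for v in values]
    return {
        "mean": statistics.fmean(diffs),
        "std": statistics.pstdev(diffs),
        "max": max(diffs),
        "min": min(diffs),
    }


# Made-up longitudes in degrees; real values would come from the grib2 file.
longitudes = [-122.719528 + 0.0312345 * i for i in range(1000)]
stats = diff_stats(longitudes)
print("longitude " + " ".join(f"{stats[k]:.5f}" for k in ("mean", "std", "max", "min")))
```

A 32-bit float carries roughly seven significant decimal digits, so for values of around a hundred degrees the rounding error is on the order of millionths of a degree.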

[Figures: longitude_precision_diff, latitude_precision_diff]

micahjohnson150 commented 5 years ago

I would call this a success.

The original grib2 file was 119 MB; with this script the final file is 28 MB, a 91 MB savings (~800 GB for an entire year!).
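
The yearly figure is consistent with one forecast file per hour. That cadence is an assumption here, since the thread does not state it. A quick check of the arithmetic:

```python
# Sizes quoted above, in megabytes.
ORIGINAL_MB = 119
CONVERTED_MB = 28

# Assumption (not stated in the thread): one file per hour, all year.
FILES_PER_YEAR = 24 * 365

saved_per_file_mb = ORIGINAL_MB - CONVERTED_MB
saved_per_year_gb = saved_per_file_mb * FILES_PER_YEAR / 1000  # decimal GB
reduction_pct = 100 * saved_per_file_mb / ORIGINAL_MB

print(f"saved per file: {saved_per_file_mb} MB ({reduction_pct:.0f}% smaller)")
print(f"saved per year: {saved_per_year_gb:.0f} GB")
```

Under that assumption the saving comes to about 797 GB per year, which rounds to the ~800 GB quoted.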

scotthavens commented 5 years ago

The reason we are reducing the number of variables is that, for the snow modeling effort, we do not need most of the atmospheric variables. The 6 or so that we do need will be extracted and put into a netcdf file, giving a smaller and more manageable file.

Changing the datatypes in the netcdf does not affect the data variables, just some of the dimension variables. For example, latitude and longitude are cast as doubles, a precision that isn't needed.
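
As a rough illustration of what that cast saves, the sketch below compares the uncompressed size of one coordinate grid stored as doubles and as floats. The grid shape is hypothetical, the helper `coord_size_mb` is made up for this example, and it assumes latitude and longitude are both stored as 2-D arrays. Compression in the real netcdf file would change the numbers.

```python
from array import array

# Hypothetical grid shape; the real dimensions depend on the forecast product.
NY, NX = 1000, 1800
n_points = NY * NX


def coord_size_mb(typecode):
    """Uncompressed size in MB of one 2-D coordinate array."""
    # 'd' is a C double and 'f' a C float (8 and 4 bytes on common platforms).
    return array(typecode).itemsize * n_points / 1e6


double_mb = coord_size_mb("d")
float_mb = coord_size_mb("f")

# Assumption: latitude and longitude are both 2-D, so the saving applies twice.
saved_mb = 2 * (double_mb - float_mb)
print(f"double: {double_mb:.1f} MB, float: {float_mb:.1f} MB per coordinate")
print(f"saved for latitude + longitude: {saved_mb:.1f} MB")
```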

scotthavens commented 5 years ago

What branch is this on?

micahjohnson150 commented 5 years ago

It's on grib2nc

https://github.com/USDA-ARS-NWRC/weather_forecast_retrieval/tree/grb2nc