barronh / pseudonetcdf

PseudoNetCDF like NetCDF except for many scientific format backends
GNU Lesser General Public License v3.0
77 stars 35 forks source link

ARL packed bit coordinate dimensions #83

Closed cdholmes closed 4 years ago

cdholmes commented 4 years ago

PseudoNetCDF incorrectly reads the coordinate variables for some ARL packed bit files. Meteorological files with x or y dimension greater than 1000 are affected, but files with smaller dimension are OK.

HRRR meteorological data have a grid dimension of nx=1059, ny=1799. Pnc reads the data variables in their correct sizes, but the x and y coordinate variables are incorrectly read in with sizes nx=59 and ny=799. Demonstration code is below.

I have some ideas about the cause.

  1. Header records in ARL files contain the x, y, z dimensions of the variables as three-digit numbers. For grid sizes larger than 999, the header reports a size of nx-1000 (i.e. for nx=1059, header reports 059) and there’s some kung fu involving another header variable (GRID) that specifies that 1000 offset should be added to the header number. The relevant section appears to be the gridx_off and gridy_off settings in the inqarlpackedbit function.
  2. The kung fu in pnc is working for the 2D and 3D data variables (e.g. UWND, PRSS), but NOT for the coordinate variables x and y. Demonstration below.
  3. It appears that the offsets are offsets are NOT added to the sizes of the x, y coordinate variables. It's not clear to me why.

    DEMO CODE

    Example file from: ftp://arlftp.arlhq.noaa.gov/pub/archives/hrrr/20190701_00-05_hrrr Other met data sources with x or y dimensions greater than 1000 are also affected (e.g. NAM, gfs0p25).

#!/usr/bin/evn python3
import PseudoNetCDF as pnc

infile ='20190701_00-05_hrrr'
print( infile )
f = pnc.pncopen(infile, format='noaafiles.arlpackedbit' )

print('Variables dimensions')
print('x',    f.variables['x'].shape)
print('y',    f.variables['y'].shape)
print('PRSS', f.variables['PRSS'].shape)
print('UWND', f.variables['UWND'].shape)
f.close()

RESULT:

Variables dimensions
x (799,)                   # INCORRECT. This should be 1799
y (59,)                    # INCORRECT. This should be 1059
PRSS (6, 1059, 1799)       # CORRECT
UWND (6, 35, 1059, 1799)   # CORRECT

Environment

I'm using latest pnc version (master branch >3.1.0, accessed 2/20/2020) with anaconda python 3.8 (anaconda version 2019-10).

barronh commented 4 years ago

@cdholmes

Thank you for catching this bug. I've updated the code in the branch bugfix/arlpackedbit-large-dim. You can test it by installing from there directly. Right now, it definitely addresses your testcase above. However, it would be good to know that it fixes your workflow more generally.

After you approve and the code is tested via continuous integration, I will incorporate it into the main branch.

cdholmes commented 4 years ago

Proposed Solution

The class arlpackedbit needs to define nx and ny consistent with the way they are done in inqarlpackedbit.

Current code (lines 661-662) of _arl.py:

        nx = int(self.NX)
        ny = int(self.NY)

Replace those two lines with

        # Add grid offsets, as in inqarlpackedbit
        gridx_off = max(0, (self.GRID[0] - 64) * 1000)
        gridy_off = max(0, (self.GRID[1] - 64) * 1000)
        nx = int(self.NX) + gridx_off
        ny = int(self.NY) + gridy_off

This change solves the problem that I've encountered.

cdholmes commented 4 years ago

Your solution works. Thanks!

cdholmes commented 4 years ago

As I wrote there, it works for me, but I haven’t done much testing apart from converting a file to netCDF.

The bigger problem that I’m dealing with now is that the x and y coordinate values produced by PseudoNetCDF are not correct for ARL files containing meteorology on a Lambert Conformal projections. This affects NAM and HRRR meteorology. Pnc currently lists the x coordinate values as x=0,3,6,9, etc. (units: km). However, the zero position should be in the middle of the model domain, not on the left edge. Presumably there is an offset that I can subtract, but I can’t find documentation of the HRRR grid that would allow me to confirm my work. Ever dealt with this problem?

Incidently, does pnc have a keyword to create a compressed NETCDF4_CLASSIC?

Thanks, Chris

From: barronh notifications@github.com Reply-To: barronh/pseudonetcdf reply@reply.github.com Date: Thursday, February 20, 2020 at 11:44 AM To: barronh/pseudonetcdf pseudonetcdf@noreply.github.com Cc: Christopher Holmes cdholmes@fsu.edu, Mention mention@noreply.github.com Subject: Re: [barronh/pseudonetcdf] ARL packed bit coordinate dimensions (#83)

@cdholmeshttps://urldefense.com/v3/__https:/github.com/cdholmes__;!!PhOWcWs!jTfDb174gfqpFpDIdqvGBW-QMBp249SMITNPrYzsSmPbTf4bxezekkdiWl1dmjo$

Thank you for catching this bug. I've updated the code in the branch bugfix/arlpackedbit-large-dim. You can test it by installing from there directly. Right now, it definitely addresses your testcase above. However, it would be good to know that it fixes your workflow more generally.

After you approve and the code is tested via continuous integration, I will incorporate it into the main branch.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://urldefense.com/v3/__https:/github.com/barronh/pseudonetcdf/issues/83?email_source=notifications&email_token=AAQMLB2NOD3X6UL4PE7FJK3RD2XOXA5CNFSM4KYQ4NT2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEMPB6EY*issuecomment-589176595__;Iw!!PhOWcWs!jTfDb174gfqpFpDIdqvGBW-QMBp249SMITNPrYzsSmPbTf4bxezekkdir6AH6Ow$, or unsubscribehttps://urldefense.com/v3/__https:/github.com/notifications/unsubscribe-auth/AAQMLB3SRGBVCXITYLGXKU3RD2XOXANCNFSM4KYQ4NTQ__;!!PhOWcWs!jTfDb174gfqpFpDIdqvGBW-QMBp249SMITNPrYzsSmPbTf4bxezekkdiV2tNCuo$.

barronh commented 4 years ago

@cdholmes

The code below would save as NETCDF4 classic with compression level 1 on all variables.

tmpf = pnc.pncopen(...)
tmpf.save(outpath, format='NETCDF4_CLASSIC', complevel=1)

As for the coordinates... It looks like the projection can be diagnosed from the metadata and a CF meta-data can be added. The grid mapping definition includes false_easting/false_northing so the x/y are currently correct.

I've added this capability and tested it for Lambert Conic Conformal and polar stereographic. Ideally, we'd do the same with mercator.

See branch enhancement/arlpackedbit-projection.

barronh commented 4 years ago

@cdholmes

I think we've followed this one as far as we can with NOAA. Should we close this issue since we have created sufficient options for the user to make their own call?

cdholmes commented 4 years ago

I agree. Pseudonetcdf is correctly reading the arl packed bit files. The remaining issue is that the packed bit files round off the synchlat and synchlon and don't contain the assumed radius of the Earth. Those are upstream issues at NOAA.