ARM-DOE / pyart

The Python-ARM Radar Toolkit. A data model driven interactive toolkit for working with weather radar data.
https://arm-doe.github.io/pyart/
Other
513 stars 266 forks source link

nexrad_archive read failure, sometimes #756

Closed nguy closed 6 years ago

nguy commented 6 years ago

I've noticed recently that the pyart.io.read_nexrad_archive fails on some files. Below is the error message. Note that an example file for the failure is shown and could be obtained from AWS bucket.

In [2]: r=pyart.io.read('KLBB20180727_160528_V06')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-2-59839d684d37> in <module>()
----> 1 r=pyart.io.read('KLBB20180727_160528_V06')

~/anaconda3/envs/forward/lib/python3.6/site-packages/pyart/io/auto_read.py in read(filename, use_rsl, **kwargs)
    121             return read_cfradial(filename, **kwargs)    # CF/Radial
    122     if filetype == 'WSR88D':
--> 123         return read_nexrad_archive(filename, **kwargs)
    124     if filetype == 'CHL':
    125         return read_chl(filename, **kwargs)

~/anaconda3/envs/forward/lib/python3.6/site-packages/pyart/io/nexrad_archive.py in read_nexrad_archive(filename, field_names, additional_metadata, file_field_names, exclude_fields, delay_field_loading, station, scans, linear_interp, **kwargs)
    178     azimuth['data'] = nfile.get_azimuth_angles(scans)
    179     elevation['data'] = nfile.get_elevation_angles(scans).astype('float32')
--> 180     fixed_angle['data'] = nfile.get_target_angles(scans)
    181 
    182     # fields

~/anaconda3/envs/forward/lib/python3.6/site-packages/pyart/io/nexrad_level2.py in get_target_angles(self, scans)
    427             scans = range(self.nscans)
    428         if self._msg_type == '31':
--> 429             cut_parameters = self.vcp['cut_parameters']
    430             scale = 360. / 65536.
    431             return np.array([cut_parameters[i]['elevation_angle'] * scale

TypeError: 'NoneType' object is not subscriptable

I also tried to read this through Metpy and received the following error

In [4]: from metpy.io import Level2File

In [5]: f=Level2File('KLBB20180727_160528_V06')
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-5-4151444b7060> in <module>()
----> 1 f=Level2File('KLBB20180727_160528_V06')

~/anaconda3/envs/forward/lib/python3.6/site-packages/metpy/io/nexrad.py in __init__(self, filename)
    192 
    193         # Now we're all initialized, we can proceed with reading in data
--> 194         self._read_data()
    195 
    196     vol_hdr_fmt = NamedStruct([('version', '9s'), ('vol_num', '3s'),

~/anaconda3/envs/forward/lib/python3.6/site-packages/metpy/io/nexrad.py in _read_data(self)
    254                 decoder = '_decode_msg{:d}'.format(msg_hdr.msg_type)
    255                 if hasattr(self, decoder):
--> 256                     getattr(self, decoder)(msg_hdr)
    257                 else:
    258                     log.warning('Unknown message: %d', msg_hdr.msg_type)

~/anaconda3/envs/forward/lib/python3.6/site-packages/metpy/io/nexrad.py in _decode_msg2(self, msg_hdr)
    376     def _decode_msg2(self, msg_hdr):
    377         self.rda_status.append(self._buffer.read_struct(self.msg2_fmt))
--> 378         self._check_size(msg_hdr, self.msg2_fmt.size)
    379 
    380     def _decode_msg3(self, msg_hdr):

~/anaconda3/envs/forward/lib/python3.6/site-packages/metpy/io/nexrad.py in _check_size(self, msg_hdr, size)
    644         hdr_size = msg_hdr.size_hw * 2 - self.msg_hdr_fmt.size
    645         assert size == hdr_size, ('Message type {} should be {} bytes '
--> 646                                   'but got {}'.format(msg_hdr.msg_type, size, hdr_size))
    647 
    648 

AssertionError: Message type 2 should be 80 bytes but got 120

There is a comment in that code that notes an expansion in the header size for RPG build 18. Not sure if this relevant for this error. @dopplershift do you have any insight into this? I didn't want to post in both repositories just yet.

dopplershift commented 6 years ago

@nguy The ROC decided to add some spare bytes to the status message in RDA Build 18.0. At least in MetPy's case, this broke some overly aggressive sanity checking. Current MetPy master has a fix. See Unidata/MetPy#891.

scollis commented 6 years ago

I am going to Assign @rcjackson but if anyone has some spare time to pull across changes from MetPy it would be VERY MUCH appreciated and I have put a lot on Bobby's plate recently and he is on vacation for a few days.

scollis commented 6 years ago

@nguy is it all files of just some files?

nguy commented 6 years ago

@scollis Not all files, in fact it is a reasonably small number that have this problem.

dopplershift commented 6 years ago

To be clear, the MetPy issue seems to be on almost all files (since it looks like most sites are running RDA 18.0). But I'm not convinced the PyART and MetPy problems are related.

nguy commented 6 years ago

Definitely looks like slightly different approaches under the hood for metpy and pyart. But I didn't spend a lot of time scouring the code.

scollis commented 6 years ago

Cool.. I am in a workshop today.. I'll grab the file and see if I can have a play

scollis commented 6 years ago

Deconstructing some of @jjhelmus 's great code the issue comes in init

    def __init__(self, filename):
        """ initalize the object. """
        # read in the volume header and compression_record
        if hasattr(filename, 'read'):
            fh = filename
        else:
            fh = open(filename, 'rb')
        size = _structure_size(VOLUME_HEADER)
        self.volume_header = _unpack_structure(fh.read(size), VOLUME_HEADER)
        compression_record = fh.read(COMPRESSION_RECORD_SIZE)

        # read the records in the file, decompressing as needed
        compression_slice = slice(CONTROL_WORD_SIZE, CONTROL_WORD_SIZE + 2)
        compression_or_ctm_info = compression_record[compression_slice]
        if compression_or_ctm_info == b'BZ':
            buf = _decompress_records(fh)
        # The 12-byte compression record previously held the Channel Terminal
        # Manager (CTM) information. Bytes 4 through 6 contain the size of the
        # record (2432) as a big endian unsigned short, which is encoded as
        # b'\t\x80' == struct.pack('>H', 2432).
        # Newer files zero out this section.
        elif compression_or_ctm_info in (b'\x00\x00', b'\t\x80'):
            buf = fh.read()
        else:
            raise IOError('unknown compression record')
        self._fh = fh

        # read the records from the buffer
        self._records = []
        buf_length = len(buf)
        pos = 0
        while pos < buf_length:
            pos, dic = _get_record_from_buf(buf, pos)
            self._records.append(dic)

        # pull out radial records (1 or 31) which contain the moment data.
        self.radial_records = [r for r in self._records
                               if r['header']['type'] == 31]
        self._msg_type = '31'
        if len(self.radial_records) == 0:
            self.radial_records = [r for r in self._records
                                   if r['header']['type'] == 1]
            self._msg_type = '1'
        if len(self.radial_records) == 0:
            raise ValueError('No MSG31 records found, cannot read file')
        elev_nums = np.array([m['msg_header']['elevation_number']
                              for m in self.radial_records])
        self.scan_msgs = [np.where(elev_nums == i + 1)[0]
                          for i in range(elev_nums.max())]
        self.nscans = len(self.scan_msgs)

        # pull out the vcp record
        msg_5 = [r for r in self._records if r['header']['type'] == 5]
        if len(msg_5):
            self.vcp = msg_5[0]
        else:
            self.vcp = None
        return

You can see

 msg_5 = [r for r in self._records if r['header']['type'] == 5]
        if len(msg_5):
            self.vcp = msg_5[0]
        else:
            self.vcp = None
        return

so for @nguy 's case we are not finding any type 5 headers.

scollis commented 6 years ago

Just putting in some more code in here from the module: Length/definitions:

# NEXRAD Level II file structures and sizes
# The deails on these structures are documented in:
# "Interface Control Document for the Achive II/User" RPG Build 12.0
# Document Number 2620010E
# and
# "Interface Control Document for the RDA/RPG" Open Build 13.0
# Document Number 2620002M
# Tables and page number refer to those in the second document unless
# otherwise noted.
RECORD_SIZE = 2432
COMPRESSION_RECORD_SIZE = 12
CONTROL_WORD_SIZE = 4

# format of structure elements
# section 3.2.1, page 3-2
CODE1 = 'B'
CODE2 = 'H'
INT1 = 'B'
INT2 = 'H'
INT4 = 'I'
REAL4 = 'f'
REAL8 = 'd'
SINT1 = 'b'
SINT2 = 'h'
SINT4 = 'i'

And the specific structure for message 5


# Table XI Volume Coverage Pattern Data (Message Type 5 & 7)
# pages 3-51 to 3-54
MSG_5 = (
    ('msg_size', INT2),
    ('pattern_type', CODE2),
    ('pattern_number', INT2),
    ('num_cuts', INT2),
    ('clutter_map_group', INT2),
    ('doppler_vel_res', CODE1),     # 2: 0.5 degrees, 4: 1.0 degrees
    ('pulse_width', CODE1),         # 2: short, 4: long
    ('spare', '10s')                # halfwords 7-11 (10 bytes, 5 halfwords)
)
scollis commented 6 years ago

@dopplershift : So is it the spare bites in MSG_5? ie from your PR: extra_size = 40 if self.rda_status[-1].rda_build >= '18.0' We have 10bytes

scollis commented 6 years ago

Ok.. @nguy That was not it.. This file is very strange.. I made Py-ART to print out all the message numbers in that file and they are all MSG31

@dopplershift can you try reading KLBB at datetime(2018,7,27,16,5) with MetPy?

scollis commented 6 years ago

@nguy I looked at other files nearby in time and have no issues reading them. (ie at datetime(2018,7,27,15,5))

I am thinking this file is missing any metadata and only has radial data and is therefore corrupt

scollis commented 6 years ago

Once I hear back I might (shock horror) add a more appropriate error message if no MSG_5 buffers are found.

nguy commented 6 years ago

Thanks for looking at this also @scollis , I drilled down to the same place you did in terms of where the code failure occurred, but did not print out the messages. This is happening on more than just this radar though, which is kind of interesting. I'll try to go back and look if they share the same build version.

dopplershift commented 6 years ago

I don't have any problem with MetPy, but I can confirm that MetPy does not see any type-5 messages in that file. We just don't require them.

scollis commented 6 years ago

Perhaps I will just hard code a scan mode

-sent from a mobile device-

On Aug 2, 2018, at 3:47 PM, Ryan May notifications@github.com wrote:

I don't have any problem with MetPy, but I can confirm that MetPy does not see any type-5 messages in that file. We just don't require them.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

scollis commented 6 years ago

Do you guess BCR from radial messages?

-sent from a mobile device-

On Aug 2, 2018, at 3:47 PM, Ryan May notifications@github.com wrote:

I don't have any problem with MetPy, but I can confirm that MetPy does not see any type-5 messages in that file. We just don't require them.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

dopplershift commented 6 years ago

BCR?

We have a incredibly primitive data model at the moment for NEXRAD, so we don’t need to force anything. I’m curious what it is that you need from the type-5 messages.

scollis commented 6 years ago

Well looking at the error its the "Target Angle". ie the angle programed into the VCP, an array of len(sweeps)

scollis commented 6 years ago

Oh.. BCR=Error_correct(VCP)

dopplershift commented 6 years ago

If the vcp information is missing, you could always just average the elevation angles in the sweep and round it.

scollis commented 6 years ago

image Ok! I have a work around! I will submit a PR soon @nguy

scollis commented 6 years ago

Warning.. I set all the fixed angles to zero

scollis commented 6 years ago

Submitted #757

scollis commented 6 years ago

For now this will be up to the user. :)

Scott Collis

On Aug 2, 2018 at 4:43 PM, Ryan May notifications@github.com wrote:

If the vcp information is missing, you could always just average the elevation angles in the sweep and round it.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/ARM-DOE/pyart/issues/756#issuecomment-410077073, or mute the thread https://github.com/notifications/unsubscribe-auth/AAyYB9gPFaNr6HD-5pRc9zFGiij9AsDvks5uM3J_gaJpZM4VooCY .

scollis commented 6 years ago

Ok @nguy get the latest version from source. We will be cutting a new release on conda forge in October

757 closes this issue

nguy commented 6 years ago

Okay, great, thanks @scollis