simpeg / aurora

software for processing natural source electromagnetic data
MIT License

Aurora Pipeline Fails when NCEDC is down #159

Open kkappler opened 2 years ago

kkappler commented 2 years ago

Parkfield tests fail on the GitHub Actions runner because the data and metadata cannot be retrieved when NCEDC is suffering an outage.

First observed on 17 Mar, 2022.

Since NCEDC is not a stakeholder at this point, we cannot expect them to be concerned about this issue.

We could:

  1. Make a copy of the data on an IRIS-hosted server so that these tests can query IRIS instead of NCEDC.
  2. Alternatively, we could revisit the mth5_test_data repo, tidy it up, and place test data there. Note that make_mth5 is already tested by mth5, i.e. the access / transmission of data and metadata is being tested outside aurora.
    • If revisiting mth5_test_data, the Pooch package may be of interest.
kkappler commented 2 years ago

This happened again on April 23 at 0900 Pacific time.

The error is:

Traceback (most recent call last):
    streams = dataset_config.get_data_via_fdsn_client(data_source="NCEDC")
  File "/home/kkappler/software/irismt/aurora/aurora/sandbox/io_helpers/fdsn_dataset_config.py", line 78, in get_data_via_fdsn_client
    self.endtime,
  File "/home/kkappler/anaconda2/envs/py37/lib/python3.7/site-packages/obspy/clients/fdsn/client.py", line 830, in get_waveforms
    raise ValueError(msg)
ValueError: The current client does not have a dataselect service.

I have attached the hz data from PKD for the time interval that we use for the tests ... ex, ey, hx, hy are already archived at IRIS. hz_pkd.csv

@timronan Can you or Laura look at adding this hz data to the IRIS archive? Then we can set up the tests to use IRIS (or try NCEDC and catch exception use IRIS).
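The "try NCEDC and on exception fall back to IRIS" idea could be sketched roughly as below. This is a hypothetical helper, not aurora code; the client factory is injected so the fallback logic can be exercised without network access (with obspy it would be something like `lambda url: Client(base_url=url)`).

```python
def get_client_with_fallback(make_client, data_centers=("NCEDC", "IRIS")):
    """Return the first FDSN client that initializes successfully.

    make_client -- callable taking a base_url string and returning a client,
    e.g. ``lambda url: Client(base_url=url)`` with obspy's fdsn Client.
    Injected here (hypothetical wiring) so the fallback logic is testable
    without network access.
    """
    errors = {}
    for base_url in data_centers:
        try:
            return make_client(base_url)
        except Exception as exc:  # e.g. "does not have a dataselect service"
            errors[base_url] = exc
    # Every data center failed; surface all the errors at once.
    raise RuntimeError(f"all data centers failed: {errors}")
```

Usage would be a one-line change at the call site: pass the real obspy client constructor and let the helper pick whichever data center responds.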

kkappler commented 2 years ago

In tests/parkfield/, calling python make_parkfield_mth5.py creates the mth5 file locally, with both the data and the metadata (from NCEDC).

This file could actually be used as a source of data and metadata that we could push to IRIS, see issue 99 in mth5: https://github.com/kujaku11/mth5/issues/99

kkappler commented 2 years ago

Here's a new one, Sept 2, 2022:

    from obspy.clients.fdsn import Client

    Client(base_url="NCEDC")

Traceback (most recent call last):
  File "/home/kkappler/software/pycharm-community-2019.1.1/plugins/python-ce/helpers/pydev/_pydevd_bundle/pydevd_exec2.py", line 3, in Exec
    exec(exp, global_vars, local_vars)
  File "<string>", line 1, in <module>
  File "/home/kkappler/anaconda2/envs/py38/lib/python3.8/site-packages/obspy/clients/fdsn/client.py", line 276, in __init__
    self._discover_services()
  File "/home/kkappler/anaconda2/envs/py38/lib/python3.8/site-packages/obspy/clients/fdsn/client.py", line 1531, in _discover_services
    wadl_parser = WADLParser(wadl)
  File "/home/kkappler/anaconda2/envs/py38/lib/python3.8/site-packages/obspy/clients/fdsn/wadl_parser.py", line 28, in __init__
    doc = etree.parse(io.BytesIO(wadl_string)).getroot()
  File "src/lxml/etree.pyx", line 3536, in lxml.etree.parse
  File "src/lxml/parser.pxi", line 1893, in lxml.etree._parseDocument
  File "src/lxml/parser.pxi", line 1913, in lxml.etree._parseMemoryDocument
  File "src/lxml/parser.pxi", line 1800, in lxml.etree._parseDoc
  File "src/lxml/parser.pxi", line 1141, in lxml.etree._BaseParser._parseDoc
  File "src/lxml/parser.pxi", line 615, in lxml.etree._ParserContext._handleParseResultDoc
  File "src/lxml/parser.pxi", line 725, in lxml.etree._handleParseResult
  File "src/lxml/parser.pxi", line 654, in lxml.etree._raiseParseError
  File "<string>", line 1
lxml.etree.XMLSyntaxError: Space required after the Public Identifier, line 1, column 50

kkappler commented 2 years ago

The lxml error is due to NCEDC changing their urls. See Issue 3134 https://github.com/obspy/obspy/issues/3134

kkappler commented 1 year ago

Here is a new one, again in Dec 2022. Symptoms:

Note that mth5.timeseries.run_ts.RunTS calls self.validate_metadata() twice. The first time through it passes, but the second time it does not.

The first time through is in the set_dataset method of RunTS, which checks the condition self.run_metadata.id not in self.station_metadata.runs.keys(). This is False, because self.run_metadata.id = '0' and self.station_metadata.runs.keys() = ['0',], so the self.station_metadata.runs[0].update(self.run_metadata) call is skipped.

After set_data() a check is made:

if run_metadata is not None:
    self.run_metadata.update(run_metadata)

This metadata update is what triggers the failure: after the update, self.run_metadata.id = '001' while self.station_metadata.runs.keys() = ['0',], i.e. the run_metadata.id changed but the station_metadata.runs keys did not. Because of this inconsistency, the next time self.validate_metadata() executes, the condition self.run_metadata.id not in self.station_metadata.runs.keys() returns True, which triggers self.station_metadata.runs[0].update(self.run_metadata).
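A stripped-down illustration of that state change, with plain variables standing in for the actual RunTS attributes (this is not the real mth5 code, just the boolean logic described above):

```python
# Stand-ins for the RunTS attributes (illustrative only, not the real classes).
station_run_keys = ["0"]   # self.station_metadata.runs.keys()
run_id = "0"               # self.run_metadata.id

# First validate_metadata(): the ids agree, so the update branch is skipped.
first_pass_triggers_update = run_id not in station_run_keys   # False

# run_metadata.update(...) later changes the run id, but not the station keys.
run_id = "001"

# Second validate_metadata(): the same condition now returns True, so
# station_metadata.runs[0].update(self.run_metadata) fires.
second_pass_triggers_update = run_id not in station_run_keys  # True
```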

I followed the trail for a while, and the error occurs when an auxiliary channel is encountered... @kujaku11 do we want to force component on auxiliary channels? Also, we might need to track down why there is an aux channel at all here.

if channel_obj.component is None:
    if not isinstance(channel_obj, Auxiliary):  # Adding this condition seems to fix the 3.8/3.9 issue
        msg = "component cannot be empty"
        self.logger.error(msg)
        raise ValueError(msg)
kkappler commented 1 year ago

Regarding the second flavor of failure, ... this might be related to the obspy version.
Note that obspy v1.2.2 has python2 code in it. A long-awaited python3-only version of obspy (v1.3) was released in 2022, and updated to 1.3.1 in October 2022. This requires python >= 3.7. So we should probably require the same.

Only a month after v1.3.1 was released, out popped v1.4 in November 2022. This version requires python >= 3.8. The value of maintaining python 3.7 compatibility is unclear.

In any case, to fix the v3.7 issue, one need only replace the kwarg: data_source="NCEDC" with data_source='https://service.ncedc.org/' in make_parkfield_mth5

This argument is passed as base_url to the obspy Client.

To reproduce the error:

from obspy.clients.fdsn import Client
client = Client(base_url="NCEDC", force_redirect=True)

but replacing with

from obspy.clients.fdsn import Client
client = Client(base_url="https://service.ncedc.org/", force_redirect=True)

works.

This is discussed in a comment by alexhutko. It has to do with hardcoded URL lookup tables and the fact that NCEDC is only available via https, not http. This may get fixed in obspy, but if we want to support py37 we can just use the explicit URL (for now).

To fix the py38 issue, one only needs to use obspy v1.4.

kkappler commented 1 year ago

Now that these tests are working again, there are a couple of things that can be done to simplify the parkfield tests:

  1. We don't really need to support the creation of separate PKD, SAO, and PKDSAO h5 files
    • [x] Replace all references to h5 files with pkd_sao_test_00.h5
    • [x] A single method called ensure_data_exists() can be placed in /test_utils/parkfield/make_parkfield_mth5.py, and all the try/except logic that is replicated in several methods can be consolidated in that one spot
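Item 2 could look roughly like the sketch below. The signature and the `build` callable are assumptions, not the actual aurora code; the builder is passed in (e.g. a wrapper around make_parkfield_mth5) so the caching logic stands alone.

```python
from pathlib import Path


def ensure_data_exists(h5_path, build):
    """Return h5_path, invoking ``build`` to create it only if it is missing.

    build -- callable taking the target path and writing the mth5 file there,
    e.g. a wrapper around make_parkfield_mth5 (hypothetical wiring).
    Centralizes the try/except currently replicated across several methods.
    """
    h5_path = Path(h5_path)
    if not h5_path.exists():
        try:
            build(h5_path)
        except Exception as exc:
            # e.g. NCEDC outage: fail with one clear message instead of
            # scattered per-test error handling.
            raise RuntimeError(f"could not build {h5_path}") from exc
    return h5_path
```

Each test would then call ensure_data_exists(...) up front and work with the returned path, rather than carrying its own download/except block.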
kkappler commented 1 year ago

I pushed an h5 of the combined PKD and SAO data to mth5_test_data, at mth5_test_data/mth5/parkfield/pkd_sao_test_00.h5. It should be possible to extract the metadata and data streams from this file and archive them somewhere at IRIS.

When this is done, I suggest that the making of the PKD data, when using IRIS, be done using make_mth5, instead of the NCEDC kluge we implemented to work around their non-FDSN-compliant nomenclature.