ioos / ioos_metrics

Working on creating metrics for the IOOS by the numbers
https://ioos.github.io/ioos_metrics/
MIT License

National platforms #57

Closed ocefpaf closed 5 months ago

ocefpaf commented 7 months ago

Ported from #56:

CO-OPS

This code returns 381 stations, but searvey reports fewer active stations (366) and a higher total (446) if we add the discontinued ones:

from searvey import coops
df = coops.coops_stations()
len(df.loc[df["status"] == "active"])
366
len(df.loc[df["status"] == "discontinued"])
80

Not sure which data source is better to answer that question. Searvey does have this warning but I'm not sure if that means the API they are hitting is outdated in terms of usability or data updates.

NERRS

The code returns 93, but there is a comment in the notebook that it should be around 140.

PS: The current refactor is used in this example.

ocefpaf commented 6 months ago

@MathewBiddle we need to be a bit careful with this one b/c I had two difficult rebases to catch up with #56 and #54. We should check that we still have all the expected metrics functions and that we don't have duplicates.

Once we merge this I'll send a final PR making it a package, publish, and then folks can run any metric they want by just installing this and calling the proper function.

MathewBiddle commented 5 months ago

gliderpy is throwing a user warning

C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\gliderpy\fetchers.py:47: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df.index = pd.to_datetime(df.index)
ocefpaf commented 5 months ago

gliderpy is throwing a user warning

I have plans to investigate that later, but it is not gliderpy; it is some bad dates in the data that cannot be parsed with vanilla pandas. In theory, it is OK. However, as we are closely responsible for that data, we should investigate. I'll let you know what I find soon, but I need to read every time variable data point to find which one is triggering that... So I'm not doing that today 😬
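
One way to locate the offending values without reading every point by hand is to parse against an explicit format and collect whatever fails. This is only a sketch: the format string below is an assumption (ISO 8601 with a trailing "Z"), `find_bad_dates` is a hypothetical helper, and the sample values are made up rather than taken from the glider data.

```python
import pandas as pd

# Assumed timestamp format; adjust to whatever the real time variable uses.
ASSUMED_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def find_bad_dates(values, fmt=ASSUMED_FORMAT):
    """Return the entries (with their positions) that do not match fmt."""
    series = pd.Series(values, dtype="object")
    # errors="coerce" turns anything that fails to parse into NaT
    # instead of raising, so the failures can be selected afterwards.
    parsed = pd.to_datetime(series, format=fmt, errors="coerce")
    return series[parsed.isna()]


# Made-up sample: one valid timestamp, one impossible month, one placeholder.
times = ["2024-01-01T00:00:00Z", "2024-13-01T00:00:00Z", "n/a"]
bad = find_bad_dates(times)
print(bad)
```

Note that values that were already missing also come back as NaT, so they would be reported alongside the malformed ones.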

MathewBiddle commented 5 months ago

Looks like I'm getting a timeout from https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp.

df2 = ioos_metrics.ioos_metrics.update_metrics()
C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\gliderpy\fetchers.py:47: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df.index = pd.to_datetime(df.index)
C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\gliderpy\fetchers.py:47: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df.index = pd.to_datetime(df.index)
C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\gliderpy\fetchers.py:47: UserWarning: Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.
  df.index = pd.to_datetime(df.index)
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\connectionpool.py", line 537, in _make_request
    response = conn.getresponse()
               ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\connection.py", line 466, in getresponse
    httplib_response = super().getresponse()
                       ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1423, in getresponse
    response.begin()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 331, in begin
    version, status, reason = self._read_status()
                              ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 292, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\socket.py", line 707, in readinto
    return self._sock.recv_into(b)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\ssl.py", line 1252, in recv_into
    return self.read(nbytes, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\ssl.py", line 1104, in read
    return self._sslobj.read(len, buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TimeoutError: The read operation timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\requests\adapters.py", line 486, in send
    resp = conn.urlopen(
           ^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\connectionpool.py", line 847, in urlopen
    retries = retries.increment(
              ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\util\retry.py", line 470, in increment
    raise reraise(type(error), error, _stacktrace)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\util\util.py", line 39, in reraise
    raise value
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\connectionpool.py", line 793, in urlopen
    response = self._make_request(
               ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\connectionpool.py", line 539, in _make_request
    self._raise_timeout(err=e, url=url, timeout_value=read_timeout)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\urllib3\connectionpool.py", line 370, in _raise_timeout
    raise ReadTimeoutError(
urllib3.exceptions.ReadTimeoutError: HTTPSConnectionPool(host='opendap.co-ops.nos.noaa.gov', port=443): Read timed out. (read timeout=10)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\externals\loky\process_executor.py", line 463, in _process_worker
    r = call_item()
        ^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\externals\loky\process_executor.py", line 291, in __call__
    return self.fn(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 589, in __call__
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\Documents\GitProjects\ioos_metrics\ioos_metrics\national_platforms.py", line 51, in get_coops
    xml = requests.get(
          ^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\requests\api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\requests\api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\requests\sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\requests\sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\requests\adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='opendap.co-ops.nos.noaa.gov', port=443): Read timed out. (read timeout=10)
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-6-da9d358039b7>", line 1, in <module>
    df2 = ioos_metrics.ioos_metrics.update_metrics()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\Documents\GitProjects\ioos_metrics\ioos_metrics\ioos_metrics.py", line 577, in update_metrics
    national_platforms = sum(
                         ^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1595, in _get_outputs
    yield from self._retrieve()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1699, in _retrieve
    self._raise_error_fast()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1734, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 736, in get_result
    return self._return_or_raise()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 754, in _return_or_raise
    raise self._result
requests.exceptions.ReadTimeout: None: None

Url should be https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp (no clue what is different?)

ocefpaf commented 5 months ago

Url should be https://opendap.co-ops.nos.noaa.gov/stations/stationsXML.jsp (no clue what is different?)

If you try to navigate there you should get a 504. Let's see if the service is restored later today/tomorrow.
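
For transient outages like this one, a small retry wrapper can keep a single slow response from failing the whole run. The sketch below is not what `get_coops` does; `fetch_with_retries` is a hypothetical helper, and `fetch` stands for any zero-argument callable, for example `lambda: requests.get(url, timeout=10)`. With requests, the `exceptions` argument would need to be something like `(requests.exceptions.Timeout, requests.exceptions.ConnectionError)`, since those are not the built-in exception types used as defaults here.

```python
import time


def fetch_with_retries(fetch, retries=3, delay=1.0,
                       exceptions=(TimeoutError, ConnectionError)):
    """Call fetch() up to `retries` times, pausing `delay` seconds between tries."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except exceptions as err:
            last_error = err
            if attempt < retries:
                time.sleep(delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


# Simulated service that times out twice and then answers.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TimeoutError("simulated read timeout")
    return "ok"


result = fetch_with_retries(flaky, retries=3, delay=0)
print(result, calls["count"])
```

A retry only helps with short hiccups; a service that returns 504 for hours will still fail after the last attempt, which is the desired behavior for a metric that should not silently report stale numbers.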

MathewBiddle commented 5 months ago

FYI, these tests have been running >13min...

ocefpaf commented 5 months ago

FYI, these tests have been running >13min...

The "slow" glider test takes that long. I'm waiting for Kathy's answer to figure out what to do with it.

ocefpaf commented 5 months ago

@MathewBiddle I made a few modifications to this PR after our slack conversation with Kathy:

  1. The default for the glider metric in update_metrics is the fast method now, no more waiting for >30 min!
  2. Both fast and slow glider methods are present but only slow returns profile count.
  3. The difference between the glider methods varies with the query parameters, but the largest difference I computed was 1.3%. That is less than before b/c I had a mistake in the end_date and was comparing a full query against a constrained one.
  4. We should not get any warning when parsing dates now. All dates are validated against ERDDAP's format and, if they aren't, the metric will fail and record the dataset_id in the logs.
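
The fail-and-log behavior described in item 4 could look roughly like the following. This is a sketch under assumptions, not the code in the PR: the format string is assumed to be ERDDAP's ISO 8601 style with a trailing "Z", and `parse_strict`, the logger name, and the dataset id are hypothetical.

```python
import logging
from datetime import datetime

logger = logging.getLogger("metrics_sketch")  # hypothetical logger name

# Assumed ERDDAP-style timestamp format.
ASSUMED_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def parse_strict(value, dataset_id, fmt=ASSUMED_FORMAT):
    """Parse value with an explicit format; log the dataset and re-raise on failure."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        # Record which dataset carried the bad date before failing the metric.
        logger.error("Bad date %r in dataset %s", value, dataset_id)
        raise


good = parse_strict("2024-05-01T12:00:00Z", "example-dataset")
print(good)
```

Parsing with a fixed format avoids the per-element `dateutil` fallback that triggered the earlier UserWarning, at the cost of failing loudly when a dataset does not conform.
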
MathewBiddle commented 5 months ago

I am trying to look through metric.log but there are a lot of pdfminer logs written. Below is the error I'm getting right now:

df2 = ioos_metrics.ioos_metrics.update_metrics()
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 1344, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1331, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1377, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1326, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1085, in _send_output
    self.send(msg)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1029, in send
    self.connect()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 1465, in connect
    super().connect()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\http\client.py", line 995, in connect
    self.sock = self._create_connection(
                ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\socket.py", line 852, in create_connection
    raise exceptions[0]
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\socket.py", line 837, in create_connection
    sock.connect(sa)
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\externals\loky\process_executor.py", line 463, in _process_worker
    r = call_item()
        ^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\externals\loky\process_executor.py", line 291, in __call__
    return self.fn(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 589, in __call__
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\Documents\GitProjects\ioos_metrics\ioos_metrics\ioos_metrics.py", line 323, in regional_platforms
    df = pd.read_json(url)
         ^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\pandas\io\json\_json.py", line 791, in read_json
    json_reader = JsonReader(
                  ^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\pandas\io\json\_json.py", line 904, in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\pandas\io\json\_json.py", line 944, in _get_data_from_filepath
    self.handles = get_handle(
                   ^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\pandas\io\common.py", line 728, in get_handle
    ioargs = _get_filepath_or_buffer(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\pandas\io\common.py", line 384, in _get_filepath_or_buffer
    with urlopen(req_info) as req:
         ^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\pandas\io\common.py", line 289, in urlopen
    return urllib.request.urlopen(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 215, in urlopen
    return opener.open(url, data, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 515, in open
    response = self._open(req, data)
               ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 532, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 492, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 1392, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\urllib\request.py", line 1347, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [WinError 10061] No connection could be made because the target machine actively refused it>
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-4-da9d358039b7>", line 1, in <module>
    df2 = ioos_metrics.ioos_metrics.update_metrics()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\Documents\GitProjects\ioos_metrics\ioos_metrics\ioos_metrics.py", line 593, in update_metrics
    columns = dict(zip(functions.keys(), values, strict=False))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1595, in _get_outputs
    yield from self._retrieve()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1699, in _retrieve
    self._raise_error_fast()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1734, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 736, in get_result
    return self._return_or_raise()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 754, in _return_or_raise
    raise self._result
urllib.error.URLError: <urlopen error [WinError 10061] No connection could be made because the target machine actively refused it>
ocefpaf commented 5 months ago

It is our usual ERDDAP server down ;-p

https://erddap.ioos.us/erddap/tabledap/

I'll improve the logging system to make it easier to spot network problems.

ocefpaf commented 5 months ago

Both our logs and the CI logs are easier to parse now. These are the CI logs:

tests/test_metrics.py:67: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
ioos_metrics/ioos_metrics.py:323: in regional_platforms
    df = pd.read_json(url)
../../../micromamba/envs/TEST/lib/python3.12/site-packages/pandas/io/json/_json.py:791: in read_json
    json_reader = JsonReader(
../../../micromamba/envs/TEST/lib/python3.12/site-packages/pandas/io/json/_json.py:904: in __init__
    data = self._get_data_from_filepath(filepath_or_buffer)
../../../micromamba/envs/TEST/lib/python3.12/site-packages/pandas/io/json/_json.py:944: in _get_data_from_filepath
    self.handles = get_handle(
../../../micromamba/envs/TEST/lib/python3.12/site-packages/pandas/io/common.py:728: in get_handle
    ioargs = _get_filepath_or_buffer(
../../../micromamba/envs/TEST/lib/python3.12/site-packages/pandas/io/common.py:384: in _get_filepath_or_buffer
    with urlopen(req_info) as req:
../../../micromamba/envs/TEST/lib/python3.12/site-packages/pandas/io/common.py:289: in urlopen
    return urllib.request.urlopen(*args, **kwargs)
../../../micromamba/envs/TEST/lib/python3.12/urllib/request.py:215: in urlopen
    return opener.open(url, data, timeout)
../../../micromamba/envs/TEST/lib/python3.12/urllib/request.py:515: in open
    response = self._open(req, data)
../../../micromamba/envs/TEST/lib/python3.12/urllib/request.py:532: in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
../../../micromamba/envs/TEST/lib/python3.12/urllib/request.py:492: in _call_chain
    result = func(*args)
../../../micromamba/envs/TEST/lib/python3.12/urllib/request.py:1392: in https_open
    return self.do_open(http.client.HTTPSConnection, req,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <urllib.request.HTTPSHandler object at 0x7f690b211bb0>
http_class = <class 'http.client.HTTPSConnection'>
req = <urllib.request.Request object at 0x7f69010865d0>
http_conn_args = {'context': <ssl.SSLContext object at 0x7f69080c3b50>}
host = 'erddap.ioos.us'
h = <http.client.HTTPSConnection object at 0x7f6901086630>
headers = {'Connection': 'close', 'Host': 'erddap.ioos.us', 'User-Agent': 'Python-urllib/3.12'}

Note that network issues here are expected and I hope to run this as a cron job to catch them early. The code per se is ready for review.

MathewBiddle commented 5 months ago

After rebooting ERDDAP (https://github.com/ioos/erddap-gold-standard/issues/69), this is running successfully now.

MathewBiddle commented 5 months ago

Testing this PR, I ran into an issue with a missing atn.

df2 = ioos_metrics.ioos_metrics.update_metrics()
joblib.externals.loky.process_executor._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\externals\loky\process_executor.py", line 463, in _process_worker
    r = call_item()
        ^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\externals\loky\process_executor.py", line 291, in __call__
    return self.fn(*self.args, **self.kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 589, in __call__
    return [func(*args, **kwargs)
            ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\Documents\GitProjects\ioos_metrics\ioos_metrics\ioos_metrics.py", line 335, in atn_deployments
    return atn
           ^^^
UnboundLocalError: cannot access local variable 'atn' where it is not associated with a value
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\IPython\core\interactiveshell.py", line 3577, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-3-da9d358039b7>", line 1, in <module>
    df2 = ioos_metrics.ioos_metrics.update_metrics()
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\Documents\GitProjects\ioos_metrics\ioos_metrics\ioos_metrics.py", line 575, in update_metrics
    columns = dict(zip(functions.keys(), values, strict=False))
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1595, in _get_outputs
    yield from self._retrieve()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1699, in _retrieve
    self._raise_error_fast()
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 1734, in _raise_error_fast
    error_job.get_result(self.timeout)
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 736, in get_result
    return self._return_or_raise()
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Mathew.Biddle\programs\Miniforge\envs\ioos-metrics\Lib\site-packages\joblib\parallel.py", line 754, in _return_or_raise
    raise self._result
UnboundLocalError: cannot access local variable 'atn' where it is not associated with a value
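
The body of `atn_deployments` is not shown in this thread, so the following is only a hypothetical reproduction of the pattern that produces this error: a local variable assigned inside a branch that never runs (for instance, because the upstream page changed). The row structure and function names are made up for illustration.

```python
def count_deployments_buggy(rows):
    for row in rows:
        if row.get("label") == "deployments":  # hypothetical match condition
            atn = row["value"]
    # If no row matched, `atn` was never assigned and this line raises
    # UnboundLocalError, as in the traceback above.
    return atn


def count_deployments(rows):
    atn = None  # bind the name up front
    for row in rows:
        if row.get("label") == "deployments":
            atn = row["value"]
    if atn is None:
        # Fail with a message that points at the likely cause.
        raise ValueError("No 'deployments' entry found; the source may have changed.")
    return atn


print(count_deployments([{"label": "deployments", "value": 5}]))
```

Initializing the variable and raising an explicit error keeps the failure, but makes the log say what was missing rather than which local name was unbound.
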
MathewBiddle commented 5 months ago

This looks good now. I'm merging so we can move forward. However, the ATN website needs to be updated before the full update_metrics can run.

The pdfminer logs are still unbearable.
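
A possible stopgap for the noise, assuming the messages come from loggers named under `pdfminer` (an assumption about how that library names its loggers), is to raise the level on the parent logger. `quiet_logger` is a hypothetical helper, not something in this repository.

```python
import logging


def quiet_logger(name="pdfminer", level=logging.WARNING):
    """Raise the threshold on a noisy library logger and its children."""
    # Child loggers such as "pdfminer.psparser" inherit this level
    # unless they set their own.
    logging.getLogger(name).setLevel(level)


quiet_logger()
print(logging.getLogger("pdfminer.psparser").getEffectiveLevel())
```

This would need to run before the metrics are computed, and it leaves warnings and errors from the library visible in metric.log.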

ocefpaf commented 5 months ago

I approve this knowing that ATN deployments will fail at the moment.

I'll open an issue for cleaner logs and fix the ATN URL.