MHKiT-Software / MHKiT-Python

MHKiT-Python provides the marine renewable energy (MRE) community tools for data processing, visualization, quality control, resource assessment, and device performance.
https://mhkit-software.github.io/MHKiT/
BSD 3-Clause "New" or "Revised" License

Improve Testing Time #241

Closed ssolson closed 8 months ago

ssolson commented 1 year ago

A major headwind to MHKiT development is the time it takes to run the testing suite.

This PR decreases the testing time by creating a data cache before the testing suite runs. This step is the first block, prepare-cache, in the workflow shown below: image

The primary approach here is to use data caching to only call the APIs once. This should additionally reduce test failures and subsequent test re-runs.

While hindcast calls are the primary source of delay and failed API calls, caching was added to all of the API calls listed below:

  1. wave/
    • ndbc
    • cdip
    • hindcast
  2. tidal/
    • noaa
  3. river/
    • usgs

The caching is handled through a single function handle_cache located in mhkit/utils/. A new handle cache test was added for this module.
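
To make the idea concrete, here is a minimal sketch of a file-based cache that calls a data source only once per key. It is not the handle_cache implementation from this PR: the name cached_fetch, its arguments, and the JSON-on-disk layout are all assumptions chosen for illustration.

```python
import hashlib
import json
import os


def cached_fetch(key, fetch, cache_dir):
    """Return (data, was_cached) for key, calling fetch() only on a miss.

    Hypothetical helper for illustration only; MHKiT's handle_cache may
    use a different signature and storage format.
    """
    os.makedirs(cache_dir, exist_ok=True)

    # Hash the key so any request string maps to a safe file name.
    name = hashlib.sha256(key.encode("utf-8")).hexdigest() + ".json"
    path = os.path.join(cache_dir, name)

    # Cache hit: read the stored response instead of calling the API.
    if os.path.isfile(path):
        with open(path, "r", encoding="utf-8") as f:
            return json.load(f), True

    # Cache miss: call the (possibly slow or flaky) API once and store it.
    data = fetch()
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data, False
```

With a helper like this, a cache-preparation step can populate the directory once, and every later test that requests the same key reads from disk rather than the network.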

To use the caching functionality in our testing suite, the prepare-cache step was added to the main.yml file, as shown in the flow diagram above. All tests depend on this step and will not continue without a successful prepare-cache run.

This PR removes Python 3.7 support because the most recent release broke the last 3.7 compatibility we had (see discussion in thread below).

An important follow-on to this PR is to run the hindcast calls only if the hindcast files have been modified, and to add Python 3.10 support.

A modification was made to the contours.py file to fix a deprecation introduced by the latest matplotlib release, v3.8.0.

ssolson commented 1 year ago

API calls to add caching to:

  1. wave/
    • [x] ndbc
    • [x] cdip
    • [x] hindcast
  2. tidal/
    • [x] noaa
  3. river/
    • [x] usgs

ssolson commented 1 year ago

@jmcvey3 I have not looked at your latest PR or into these errors, but have you already solved these dolfyn xarray failures? image

ssolson commented 1 year ago

Wind Tool API issues

This post aggregates useful resources and communicates my current progress.

My current roadblock on this PR is the wind hindcast, because the API will not consistently return data. Similar to what I did for the wave hindcast, I have been digging into the base layer of the API calls using both NREL-REX and h5pyd.

Wind Data Endpoints

To start, a list of available data can be found here:

https://github.com/NREL/hsds-examples/blob/master/datasets/WINDToolkit.md

E.g. an endpoint we could use: /nrel/wtk/offshore_ca/Offshore_CA_2000.h5

Examples of accessing the endpoints

Examples of how to access the data at these endpoints are covered in these notebooks:

  1. https://github.com/NREL/hsds-examples/blob/master/notebooks/01_WTK_introduction.ipynb
  2. https://github.com/NREL/hsds-examples/blob/master/notebooks/08_NREL-rex.ipynb

Example using h5pyd

Using h5pyd, the minimum working example would be:

import h5pyd
data_path = '/nrel/wtk/offshore_ca/Offshore_CA_2000.h5'
f = h5pyd.File(data_path, 'r')
dset = f['windspeed_100m']
dset[1, 1]

This consistently returns

*** ValueError: cannot reshape array of size 0 into shape (1,)

Example using NREL-REX

We can do the same using NREL-REX as follows:

from rex import WindX
data_path = '/nrel/wtk/offshore_ca/Offshore_CA_2000.h5'
with WindX(data_path, hsds=True) as rex_wind:
    ds = rex_wind['windspeed_100m', 1, 1]

This again consistently returns an array of size 0:

ValueError: cannot reshape array of size 0 into shape (1,)

ssolson commented 1 year ago

Raised the issue with the NREL development folks

https://github.com/NREL/developer.nrel.gov/issues/320

ssolson commented 12 months ago

image

ssolson commented 12 months ago

Comparing the prepare-cache and hindcast jobs, we can see that the hindcast test took only 2 minutes. image

ssolson commented 9 months ago

Something is going on with Python 3.9 and the test_kde_copula test for the contours. I have not modified this file, so the cause is most likely a dependency change. The test completes without error using 3.9 on my local machine. image

ssolson commented 9 months ago

Matplotlib (https://github.com/matplotlib/matplotlib) released v3.8.0 5 days ago, and this release deprecated the attribute used in the following lines (1554:1556) in contours.py:

    for i, seg in enumerate(vals.allsegs[0]):
        x1_bivariate_KDE.append(seg[:, 1])
        x2_bivariate_KDE.append(seg[:, 0])

Specifically, vals is a variable of type QuadContourSet, and the updated call should be:

    for i, seg in enumerate(vals.get_paths()):
        x1_bivariate_KDE.append(seg.vertices[:, 1])
        x2_bivariate_KDE.append(seg.vertices[:, 0])
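
One way to support both old and new Matplotlib releases is to check for the newer accessor at runtime and fall back to the older attribute. The sketch below is not the change made in this PR: the helper name contour_vertices is hypothetical, and it assumes that contour sets from releases before v3.8.0 do not expose get_paths().

```python
def contour_vertices(contour_set):
    """Return a list of vertex arrays from a contour set.

    Hypothetical helper for illustration. Assumption: contour sets from
    Matplotlib >= 3.8 expose get_paths(), while older ones only offer
    allsegs, so the newer accessor is tried first.
    """
    get_paths = getattr(contour_set, "get_paths", None)
    if callable(get_paths):
        # Newer API: each Path carries its coordinates in .vertices.
        return [path.vertices for path in get_paths()]
    # Older API: allsegs[0] holds the segments of the first contour level.
    return list(contour_set.allsegs[0])
```

Each returned array can then be sliced as before, for example seg[:, 1] and seg[:, 0]. Note that in v3.8.0 get_paths() yields one path per contour level, so a level with several disjoint segments may come back as a single combined array rather than one array per segment.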

ssolson commented 9 months ago

😃 Of course, the fix is not backwards compatible with 3.8. image

ssolson commented 9 months ago

Python 3.7 is failing for all tests due to a Dolfyn method and a new xarray release.

Python 3.7 was released on 27 June 2018, making it over 5 years old. My vote is to drop 3.7 support in this PR and, in a follow-on PR, add Python 3.10 support for the next MHKiT release.

image