testing framework - Githubissues

larsbuntemeyer commented 6 months ago

Just some ideas: could we use a Python test framework for data validation? That would be nice, e.g., for generating reports. If we don't want to stop at the first failed assertion, we could use, e.g.:

unittest with subTest
pytest-subtests
pandera (mostly for pandas)
allure
xarray-schema
tsdat
pydantic
compliance-checker plugin (https://github.com/AtMoDat/cc-plugin-cmip6-cv)

larsbuntemeyer commented 6 months ago

Example using subtests:

import unittest

class Car(object):
    def __init__(self, make, model):
        self.make = make
        self.model = make  # Copy and paste error: should be model.
        self.has_seats = True
        self.wheel_count = 3  # Typo: should be 4.

class CarTest(unittest.TestCase):
    def test_init(self):
        make = "Ford"
        model = "Model T"
        car = Car(make=make, model=model)
        with self.subTest(msg='Car.make check'):
            self.assertEqual(car.make, make)
        with self.subTest(msg='Car.model check'):
            self.assertEqual(car.model, model)
        with self.subTest(msg='Car.has_seats check'):
            self.assertTrue(car.has_seats)
        with self.subTest(msg='Car.wheel_count check'):
            self.assertEqual(car.wheel_count, 4)

if __name__ == "__main__":
    unittest.main()

larsbuntemeyer commented 6 months ago

Example using pydantic:

from pydantic import BaseModel, field_validator

class User(BaseModel):
    username: str
    password: str
    age: int

    @field_validator('password')
    @classmethod
    def password_must_be_strong(cls, v):
        if len(v) < 16:
            raise ValueError('Password must be at least 16 characters long.')
        return v  # validators must return the (possibly modified) value

    @field_validator('username')
    @classmethod
    def username_must_be_strong(cls, v):
        if len(v) < 4:
            raise ValueError('Username must be at least 4 characters long.')
        return v

# Validate incoming user_data
# (raises a ValidationError here: username and password are both too short)
user_data = {'username': 'App', 'password': 'password', 'age': 25}
user = User(**user_data)
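
pydantic reports the errors of all fields in a single ValidationError, which fits the goal of not stopping at the first failure. Below is a rough sketch of how the individual findings could be collected for a report, assuming pydantic v2 (where field_validator is available). The variable name failed_fields is only illustrative.

```python
from pydantic import BaseModel, ValidationError, field_validator


class User(BaseModel):
    username: str
    password: str
    age: int

    @field_validator("password")
    @classmethod
    def password_must_be_strong(cls, v):
        if len(v) < 16:
            raise ValueError("Password must be at least 16 characters long.")
        return v

    @field_validator("username")
    @classmethod
    def username_must_be_strong(cls, v):
        if len(v) < 4:
            raise ValueError("Username must be at least 4 characters long.")
        return v


user_data = {"username": "App", "password": "password", "age": 25}

failed_fields = []
try:
    User(**user_data)
except ValidationError as exc:
    # exc.errors() returns one dict per failed field, including
    # the field location ("loc") and the message ("msg").
    for err in exc.errors():
        failed_fields.append(err["loc"][0])
        print(err["loc"], err["msg"])
```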

larsbuntemeyer commented 6 months ago

It seems that data validation for n-d arrays/xarray is not really implemented yet. I guess for now I will simply work with a logger and logging levels...
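
A minimal sketch of what the logger approach could look like, using only the standard library. The function check_attrs, the logger name, and the attribute names are hypothetical placeholders, and a plain dict stands in for the global attributes of a dataset. Errors and warnings are logged instead of raised, so all checks run, and a small handler counts the findings per level for a summary.

```python
import logging


class CountingHandler(logging.Handler):
    """Count emitted records per level name so a summary can be printed."""

    def __init__(self):
        super().__init__()
        self.counts = {}

    def emit(self, record):
        self.counts[record.levelname] = self.counts.get(record.levelname, 0) + 1


logger = logging.getLogger("cmor_check_sketch")  # hypothetical logger name
logger.setLevel(logging.DEBUG)
logger.propagate = False
counter = CountingHandler()
logger.addHandler(counter)
logger.addHandler(logging.StreamHandler())  # also print findings to stderr


def check_attrs(attrs, required, recommended):
    """Log an error for each missing required attribute and a warning
    for each missing recommended one, without stopping early."""
    for name in required:
        if name not in attrs:
            logger.error("missing required attribute: %s", name)
    for name in recommended:
        if name not in attrs:
            logger.warning("missing recommended attribute: %s", name)


# Illustrative attributes only, not taken from a real file.
attrs = {"domain_id": "EUR-11", "frequency": "day"}
check_attrs(attrs, required=["domain_id", "institution_id"], recommended=["comment"])
print(counter.counts)
```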

larsbuntemeyer commented 6 months ago

Using conftest.py for input parameters with pytest:

import glob

def pytest_addoption(parser):
    # Register the custom command line options.
    parser.addoption(
        "--filename",
        action="store",
        type=str,
        help="glob pattern of files to pass to test functions",
    )
    parser.addoption("--cv", action="store", default="default name")

def pytest_generate_tests(metafunc):
    # Run each test that requests "filename" once per file matching the pattern.
    if "filename" in metafunc.fixturenames:
        metafunc.parametrize(
            "filename", glob.glob(str(metafunc.config.getoption("filename")))
        )

Run with pytest, e.g.:

pytest -s -v cmor_check/tests/test_cmor.py --filename "/work/bb1203/g300046_CMOR/_CMOR/NUKLEUS/output/EUR-11/GERICS/ECMWF-ERA5/evaluation/r1i1p1f1/GERICS-REMO2020/v1/day/tas/v20240402/tas_EUR-11_ECMWF-ERA5_evaluation_r1i1p1f1_GERICS-REMO2020_v1_day_*" --cv /work/bb1203/g300046_CMOR/cmor-tables/Tables/CORDEX-CMIP6_CV.json --html=report.html --self-contained-html

euro-cordex / cmor-check

testing framework #1