Basic process
In terms of regression, we target specific documents and assert the values shown in the database. A failure of these tests would simply alert us that the behaviour of the ETL process has changed.
The plan is to use xarray functionality to grab a raw value from our GRIB file, then manually work through all the calculations and store the result in a variable. We then assert that this variable matches what we have in our database, as opposed to asserting a hard-coded value such as 10.321472471264862.
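For illustration, a minimal pytest sketch of that pattern, assuming a MongoDB-style document store; the connection string, collection, and field names here are hypothetical, and the expected-value helper stands in for the calculation shown under "Example code" below:

```python
import pymongo

MONGO_URI = "mongodb://localhost:27017"  # hypothetical connection details


def expected_pm2_5_from_grib() -> float:
    # Placeholder for the manual xarray/scipy calculation shown under
    # "Example code" below.
    raise NotImplementedError


def test_forecast_pm2_5_matches_manual_calculation():
    expected = expected_pm2_5_from_grib()
    db = pymongo.MongoClient(MONGO_URI)["air_quality"]  # hypothetical names
    document = db["forecast_data"].find_one({"name": "some_city"})
    # Single-level data matched exactly in our checks; for multi-level
    # data, pytest.approx with a tight tolerance is safer.
    assert document["pm2_5"]["value"] == expected
```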
Some extra modifications to run_forecast_etl are also needed to allow the base date and time to be overridden from their default values (a possible shape for this is sketched below).
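A minimal sketch of what that override could look like, assuming run_forecast_etl is driven from the command line; the flag names and formats are hypothetical:

```python
import argparse
from datetime import datetime, timezone


def parse_override_args() -> argparse.Namespace:
    # Hypothetical flags so tests can pin the ETL to a known GRIB file
    # instead of deriving the base date/time from "now".
    parser = argparse.ArgumentParser(description="Run the forecast ETL")
    parser.add_argument(
        "--base-date",
        default=None,
        help="Override the forecast base date, e.g. 2024-06-04",
    )
    parser.add_argument(
        "--base-time",
        default=None,
        help="Override the forecast base time, e.g. 00 or 12",
    )
    return parser.parse_args()


args = parse_override_args()
base_datetime = (
    datetime.strptime(f"{args.base_date} {args.base_time}", "%Y-%m-%d %H")
    if args.base_date and args.base_time
    else datetime.now(timezone.utc)  # current default behaviour
)
```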
As testers, we have used xarray ourselves to open a known GRIB file and manually perform the calculations required to reach the final value stored in the DB. Having obtained an exact match for single-level data and a very near match for multi-level data, we can confidently assert exact values coming from the database.
Example code
```python
import pprint

import scipy.interpolate
import xarray as xr


def epic_interpolation():
    # Open a known single-level GRIB file with the cfgrib engine.
    file_path = "single_level_2024-06-04_00.grib"
    ds = xr.open_dataset(file_path, engine="cfgrib")
    latitudes = ds["latitude"].values
    longitudes = ds["longitude"].values
    pm2_5_data = ds["pm2p5"].isel(step=0).values

    # Target coordinates; the grid uses 0-360 longitudes.
    target_lat = 25.0657
    target_lon = 55.17128
    if target_lon < 0:
        target_lon += 360

    # NB: interp2d is deprecated and was removed in SciPy 1.14; see the
    # alternative sketch below.
    interpolator = scipy.interpolate.interp2d(longitudes, latitudes, pm2_5_data, kind="linear")
    pm2_5_value = interpolator(target_lon, target_lat)

    # Scale to the units stored in the DB (kg/m^3 -> ug/m^3).
    pprint.pprint(pm2_5_value[0] * 10**9)
```
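Since scipy.interpolate.interp2d was deprecated in SciPy 1.10 and removed in 1.14, here is a sketch of the same lookup using xarray's built-in DataArray.interp (linear by default), which should give the same bilinear result on this regular grid; it assumes the same coordinate names as the snippet above:

```python
import xarray as xr

ds = xr.open_dataset("single_level_2024-06-04_00.grib", engine="cfgrib")

# Linear interpolation at the target point; the longitude must already
# be in the grid's 0-360 range.
value = ds["pm2p5"].isel(step=0).interp(latitude=25.0657, longitude=55.17128)

# Same kg/m^3 -> ug/m^3 scaling as above.
print(float(value) * 10**9)
```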
Acceptance Criteria: