EcoExtreML / zampy

Tool for downloading Land Surface Model input data
https://zampy.readthedocs.io/
Apache License 2.0

Add soil temperature and soil moisture to ERA5 data, STEMMUS_SCOPE recipe #53

Closed BSchilperoort closed 3 months ago

BSchilperoort commented 9 months ago

To add these variables, I had to merge the files and add a "depth" dimension, because the original files are split by layer...
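For illustration, the merge step could be sketched roughly as below. This is not the code from the PR: it uses plain numpy arrays to stay self-contained (an xarray concatenation along a new dimension would be the closer equivalent), and `merge_layers`, the dummy per-layer fields and the choice of layer midpoints as depth coordinate are all assumptions. The layer bounds are those of the four ERA5-Land soil layers.

```python
import numpy as np

# ERA5-Land soil layer bounds in metres (layer 1 is the top layer).
LAYER_BOUNDS = {1: (0.0, 0.07), 2: (0.07, 0.28), 3: (0.28, 1.0), 4: (1.0, 2.89)}


def merge_layers(layers):
    """Stack per-layer (time, lat, lon) arrays into (time, depth, lat, lon).

    `layers` maps a layer number to its array. Using the layer midpoint as
    the depth coordinate is an assumption made for this sketch.
    """
    order = sorted(layers)
    depth = np.array([sum(LAYER_BOUNDS[i]) / 2 for i in order])
    data = np.stack([layers[i] for i in order], axis=1)
    return depth, data


# Dummy per-layer fields standing in for the separate per-layer files.
n_time, n_lat, n_lon = 3, 2, 2
layers = {i: np.full((n_time, n_lat, n_lon), 280.0 + i) for i in LAYER_BOUNDS}

depth, soil_temperature = merge_layers(layers)
```

Here `soil_temperature` has one array axis per dimension, with `depth` holding the coordinate values for the new second axis.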

Closes #47

Note that the CDS is slow, so running the recipe might take a while. Running it overnight is probably best.

Example recipe:

# config (folder, login info, etc. go in a ~/.zampy/config file)
name: "soil temperature test"

download:
  time: ["2019-01-01", "2019-01-31"]
  bbox: [54, 6, 53, 5] # NESW
  datasets:
    era5_land:
      variables:
        - soil_temperature
        - soil_moisture

convert:
  convention: ALMA
  frequency: 1H  # outputs at 1 hour frequency. Pandas-like freq-keyword.
  resolution: 0.25  # output resolution in degrees.
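
Since the `bbox` order (north, east, south, west) is easy to get wrong, a small sanity check on the download section can help before starting a slow CDS download. The sketch below is a hypothetical helper, not part of zampy; `check_download_section` and its return format are assumptions, and it assumes longitudes in the -180 to 180 range with no antimeridian crossing.

```python
from datetime import date


def check_download_section(download):
    """Validate the `time` and `bbox` entries of a recipe's download section.

    Hypothetical helper: zampy's own validation may differ.
    """
    north, east, south, west = download["bbox"]  # NESW order, as in the recipe
    if not (-90 <= south < north <= 90):
        raise ValueError(f"expected south < north, got S={south}, N={north}")
    if not (-180 <= west < east <= 180):
        raise ValueError(f"expected west < east, got W={west}, E={east}")

    start, end = (date.fromisoformat(t) for t in download["time"])
    if end < start:
        raise ValueError(f"end date {end} is before start date {start}")

    return {"north": north, "east": east, "south": south, "west": west,
            "start": start, "end": end}


# Values taken from the example recipe above.
download = {"time": ["2019-01-01", "2019-01-31"], "bbox": [54, 6, 53, 5]}
checked = check_download_section(download)
```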

BSchilperoort commented 4 months ago

@SarahAlidoost you should now be able to run the following recipe, which downloads all input data for STEMMUS_SCOPE for half a year. To make this recipe work for longer time periods, we'll have to add a feature that fills in NaNs when the requested start/end time falls outside the period a dataset covers.

# config (folder, login info, etc. go in a ~/.zampy/config file)
name: "STEMMUS_SCOPE_input"

download:
  time: ["2020-01-01", "2020-06-30"]
  bbox: [60, 10, 50, 0] # NESW
  datasets:
    era5_land:
      variables:
        - air_temperature
        - dewpoint_temperature
        - soil_temperature
        - soil_moisture
    era5:
      variables:
        - total_precipitation
        - surface_thermal_radiation_downwards
        - surface_solar_radiation_downwards
        - surface_pressure
        - eastward_component_of_wind
        - northward_component_of_wind
    eth_canopy_height:
      variables:
        - height_of_vegetation
    fapar_lai:
      variables:
        - leaf_area_index
    land_cover:
      variables:
        - land_cover
    prism_dem_90:
      variables:
        - elevation
    cams:
      variables:
        - co2_concentration

convert:
  convention: ALMA
  frequency: 1H  # outputs at 1 hour frequency. Pandas-like freq-keyword.
  resolution: 0.25  # output resolution in degrees.
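
The NaN-filling feature mentioned above does not exist yet, so the following is only a rough sketch of the idea. It puts a series on a regular time grid spanning the requested period and fills every time step without data with NaN. `pad_with_nan` is a hypothetical name, and it uses plain lists rather than the xarray objects a real implementation would likely operate on.

```python
import math
from datetime import datetime, timedelta


def pad_with_nan(times, values, start, end, step=timedelta(hours=1)):
    """Return (times, values) on a regular grid from start to end (inclusive).

    Time steps that are not present in the input get NaN as their value.
    """
    known = dict(zip(times, values))
    out_times, out_values = [], []
    current = start
    while current <= end:
        out_times.append(current)
        out_values.append(known.get(current, math.nan))
        current += step
    return out_times, out_values


# The dataset only covers 02:00-03:00, but 00:00-04:00 is requested.
data_times = [datetime(2020, 1, 1, 2), datetime(2020, 1, 1, 3)]
data_values = [1.5, 2.5]
padded_times, padded_values = pad_with_nan(
    data_times, data_values, datetime(2020, 1, 1, 0), datetime(2020, 1, 1, 4)
)
```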

BSchilperoort commented 4 months ago

Not sure why the Windows tests are failing. I am able to reproduce it on my machine, and it seems like the netCDF4 library or xarray is not releasing the file lock on the netCDF files. This makes the temp dir cleanup fail because it cannot unlink the files.

I did not change that part of the code either, so it is probably caused by a new version of some dependency, or by Dask, since I did switch the CI to Dask distributed (to avoid memory issues with the default scheduler).
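
For reference, the general pattern that avoids this kind of failure is to make sure every handle on a file is closed before the temporary directory is removed; on Windows an open handle blocks the unlink. The sketch below shows it with a plain file and the standard library only. The file name and contents are placeholders (not real netCDF data). With xarray the analogous step would be opening the dataset in a `with xr.open_dataset(path) as ds:` block or calling `ds.close()`; whether that resolves the CI failure here is not confirmed.

```python
import tempfile
from pathlib import Path


def write_and_read(tmp_dir):
    """Write a placeholder file and read it back, closing the handle on exit."""
    path = Path(tmp_dir) / "example.nc"  # placeholder name, not real netCDF
    path.write_bytes(b"placeholder data")
    # The context manager guarantees the handle is released before returning.
    with open(path, "rb") as handle:
        return handle.read()


with tempfile.TemporaryDirectory() as tmp_dir:
    tmp_path = Path(tmp_dir)
    content = write_and_read(tmp_dir)

# Leaving the `with` block removes the directory; this only succeeds on
# Windows if no handle on the file is still open at that point.
cleaned_up = not tmp_path.exists()
```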

SarahAlidoost commented 4 months ago

Looking at the GitHub Actions log, it seems that there are different errors on Windows:

E                   PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'D:\\a\\zampy\\zampy\\tests\\test_data\\fapar-lai\\tmp\\tmpvxggwz_9\\c3s_LAI_20190110000000_GLOBE_PROBAV_V3.0.1.nc'
E           NotADirectoryError: [WinError 267] The directory name is invalid: 'D:\\a\\zampy\\zampy\\tests\\test_data\\fapar-lai\\tmp\\tmpvxggwz_9\\c3s_LAI_20190110000000_GLOBE_PROBAV_V3.0.1.nc'
FAILED tests/test_datasets/test_fapar_lai.py::TestFaparLAI::test_ingest - NotADirectoryError: [WinError 267] The directory name is invalid: 'D:\\a\\zampy\\zampy\\tests\\test_data\\fapar-lai\\tmp\\tmpvxggwz_9\\c3s_LAI_20190110000000_GLOBE_PROBAV_V3.0.1.nc'

I think the Dask workers cause these errors. Could you please refactor the dask.distributed.Client() usage in test_fapar_lai.py::TestFaparLAI::test_ingest to use the client's submit method and the result method of the returned future, and check whether that fixes the errors?
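
The submit/result pattern suggested here looks roughly like the sketch below. To keep it runnable without a Dask installation, it uses `concurrent.futures.ThreadPoolExecutor` from the standard library as a stand-in: `dask.distributed.Client.submit` likewise returns a future whose `result()` blocks until the task has finished. The `ingest` function is a placeholder, not the actual test code.

```python
from concurrent.futures import ThreadPoolExecutor


def ingest(name):
    """Placeholder for the ingest step exercised by the test."""
    return f"ingested {name}"


# Stand-in for `dask.distributed.Client()`; the executor is shut down when
# the `with` block exits, so no worker outlives the work it was given.
with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(ingest, "fapar-lai")
    outcome = future.result()  # blocks until the task is done
```

Waiting on `result()` and shutting the workers down before any temporary files are cleaned up ensures that no worker is still holding a file open at that point.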

BSchilperoort commented 4 months ago

FAPAR dataset test fixed, now a different test fails with a segfault due to rasterio...

The cause must be some dependency that has changed. My old environment still passes all tests fine.

sonarcloud[bot] commented 4 months ago

Quality Gate failed

Failed conditions
55.8% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud