Closed · sol1105 closed 6 months ago
@sol1105 The new black release is causing some problems. I'll open a PR!
@Zeitsperre I think I got it, thanks :)
@cehbrecht @Zeitsperre It seems I cannot set the `libnetcdf` / `netcdf-c` version for the PyPI tests? For some reason it installs 4.9.3-development, which behaves differently with regard to the erroneous deflate options in the ATLAS datasets (#317). This version is not available for the conda tests (there, 4.9.2 gets installed). Do you have an idea how to proceed?
@sol1105 `libnetcdf` is not a Python package; it's a system-level library, so you can only install it via `apt`, i.e. `$ apt install libnetcdf-dev`. Anaconda offers a handful of non-Python packages (C, C++, Rust, etc.) in its repositories, `libnetcdf` being one of them.
@sol1105 Is this a case of the system not having `libnetcdf` or `zlib` installed? In that case, you could add them to the job that does the testing by adding a new step:

```yaml
- name: Install libnetcdf and zlib
  run: |
    sudo apt-get update
    sudo apt-get upgrade -y
    sudo apt-get install -y libnetcdf-dev zlib1g-dev
```
@Zeitsperre Thanks for your reply. I do not know what the actual reason for the problem is. I added a few test datasets of the IPCC ATLAS datasets, incl. tests. The problem is that the ATLAS CMIP5/6 and ATLAS CORDEX datasets include string variables with illegal filters. The `netcdf-c` library had some breaking changes from 4.9.0 onwards, and likely again in 4.9.3, in how it deals with such illegal filters. My intention was to simply remove these filters for the affected variables.

In the conda tests, everything works as expected. In the PyPI+tox tests, however, the affected string variables suddenly have a different data type (e.g. `'dtype': dtype('<U35')` instead of `'dtype': <class 'str'>`), and the encoding dictionaries of these variables are empty apart from the dtype specification. This is why the tests fail. I could not reproduce this locally, even with libnetcdf 4.9.3-development and the latest xarray installed, as in the PyPI+tox tests. It looks to me as if some library addresses this filter problem in that particular way, but I cannot identify which one.

I guess I will just set up a fix for the cases in which the filters are part of the encoding dictionary, and no longer try to make sense of what happens in the background.
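A fix along those lines could be sketched as follows (a hypothetical helper for illustration, not the actual clisops implementation): drop the compression-related entries from the encoding dictionaries of string variables before writing, since variable-length string types cannot carry deflate filters.

```python
import numpy as np
import xarray as xr


def strip_illegal_filters(ds: xr.Dataset) -> xr.Dataset:
    """Drop compression-related encoding entries from string variables.

    zlib/deflate filters are not valid for variable-length string types,
    and different libnetcdf versions handle such illegal filters differently.
    """
    for var in ds.variables.values():
        # xarray represents netCDF strings as object- or unicode-typed arrays
        if var.dtype == object or var.dtype.kind == "U":
            for key in ("zlib", "complevel", "shuffle", "fletcher32", "compression"):
                var.encoding.pop(key, None)
    return ds
```

Calling this on a dataset right before `to_netcdf` would leave numeric variables untouched while sanitising the string variables' encodings.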
Changes Missing Coverage | Covered Lines | Changed/Added Lines | %
---|---|---|---
clisops/ops/base_operation.py | 26 | 27 | 96.3%
Total: | 26 | 27 | 96.3%

Totals |
---|---
Change from base Build 7613866827: | 0.3%
Covered Lines: | 1759
Relevant Lines: | 2422
@Zeitsperre I could now reproduce the issue locally, after installing the latest xarray via pip rather than from source. I set up two fixes that we require for the ATLAS datasets. Since the ATLAS datasets are provided with compression level 9, one of the fixes caps the compression level at 1 to reduce the write time of the requests (by a factor of 5 at least in some tests, e.g. from 10 min to 2 min, while causing no significant change in the output file size). I do not know if that cap can be considered a breaking change; I would say no. But in case you are not happy with these fixes becoming part of the general `clisops`, they could also be part of `rook`.

If all is fine, this can be merged once the connected `roocs_utils` PR is part of a release.
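For illustration, such a compression cap could look like the sketch below (a hypothetical helper; the actual fix lives in clisops/roocs_utils and may differ): clamp the `complevel` entry of each variable's encoding before writing.

```python
import numpy as np
import xarray as xr


def cap_compression_level(ds: xr.Dataset, max_level: int = 1) -> xr.Dataset:
    """Cap the deflate compression level in each variable's encoding.

    Level 9 typically shrinks climate data only marginally compared to
    level 1, but can multiply the write time.
    """
    for var in ds.variables.values():
        level = var.encoding.get("complevel")
        if isinstance(level, int) and level > max_level:
            var.encoding["complevel"] = max_level
    return ds
```

Other encoding entries (e.g. `zlib`) are left as-is, so compression stays enabled, just at a cheaper level.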
@sol1105 Would you like to update the requirements to `roocs-utils >= 0.6.7`?
Pull Request Checklist:
- AUTHORS.md

What kind of change does this PR introduce?:
This PR adds tests to warrant support of ATLAS datasets. See also https://github.com/roocs/roocs-utils/pull/113. The environment changes are preliminary and will be reverted after the next `roocs_utils` release.

Does this PR introduce a breaking change?:
No.

Other information: