Closed jhardenberg closed 1 year ago
The test is failing, but it does not fail if I run it locally on my machine. The error message contains:
_FAILED tests/basic_test.py::test_plev_gaussian[con] - ValueError: Dimensions {'num_links'} do not exist. Expected one or more of Frozen({'src_grid_rank': 2, 'dst_grid_rank': 2, 'src_grid_size': 131072, 'dst_grid_size': 64800, 'numLinks': 378448, 'numwgts': 1})
I wonder....is a different version of CDO creating weights where num_links
is called numLinks
?
Confirmed: The tests pull in CDO 2.2.0 which now has a dimension `numLinks' instead of 'num_links' which cdo 2.1.1 was using !
this should fix it ....
So I added new fesom3d file to test the 3d level interpolation, and unfortunately on my machine the test is not passed when using a conservative remapping. On the other hand, it works if we are using nearest neighbour. This might be a problem on the file properties, but we should double check it.
I might have found the issue: apparently the mask is not applied in the smmregrid remapping. As you can see below, the netcdf file generated with smmregrid is not having any missing value. CDO is also not doing a great job since the mask is the same for all the levels, but at least is putting NaN where it should. Need to understand why the mask is not propagated to all the layers.
Files are on levante: /home/b/b382076/smmregrid
[b382076@levante0 smmregrid]$ cdo info smm.nc
Warning (cdfInqContents): Coordinates variable variable can't be assigned!
Warning (cdfInqContents): Coordinates variable time can't be assigned!
-1 : Date Time Level Gridsize Miss : Minimum Mean Maximum : Parameter ID
1 : 1986-12-31 23:45:00 2.5 64800 0 : -1.8583 8.5943 29.782 : -1
2 : 1986-12-31 23:45:00 7.5 64800 0 : -1.8579 8.5634 29.743 : -1
3 : 1986-12-31 23:45:00 15 64800 0 : -1.8572 8.4829 29.737 : -1
4 : 1986-12-31 23:45:00 25 64800 0 : -1.8891 8.3123 29.772 : -1
5 : 1986-12-31 23:45:00 35 64800 0 : -1.9698 8.0983 29.848 : -1
6 : 1986-12-31 23:45:00 45 64800 0 : -1.9717 7.8931 29.867 : -1
7 : 1986-12-31 23:45:00 55 64800 0 : -1.9649 7.7073 29.653 : -1
8 : 1986-12-31 23:45:00 65 64800 0 : -1.9316 7.5341 29.381 : -1
9 : 1986-12-31 23:45:00 75 64800 0 : -1.8682 7.3743 29.106 : -1
10 : 1986-12-31 23:45:00 85 64800 0 : -1.8797 7.2592 28.964 : -1
11 : 1986-12-31 23:45:00 95 64800 0 : -1.8890 7.1679 28.629 : -1
12 : 1986-12-31 23:45:00 107.5 64800 0 : -1.8894 7.0171 27.890 : -1
13 : 1986-12-31 23:45:00 125 64800 0 : -1.8556 6.7721 26.437 : -1
14 : 1986-12-31 23:45:00 147.5 64800 0 : -1.8560 6.4505 24.660 : -1
15 : 1986-12-31 23:45:00 175 64800 0 : -1.8315 6.0740 23.638 : -1
16 : 1986-12-31 23:45:00 210 64800 0 : -1.7406 5.6692 22.056 : -1
17 : 1986-12-31 23:45:00 255 64800 0 : -1.7368 5.2361 20.197 : -1
18 : 1986-12-31 23:45:00 310 64800 0 : -1.7269 4.7805 19.198 : -1
19 : 1986-12-31 23:45:00 375 64800 0 : -1.6926 4.3251 18.713 : -1
20 : 1986-12-31 23:45:00 450 64800 0 : -1.6529 3.8692 17.744 : -1
21 : 1986-12-31 23:45:00 535 64800 0 : -1.6541 3.4431 16.821 : -1
22 : 1986-12-31 23:45:00 630 64800 0 : -0.57297 3.0463 15.154 : -1
23 : 1986-12-31 23:45:00 735 64800 0 : -0.75147 2.6816 12.663 : -1
24 : 1986-12-31 23:45:00 850 64800 0 : -0.80117 2.3456 10.665 : -1
25 : 1986-12-31 23:45:00 975 64800 0 : -0.85103 2.0424 10.353 : -1
26 : 1986-12-31 23:45:00 1110 64800 0 : -0.95007 1.7964 9.8227 : -1
27 : 1986-12-31 23:45:00 1255 64800 0 : -1.0251 1.5809 8.9991 : -1
28 : 1986-12-31 23:45:00 1415 64800 0 : -1.1271 1.3782 8.0320 : -1
29 : 1986-12-31 23:45:00 1600 64800 0 : -1.1997 1.1908 6.9279 : -1
30 : 1986-12-31 23:45:00 1810 64800 0 : -1.2550 1.0260 5.7755 : -1
31 : 1986-12-31 23:45:00 2035 64800 0 : -1.2895 0.88942 4.5920 : -1
32 : 1986-12-31 23:45:00 2275 64800 0 : -1.3196 0.77188 3.7286 : -1
33 : 1986-12-31 23:45:00 2525 64800 0 : -1.3540 0.65899 3.2751 : -1
34 : 1986-12-31 23:45:00 2775 64800 0 : -1.3814 0.56712 2.9536 : -1
35 : 1986-12-31 23:45:00 3025 64800 0 : -1.3983 0.49131 2.7294 : -1
36 : 1986-12-31 23:45:00 3275 64800 0 : -0.88223 0.41555 2.5675 : -1
37 : 1986-12-31 23:45:00 3525 64800 0 : -0.88830 0.32844 2.2401 : -1
38 : 1986-12-31 23:45:00 3775 64800 0 : -0.93917 0.26508 2.0859 : -1
39 : 1986-12-31 23:45:00 4025 64800 0 : -1.0316 0.18784 1.9207 : -1
40 : 1986-12-31 23:45:00 4275 64800 0 : -1.1215 0.13231 1.8011 : -1
41 : 1986-12-31 23:45:00 4525 64800 0 : -1.1943 0.090269 1.7431 : -1
42 : 1986-12-31 23:45:00 4775 64800 0 : -1.1988 0.065693 1.6709 : -1
43 : 1986-12-31 23:45:00 5025 64800 0 : -1.2062 0.044022 1.5195 : -1
44 : 1986-12-31 23:45:00 5275 64800 0 : -0.56853 0.022048 1.2176 : -1
45 : 1986-12-31 23:45:00 5525 64800 0 : 0.0000 0.0053951 1.0966 : -1
46 : 1986-12-31 23:45:00 5825 64800 0 : 0.0000 0.0000 0.0000 : -1
47 : 1986-12-31 23:45:00 6125 64800 0 : 0.0000 0.0000 0.0000 : -1
I think that the problem is here:
# this section is used to create a target mask initializing the CDO weights
if not vert_coord:
self.weights = mask_weights(self.weights, self.weights_matrix)
self.masked = check_mask(self.weights)
else:
self.masked = False
self.space_dims = space_dims
I added also nemo 3d data, which are not working in both conservative and nearest neighbour (likely for the mask issue)
So at least locally the tests are passed, with a couple of modification to the way the masks are computed. However, the code is currently a REAL MESS so that we need to clean it up, and uniform the different 2d/3d calls. Also, it would be great if we can avoid specifying vert_coord
but the interpolation do it by himself. Need to double check also the speed now. We are close anyway!
I made a lot of cleaning, including automatic vert_coord discovery and splitting the regrid.py in multiple files. the generation is still very much cdo-dependent, but it is a bit easier to browse now.
We need to explore if we can exploit xarray ufunc to speed up some of the 3d computations
So I had to modify a few things to be able to initialize the regridder using the weights. This actually should be a new family of tests which is not taken into consideration so far
Big difference so far is that I set the vert_coord
also into the weights file so that it can be automatically interpreted when loading the weights
It does not seem to be super efficient with FESOM... need probably to check the process...
NVars | NRecords | CDO | CDO (NoLoad) | SMM (Dataset) | SMM (DataArray) | SMM (DataArray+NoLoad) | SMM (Dataset+Write) | SMM (DataArray+Write) | |
---|---|---|---|---|---|---|---|---|---|
so3d-nemo.nc | 1 | (2, 18, 292, 362) | 1 | 0.989918 | 0.029745 | 0.0292497 | 0.0126237 | 0.0345753 | 0.0343045 |
tas-ecearth.nc | 1 | (12, 256, 512) | 1 | 0.956647 | 0.300138 | 0.232395 | 0.0717874 | 0.31714 | 0.300978 |
tas-healpix2.nc | 1 | (12, 12288) | 1 | 0.930404 | 0.225058 | 0.154463 | 0.0380838 | 0.245379 | 0.238894 |
onlytos-ipsl.nc | 1 | (12, 332, 362) | 1 | 0.950344 | 0.307111 | 0.236941 | 0.0715755 | 0.320038 | 0.302833 |
2t-era5.nc | 1 | (12, 73, 144) | 1 | 0.922386 | 0.198507 | 0.141997 | 0.0386355 | 0.246714 | 0.211525 |
tos-fesom.nc | 1 | (12, 126859) | 1 | 0.970594 | 0.217977 | 0.188369 | 0.0419147 | 0.239474 | 0.242006 |
ua-ecearth.nc | 1 | (2, 19, 256, 512) | 1 | 0.877348 | 0.430238 | 0.37449 | 0.136358 | 0.658567 | 0.612724 |
mix-cesm.nc | 4 | (12, 192, 288) | 1 | 0.842065 | 0.499367 | 0.13384 | 0.0380408 | 0.740233 | 0.194893 |
temp3d-fesom.nc | 1 | (1, 47, 3140) | 1 | 0.825355 | 2.79075 | 2.73592 | 1.53475 | 3.16227 | 3.13012 |
@jhardenberg, would you like to review this and if you are happy then merge? So that we can try to export in other projects...
Hi, I am doing some final tests myself and will give my thumbs up later
@oloapinivad one silly question: how do I access the log or change log level?
I am not using any internal logger, but the main one. So you can change the root
level of logging. This is something which needs to be updated with a proper logger for the class as done elsewhere
The fact that the python 3.7 test is failing is a bit of a mistery....the logs actually seem to indicate that it is running with python 3.10 ????
platform linux -- Python 3.10.11, pytest-7.3.1, pluggy-1.0.0
Flake8 apparently installs python3.10, WTF
All looks good. @oloapinivad should we merge? The speed issues which you noticed can be examined in a separate PR
This is an implementation of 3D regridding. When
vert_coord
is specified 3D weights can be generated and used. The weights are stored in a 3d array with the same number of levels as the original data to be regridded. It supports parallel processing (we can specify a number of processorsnproc
) for weight generation. This is needed since the process can be extremely time consuming and at the same time memory intensive. Regridding itself, once the weights have been generated, is extremely fast.Solves #7