Open hjaekel opened 1 year ago
This is a known problem. It has to do with running tests in parallel during make check. There is a race condition that we have not yet found. If you re-run make check, the odds are good that it will work.
Well, this can be fixed by adding the right line to Makefile.am. See https://stackoverflow.com/questions/17172310/make-disable-parallel-building-in-subdirectory-for-single-target-only.
We use Ninja, so I guess the change in the Makefile will have no effect. I tried ctest -j 1 and got the same test failure as before. Finally I switched to:
CTEST_OUTPUT_ON_FAILURE=1 ctest -R "ncdump_tst_netcdf4_4"
CTEST_OUTPUT_ON_FAILURE=1 ctest -E "ncdump_tst_netcdf4_4 nc_test4_tst_large2"
This should prevent race conditions from occurring. However, the test still fails on x86. The failure is reproducible and only happens on x86. On all other platforms (aarch64, armhf, armv7, ppc64le and x86_64) the test runs successfully. You can see the CI pipelines here: https://gitlab.alpinelinux.org/hjaekel/aports/-/pipelines/153046
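A side note on the second command: ctest's -E option takes a single regular expression, so a space between two test names is matched literally instead of acting as "or". The sketch below illustrates this with Python's re module as a stand-in (CMake's regex dialect differs, but plain names and | behave the same way in both). The third test name is hypothetical and only there to show a test that should not be excluded.

```python
import re

# The first two names come from the commands above; the third is a
# hypothetical extra test used only for illustration.
TEST_NAMES = ["ncdump_tst_netcdf4_4", "nc_test4_tst_large2", "ncdump_run_tests"]


def excluded(pattern, names=TEST_NAMES):
    """Return the names that a search-style regex match would exclude."""
    return [name for name in names if re.search(pattern, name)]


# A space is a literal character, so no single test name matches.
space_separated = excluded("ncdump_tst_netcdf4_4 nc_test4_tst_large2")

# A pipe expresses alternation, so both tests are excluded.
pipe_separated = excluded("ncdump_tst_netcdf4_4|nc_test4_tst_large2")

print(space_separated)
print(pipe_separated)
```

If the intent is to skip both tests, a pattern of the form "ncdump_tst_netcdf4_4|nc_test4_tst_large2" is probably what is needed.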
I have the same FAIL in check. (46 PASS and 1 FAIL)
I tried to run the script netcdf-c/ncdump/tst_netcdf4_4.sh independently and I think the problem is related to ncgen.
The filter applied to variable 5 changes:
var5:_Filter = "3|2,40|1,2"
--> var5:_Filter = "3|2,36|1,2"
Could you help me?
Thanks
After a quick look, I think this may be a compound type packing problem. Specifically, the middle filter, 2, refers to the shuffle filter. It technically has no argument, but apparently the size of the compound type is being included as an argument for the filter. So in this case, the baseline file assumes that the compound type size is 40, but on the platform/compiler you are using, it has a size of 36. I will investigate, if I can, whether my speculation is correct. Do you know what compiler and compiler version you are using?
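To make the comparison concrete, here is a minimal sketch that splits the two _Filter strings into their individual filters and reports which one differs. It assumes the attribute is a |-separated chain of "id,param,..." entries and uses the HDF5 built-in filter IDs (1 = deflate, 2 = shuffle, 3 = fletcher32). The helper names are made up for this example and are not part of netcdf-c.

```python
# HDF5 built-in filter IDs.
BUILTIN_FILTERS = {1: "deflate", 2: "shuffle", 3: "fletcher32"}


def parse_filter_attr(value):
    """Split a _Filter string such as "3|2,40|1,2" into (id, [params]) tuples."""
    chain = []
    for spec in value.split("|"):
        numbers = [int(n) for n in spec.split(",")]
        chain.append((numbers[0], numbers[1:]))
    return chain


def diff_filter_attrs(expected, actual):
    """Return (filter name, expected params, actual params) for each mismatch.

    Assumes both chains have the same number of filters.
    """
    mismatches = []
    pairs = zip(parse_filter_attr(expected), parse_filter_attr(actual))
    for (exp_id, exp_params), (act_id, act_params) in pairs:
        if (exp_id, exp_params) != (act_id, act_params):
            name = BUILTIN_FILTERS.get(exp_id, str(exp_id))
            mismatches.append((name, exp_params, act_params))
    return mismatches


print(diff_filter_attrs("3|2,40|1,2", "3|2,36|1,2"))
# prints [('shuffle', [40], [36])]
```

Only the shuffle entry differs, which is consistent with the element size being the value that changes between platforms.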
Thanks a lot for the support. The compiler version is GCC 4.4.7 20120313 (Red Hat 4.4.7-18).
That is a pretty old version of gcc, I think. I am not sure we can fix the problem if it is a struct type packing issue. Any chance you could test against a much more recent version of gcc? Perhaps you have a similar platform with a newer version of gcc?
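As a rough illustration of how struct packing alone can turn 40 into 36, the sketch below uses Python's struct module to compare a natively aligned layout with an unpadded one. The layout (one 4-byte int followed by four 8-byte doubles) is hypothetical and was chosen only because it reproduces the two numbers; it is not the compound type used by tst_netcdf4_4.sh.

```python
import struct

# Hypothetical layout, NOT the compound type from the test:
# one 4-byte int followed by four 8-byte doubles.
LAYOUT = "i4d"

# "@" uses the C compiler's native sizes and alignment for this platform.
native_size = struct.calcsize("@" + LAYOUT)

# "=" uses standard sizes with no alignment padding at all.
packed_size = struct.calcsize("=" + LAYOUT)

print("native:", native_size, "packed:", packed_size)
```

On platforms that align double to 8 bytes, such as x86_64, the native size is 40 because 4 bytes of padding follow the int. The 32-bit x86 System V ABI aligns double to 4 bytes inside structs, so the native size there would be 36, the same as the packed size.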
I know that the GCC version is very old, but I can't update it at the moment; it's a constraint. From what I understand, since the failure also occurs when I run only the test netcdf-c/ncdump/tst_netcdf4_4.sh, I can rule out a race condition from parallel test execution and attribute the failure to the compiler. Is this correct? This would mean that the library compiled in this way should be considered "corrupted" and could cause problems in use. Thanks.
I have not investigated thoroughly, but yes, in my opinion, the failure is due to a change in the gcc compiler. Presumably as more people use that compiler version, we will start to see reports of similar failures.
I'm trying to package netcdf-c 4.9.1 on Alpine Linux Edge. The tests pass on all platforms, except for one test: ncdump_tst_netcdf4_4.
I use the following statements to compile and execute the tests: