gijzelaerr commented 8 years ago

While building a Casacore 2.1.0 package using the 'official' casacore Debian files for Ubuntu 16.04, the build fails in the testing phase:

99% tests passed, 1 tests failed out of 447

Total Test time (real) =  13.51 sec

The following tests FAILED:
    155 - tArrayColumnCellSlices (Failed)
Errors while running CTest

The Debian files can be found here: https://anonscm.debian.org/git/debian-astro/packages/casacore.git

tammojan commented 8 years ago

Hm, I built 2.1.0 from source on a clean Ubuntu 16.04, and did not get the error.

gijzelaerr commented 8 years ago

Are you sure it was 16.04? That version is not officially released yet.

tammojan commented 8 years ago

Yes, I'm sure.

$ cat /etc/os-release
NAME="Ubuntu"
VERSION="16.04 (Xenial Xerus)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 16.04"
VERSION_ID="16.04"
HOME_URL="http://www.ubuntu.com/"
SUPPORT_URL="http://help.ubuntu.com/"
BUG_REPORT_URL="http://bugs.launchpad.net/ubuntu/"
UBUNTU_CODENAME=xenial

gijzelaerr commented 8 years ago

I have more tests failing.

On the PPA with Ubuntu 16.04, 64 bit:

The following tests FAILED:
     58 - tHashMap (Failed)

https://launchpadlibrarian.net/253434200/buildlog_ubuntu-xenial-amd64.casacore_2.1.0-1xenial1_BUILDING.txt.gz

On the PPA with Ubuntu 16.04, 32 bit:


The following tests FAILED:
    243 - tLSQaips (Failed)
    244 - tLSQFit (Failed)
    271 - tAutoDiff (Failed)
    273 - tClassicalStatistics (Failed)
    278 - tFitToHalfStatistics (Failed)
    281 - tHingesFencesStatistics (Failed)
    288 - tSparseDiff (Failed)
    348 - tLatticeRegion (Failed)
    349 - tLCComplement (Failed)
    350 - tLCConcatenation (Failed)
    351 - tLCDifference (Failed)
    361 - tLCRegion (Failed)
    364 - tLCUnion (Failed)
    419 - tDirectionCoordinate (Failed)

https://launchpadlibrarian.net/253433099/buildlog_ubuntu-xenial-i386.casacore_2.1.0-1xenial1_BUILDING.txt.gz

tammojan commented 8 years ago

It would really help if you could run those tests with --verbose.
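
A minimal sketch of how the failing tests could be re-run serially with verbose output. The helper name `ctest_command` is hypothetical; it only builds the argument list and relies on the standard ctest options `-j`, `-R` and `--verbose`:

```python
import re


def ctest_command(test_names, jobs=1, verbose=True):
    """Build a ctest command line that runs only the named tests.

    Hypothetical helper for illustration; it does not run anything itself.
    """
    # Anchor the regex so that e.g. "tLSQFit" does not also match longer names.
    pattern = "^(" + "|".join(re.escape(name) for name in test_names) + ")$"
    cmd = ["ctest", "-j", str(jobs), "-R", pattern]
    if verbose:
        cmd.append("--verbose")
    return cmd


if __name__ == "__main__":
    # Print the command for two of the tests reported as failing above.
    print(" ".join(ctest_command(["tLSQaips", "tLSQFit"])))
```

The resulting list can be passed to `subprocess.run(cmd, cwd=build_dir)`, where `build_dir` is whatever build directory is in use.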

gijzelaerr commented 8 years ago

Yes, I know, but that is a bit cumbersome to do on the PPAs... I am just posting this here to have a central location for keeping track of this issue.

gijzelaerr commented 8 years ago

OK, I think I know what is wrong. Trying to replicate the problem results in different failed tests each time, but when I run them individually there is no problem. Most likely it is a race condition where some tests are trying to use the same tables/files.

The long-term solution is to make the test suite safe to run in parallel; the short-term solution is to not run the test suite in parallel. @tammojan, do you know how to force ctest into serial mode? Debian automagically runs the test suite in parallel when you build the package in parallel.
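
To illustrate the long-term idea, here is a minimal sketch in which every test works inside its own temporary directory, so identical file names cannot collide. This is not how the casacore test suite is set up: the function names and the file name `scratch_table.data` are made up, and threads stand in for the parallel test processes.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def run_isolated(test_id):
    """Write and read back a scratch file inside a private temporary directory."""
    with tempfile.TemporaryDirectory(prefix="test%d_" % test_id) as workdir:
        # Every test uses the same file name, but in a different directory.
        path = os.path.join(workdir, "scratch_table.data")
        with open(path, "w") as handle:
            handle.write("written by test %d" % test_id)
        with open(path) as handle:
            return handle.read()


def run_all(n_tests=8):
    """Run the tests concurrently and return what each one read back."""
    with ThreadPoolExecutor(max_workers=n_tests) as pool:
        return list(pool.map(run_isolated, range(n_tests)))


if __name__ == "__main__":
    for line in run_all():
        print(line)
```

Each test reads back exactly what it wrote, regardless of how the runs interleave, because no path is shared between them.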

tammojan commented 8 years ago

Perhaps you can override it in the debian/rules file, something like

override_dh_auto_test:
    dh_auto_test --max-parallel=1 -- ARGS=""

olebole commented 8 years ago

OK, I am going to put some more verbosity here (by manually running the tests) on i386. I ran them one-by-one, so there is no multiprocess issue here. I am grouping them by directory:

casa/BasicMath

tMath

$BUILD$/obj-i686-linux-gnu/casa/BasicMath/test$ ./tMath
($BUILD$/casa/BasicMath/test/tMath.cc : 89) Failed AlwaysAssert roundDouble(x) == 21.5
FAIL

olebole commented 8 years ago

scimath/Fitting

tLSQaips

$BUILD$/obj-i686-linux-gnu/scimath/Fitting/test$ ./tLSQaips | diff -u tLSQaips.out -
--- tLSQaips.out    2016-09-23 10:30:22.665419520 +0000
+++ -   2016-09-23 10:37:03.273889640 +0000
@@ -422,8 +422,8 @@
 me:        1.67828e-11, 1.67828e-11
 ---------------------------------------------------
 Complex Non-linear------------
-Iterations: 15
-Ready:      Residual vector too small
+Iterations: 19
+Ready:      Incremental solution too small
 Sol:       (20,0), (25,0), (4,0)
 me:        0, 0
 ---------------------------------------------------

tLSQFit

$BUILD$/obj-i686-linux-gnu/scimath/Fitting/test$ ./tLSQFit | diff -u tLSQFit.out - 
--- tLSQFit.out 2016-09-23 10:30:22.725420439 +0000
+++ -   2016-09-23 11:04:23.501379836 +0000
@@ -523,8 +523,8 @@
 me:         1.67828e-11, 1.67828e-11
 ---------------------------------------------------
 Complex Non-linear------------
-Iterations: 15
-Ready:      Residual vector too small
+Iterations: 19
+Ready:      Incremental solution too small
 Sol:       (20,0), (25,0), (4,0)
 me:        0, 0
 ---------------------------------------------------

olebole commented 8 years ago

scimath/Mathematics

tAutoDiff

$BUILD$/obj-i686-linux-gnu/scimath/Mathematics/test$ ./tAutoDiff 
acos(const AutoDiff<T> &) failed
asin(const AutoDiff<T> &) failed
atan2(const AutoDiff<T> &, const AutoDiff<T> &g) failed
cosh(const AutoDiff<T> &) failed
log(const AutoDiff<T> &) failed
log10(const AutoDiff<T> &) failed
There were 6 errors

tClassicalStatistics

This test looks too strict to me (it compares two floats directly, without any tolerance).

$BUILD$/obj-i686-linux-gnu/scimath/Mathematics/test$ ./tClassicalStatistics 
rms 2.12132
($BUILD$/scimath/Mathematics/test/tClassicalStatistics.cc : 294) Failed AlwaysAssert sd.rms == sqrt(201.5/6.0)

tFitToHalfStatistics

This test looks too strict to me (it compares two floats directly, without any tolerance).

$BUILD$/obj-i686-linux-gnu/scimath/Mathematics/test$ ./tFitToHalfStatistics
($BUILD$/scimath/Mathematics/test/tFitToHalfStatistics.cc : 338) Failed AlwaysAssert fh.getStatistic( StatisticsData::RMS) == sqrt(sumsq/npts)

tHingesFencesStatistics

This test looks too strict to me (it compares two floats directly, without any tolerance).

$BUILD$/obj-i686-linux-gnu/scimath/Mathematics/test$ ./tHingesFencesStatistics
($BUILD$/scimath/Mathematics/test/tHingesFencesStatistics.cc : 273) Failed AlwaysAssert sd.rms == sqrt(201.5/6.0)
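
The three assertions above use `==` on floating-point results. A minimal sketch of the difference between an exact comparison and one with a tolerance follows. The helper `near` is hypothetical: it only borrows its name from the `near(...)` call that appears in tGaussianConvert and is not casacore's implementation, and the tolerance of 1e-12 is an arbitrary choice.

```python
import math


def near(a, b, rel_tol=1e-12):
    """Return True when a and b agree to within a relative tolerance."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=0.0)


# 0.1 + 0.2 and 0.3 are mathematically equal, but the rounded doubles
# differ in the last bit, so an exact comparison fails.
computed = 0.1 + 0.2
expected = 0.3

exact_ok = (computed == expected)
tolerant_ok = near(computed, expected)

if __name__ == "__main__":
    print("exact comparison passes:   ", exact_ok)
    print("tolerant comparison passes:", tolerant_ok)
```

A relative tolerance scales with the magnitude of the values, which is why it is used here instead of a fixed absolute difference.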

tSparseDiff

$BUILD$/obj-i686-linux-gnu/scimath/Mathematics/test$ ./tSparseDiff 
acos(const SparseDiff<T> &) failed
asin(const SparseDiff<T> &) failed
atan2(const SparseDiff<T> &, const SparseDiff<T> &g) failed
cosh(const SparseDiff<T> &) failed
log(const SparseDiff<T> &) failed
log10(const SparseDiff<T> &) failed
There were 6 errors

olebole commented 8 years ago

lattices/LRegions

tLatticeRegion

$BUILD$/obj-i686-linux-gnu/lattices/LRegions/test$ ./tLatticeRegion |diff -wU6 tLatticeRegion.out  -
--- tLatticeRegion.out  2016-09-23 10:30:41.625709932 +0000
+++ -   2016-09-23 11:14:44.214259969 +0000
@@ -1,13 +1,13 @@
 circle: Axis Lengths: [11, 11]  (NB: Matrix in Row/Column order)
 [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
- 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
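
The only pixel that changes in the diff above lies exactly on the edge of the circle. A minimal sketch of why such pixels are fragile, assuming a circle centred on pixel (5, 5) with radius 5 and an inclusive `<=` test. These values are inferred from the printed mask, not taken from the casacore sources, and `circle_mask` is a made-up function:

```python
def circle_mask(size, center, radius, tol=0.0):
    """Return a size x size 0/1 mask of pixels inside a circle.

    tol is a relative slack on the squared radius; tol=0.0 means an exact test.
    """
    cy, cx = center
    limit = radius * radius * (1.0 + tol)
    return [
        [1 if (y - cy) ** 2 + (x - cx) ** 2 <= limit else 0 for x in range(size)]
        for y in range(size)
    ]


# Exact radius: the four edge pixels are at squared distance 25 and are included.
exact = circle_mask(11, (5.0, 5.0), 5.0)

# A radius that is smaller by a few units in the last place drops them.
shrunk = circle_mask(11, (5.0, 5.0), 5.0 * (1.0 - 1e-15))

# A small relative slack makes the result insensitive to that perturbation.
robust = circle_mask(11, (5.0, 5.0), 5.0 * (1.0 - 1e-15), tol=1e-12)

if __name__ == "__main__":
    print("exact  middle row:", exact[5])
    print("shrunk middle row:", shrunk[5])
    print("robust equals exact:", robust == exact)
```

In this sketch the perturbation removes all four edge pixels, whereas the i386 output above loses only one, so it shows the sensitivity of the boundary test and not the actual cause.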

tLCComplement

$BUILD$/obj-i686-linux-gnu/lattices/LRegions/test$ ./tLCComplement | diff -U6 tLCComplement.out -
--- tLCComplement.out   2016-09-23 10:30:41.665710545 +0000
+++ -   2016-09-23 11:15:47.877522228 +0000
@@ -3,13 +3,13 @@
 Axis Lengths: [11, 20]  (NB: Matrix in Row/Column order)
 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
- 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1
+ 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]

tLCConcatenation

$BUILD$/obj-i686-linux-gnu/lattices/LRegions/test$ ./tLCConcatenation | diff -U13 tLCConcatenation.out -
--- tLCConcatenation.out    2016-09-23 10:30:41.705711158 +0000
+++ -   2016-09-23 11:22:41.563150515 +0000
@@ -2,27 +2,27 @@
 [0, 4, 0][10, 15, 2][11, 12, 3][11, 20, 3]
 Ndim=3 Axis Lengths: [11, 12, 3] 
 [0, 0, 0][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 1, 0][0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
 [0, 2, 0][0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
 [0, 3, 0][0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
 [0, 4, 0][0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
 [0, 5, 0][0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
 [0, 6, 0][1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
 [0, 7, 0][0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
 [0, 8, 0][0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
 [0, 9, 0][0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
 [0, 10, 0][0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
-[0, 11, 0][0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]
+[0, 11, 0][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 0, 1][0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
 [0, 1, 1][0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
 [0, 2, 1][0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
 [0, 3, 1][0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
 [0, 4, 1][0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0]
 [0, 5, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 6, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 7, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 8, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 9, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 10, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 11, 1][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
 [0, 0, 2][0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

tLCDifference

$BUILD$/obj-i686-linux-gnu/lattices/LRegions/test$ ./tLCDifference| diff -u8 tLCDifference.out -
--- tLCDifference.out   2016-09-23 10:30:41.749711832 +0000
+++ -   2016-09-23 11:24:27.466970999 +0000
@@ -10,17 +10,17 @@
 1 
 [0, 5][10, 15][11, 11][11, 20]
 Axis Lengths: [11, 11]  (NB: Matrix in Row/Column order)
 [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0
  0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0
- 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1
+ 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0
  0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0
  0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

 1 
 [3, 4][7, 8][5, 5][10, 20]

tLCRegion

$BUILD$/obj-i686-linux-gnu/lattices/LRegions/test$ ./tLCRegion | diff -u8 tLCRegion.out  -
--- tLCRegion.out   2016-09-23 10:30:42.077716855 +0000
+++ -   2016-09-23 11:26:25.772373637 +0000
@@ -1,17 +1,17 @@
 0 []
 [3, 4][7, 8][5, 5][11, 20]com1
 1 Axis Lengths: [11, 11]  (NB: Matrix in Row/Column order)
 [0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
- 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

 [0, 5][10, 15][11, 11][11, 20]
 [5, 10][5, 5]

tLCUnion

$BUILD$/obj-i686-linux-gnu/lattices/LRegions/test$ ./tLCUnion | diff -u8 tLCUnion.out  -
--- tLCUnion.out    2016-09-23 10:30:42.169718265 +0000
+++ -   2016-09-23 11:27:16.622267335 +0000
@@ -1,17 +1,17 @@
 1 
 [0, 4][10, 15][11, 12][11, 20]
 Axis Lengths: [11, 12]  (NB: Matrix in Row/Column order)
 [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0
  0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
- 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
+ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0
  0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0
  0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

 1 
 [0, 4][9, 19][10, 16][10, 20]

olebole commented 8 years ago

coordinates/Coordinates

tDirectionCoordinate

This test looks too strict to me (it compares two floats directly, without any tolerance).

$BUILD$/obj-i686-linux-gnu/coordinates/Coordinates/test$ ./tDirectionCoordinate 
pixel, world = [0, 0][0, 0]
cdelt [-1e-06, 2e-06]
aipserror: error ($BUILD$/coordinates/Coordinates/test/tDirectionCoordinate.cc : 286) Failed AlwaysAssert pixelArea.getValue() == fabs(cdelt[0]*cdelt[1])

tGaussianConvert

$BUILD$/obj-i686-linux-gnu/coordinates/Coordinates/test$ ./tGaussianConvert 
Major axis : pixel, world, pixel = 10 10 arcsec 10
Minor axis : pixel, world, pixel = 5 5 arcsec 5
Position Angle : pixel, world, pixel = 30 deg 30 deg 30 deg

Major axis : pixel, world, pixel = 10 10 arcsec 10
Minor axis : pixel, world, pixel = 5 5 arcsec 5
Position Angle : pixel, world, pixel = 30 deg 150 deg 30 deg

Major axis : pixel, world, pixel = 10 20 arcsec 10
Minor axis : pixel, world, pixel = 10 10 arcsec 10
Position Angle : pixel, world, pixel = 0 deg 0 deg 90 deg

aipserror: error ($BUILD$/coordinates/Coordinates/test/tGaussianConvert.cc : 174) Failed AlwaysAssert near(pa1.getValue(),pa3.getValue(),1e-6)

olebole commented 8 years ago

These failures are (more or less) specific to i386. There is one other 32-bit little-endian platform that succeeds (MIPS), and another "almost" succeeds (ARM; just tConvert failing). I therefore would guess that many of the failures come from the limited accuracy of the i386 floating point ops. Some other architectures share a few of these failures (s390x, arm64, mips64el, powerpc 32/64), but not all (mainly scimath).

casacore / casacore

tArrayColumnCellSlices test fail #394
