mlpack / ensmallen

A header-only C++ library for numerical optimization --
http://ensmallen.org

Tests fail for ensmallen-2.10.4 #142

Closed yurivict closed 4 years ago

yurivict commented 4 years ago
    Start 1: ensmallen_tests
1/1 Test #1: ensmallen_tests ..................***Failed  363.56 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) = 363.57 sec

The following tests FAILED:
      1 - ensmallen_tests (Failed)
Errors while running CTest

OS: FreeBSD

zoq commented 4 years ago

Hello @yurivict, I couldn't reproduce the issue on FreeBSD 11.2. Can you share some more information about your setup: OS, clang, and armadillo versions? Also, does ./ensmallen_tests -s -r console show any intermediate results?

yurivict commented 4 years ago

I added the port for ensmallen yesterday, so if you would just type cd /usr/ports/math/ensmallen && make test it should reproduce the problem.

rcurtin commented 4 years ago

If you can run with CTEST_OUTPUT_ON_FAILURE=1 ctest that would be really helpful too. ensmallen has some random tests. We try to keep the random failure probability really low, but we're not always successful, so if you can show us which test failed we can figure out if it's random or an actual problem.

yurivict commented 4 years ago

It passes now that I have added CTEST_OUTPUT_ON_FAILURE=1.

yurivict commented 4 years ago

I added this patch to the FreeBSD port to work around this issue:

--- src/libtfhe/fft_processors/nayuki/fft_processor_nayuki.cpp.orig     2019-10-11 03:07:51 UTC
+++ src/libtfhe/fft_processors/nayuki/fft_processor_nayuki.cpp
@@ -12,7 +12,7 @@ FFT_Processor_nayuki::FFT_Processor_nayuki(const int32
     tables_reverse = fft_init_reverse(_2N);
     omegaxminus1 = (cplx*) malloc(sizeof(cplx) * _2N);
     for (int32_t x=0; x<_2N; x++) {

+       omegaxminus1[x]=std::complex<double>(cos(x*M_PI/N)-1., sin(x*M_PI/N));
        //exp(i.x.pi/N)-1
     }
 }

zoq commented 4 years ago

Hm, is this the correct patch? This code isn't part of ensmallen, maybe I missed something?

barak commented 4 years ago

Even with the "ten billion" issue fixed, I'm getting build failures on some 32-bit architectures. This is Debian version 2.10.4-2 (the -2 includes the ten billion patch).

https://buildd.debian.org/status/package.php?p=ensmallen

On armel it times out.

On i386:

Test project /<<PKGBUILDDIR>>/obj-i686-linux-gnu
    Start 1: ensmallen_tests
1/1 Test #1: ensmallen_tests ..................***Failed  3509.70 sec
ensmallen version: 2.10.4 (Fried Chicken)
armadillo version: 9.800.1 (Horizon Scraper)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ensmallen_tests is a Catch v2.4.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
RosenbrockFunctionFloatTest
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lbfgs_test.cpp:41
...............................................................................

/<<PKGBUILDDIR>>/tests/lbfgs_test.cpp:54: FAILED:
  REQUIRE( coords(1) == Approx(1.0).epsilon(1e-7) )
with expansion:
  1.0f == Approx( 1.0 )

Parameter 4 to routine SLASCL was incorrect
Parameter 5 to routine SLASCL was incorrect
-------------------------------------------------------------------------------
Johnson844LovaszThetaFMatSDP
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:136
...............................................................................

/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:161: FAILED:
  REQUIRE( finalValue == Approx(-14.0).epsilon(0.1) )
with expansion:
  nanf == Approx( -14.0 )

===============================================================================
test cases:   266 |   264 passed | 2 failed
assertions: 11281 | 11279 passed | 2 failed

0% tests passed, 1 tests failed out of 1

Total Test time (real) = 3509.70 sec

The following tests FAILED:
      1 - ensmallen_tests (Failed)

conradsnicta commented 4 years ago

SLASCL is from LAPACK. ensmallen uses LAPACK indirectly via Armadillo. However, no function in Armadillo directly calls SLASCL. This suggests that another function in LAPACK (used by Armadillo) calls SLASCL, which in turn suggests the problem is with LAPACK.

About 6 months ago there was a very messy issue with Fortran "hidden" arguments (which affects a lot of software using LAPACK), but that was resolved as of Armadillo 9.500. For more details and further links see https://gitlab.com/conradsnicta/armadillo-code/issues/123

What version of gcc was used to compile LAPACK (or OpenBLAS) on Debian? Perhaps there's a miscompilation? Recent releases of gcc have a workaround for the Fortran "hidden" arguments snafu.

barak commented 4 years ago

That sounds plausible, @conradsnicta. But Debian LAPACK was recompiled just two weeks ago: https://tracker.debian.org/pkg/lapack

Maybe they don't have the issue under as much control as they thought.

Another possibility is libarmadillo, which actually has a bug filed against it for an architecture-specific header issue breaking mlpack on hppa: https://bugs.debian.org/912778

conradsnicta commented 4 years ago

@barak To be clear - by libarmadillo do you mean all of armadillo, or only the armadillo run-time library? https://bugs.debian.org/912778 doesn't seem related to the armadillo run-time library (libarmadillo.so). The error comes from armadillo headers (99% of armadillo is in header files), but I suspect it's a compiler bug on hppa.

hppa appears to be a rather obscure and outdated architecture. If it doesn't get much testing/development, a compiler bug is entirely possible. (Why does Debian even bother with hppa?)

barak commented 4 years ago

Regarding why Debian keeps the hppa port going: it's not an official Debian release architecture, it's just some hppa enthusiasts. (The official ones are the top nine listed on https://buildd.debian.org/status/package.php?p=ensmallen with white background; the remainder with gray background are just for fun.) So hppa won't hold up package migration into the stable distribution. But weird architectures are still useful for sniffing out portability issues and such.

On the present matter, when ensmallen was compiled it got the libarmadillo headers extant at the time, which were pretty recent.

The only architectures with testing failures for ensmallen are 32-bit, which does seem suspicious.

barak commented 4 years ago

Uploaded 2.11.1 to Debian, and the autobuilders are still failing on a couple of 32-bit architectures, including two "release" architectures (i386 and armel), which will block migration of the package to release.

Details: https://buildd.debian.org/status/package.php?p=ensmallen (click on "Build-Attempted").

rcurtin commented 4 years ago

Digging through the logs, here are my takes:

armel: https://buildd.debian.org/status/fetch.php?pkg=ensmallen&arch=armel&ver=2.11.1-1&stamp=1577965836&raw=0

Test project /<<PKGBUILDDIR>>/obj-arm-linux-gnueabi
    Start 1: ensmallen_tests
E: Build killed with signal TERM after 150 minutes of inactivity

It's unclear to me whether tests hung or just didn't finish yet because armel is for toasters. You could run ctest with various options to provide output during testing, and this might fix it.

i386: https://buildd.debian.org/status/fetch.php?pkg=ensmallen&arch=i386&ver=2.11.1-1&stamp=1577958809&raw=0

/usr/bin/ctest --force-new-ctest-process -j4
Test project /<<PKGBUILDDIR>>/obj-i686-linux-gnu
    Start 1: ensmallen_tests
1/1 Test #1: ensmallen_tests ..................***Failed  3185.66 sec
ensmallen version: 2.11.1 (The Poster Session Is Full)
armadillo version: 9.800.1 (Horizon Scraper)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ensmallen_tests is a Catch v2.4.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
RosenbrockFunctionFloatTest
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lbfgs_test.cpp:41
...............................................................................

/<<PKGBUILDDIR>>/tests/lbfgs_test.cpp:54: FAILED:
  REQUIRE( coords(1) == Approx(1.0).epsilon(1e-7) )
with expansion:
  1.0f == Approx( 1.0 )

Parameter 4 to routine SLASCL was incorrect
Parameter 5 to routine SLASCL was incorrect
-------------------------------------------------------------------------------
Johnson844LovaszThetaFMatSDP
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:136
...............................................................................

/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:161: FAILED:
  REQUIRE( finalValue == Approx(-14.0).epsilon(0.1) )
with expansion:
  nanf == Approx( -14.0 )

===============================================================================
test cases:   279 |   277 passed | 2 failed
assertions: 11309 | 11307 passed | 2 failed

0% tests passed, 1 tests failed out of 1

Total Test time (real) = 3185.66 sec

The following tests FAILED:
      1 - ensmallen_tests (Failed)
Errors while running CTest
make[1]: *** [Makefile:111: test] Error 8
make[1]: Leaving directory '/<<PKGBUILDDIR>>/obj-i686-linux-gnu'
dh_auto_test: cd obj-i686-linux-gnu && make -j4 test ARGS\+=-j4 returned exit code 2
make: *** [debian/rules:8: binary-arch] Error 255
dpkg-buildpackage: error: debian/rules binary-arch subprocess returned exit status 2

I see the issue with RosenbrockFunctionFloatTest and will open a PR momentarily, but the Johnson844LovaszThetaFMatSDP one might be trickier. I'm going to see if I can get an i386 machine or VM up and running to try and reproduce. The SLASCL failures might actually not be a problem---we'd need to see if that was connected to the Johnson844LovaszThetaFMatSDP test. It might not be, and it might not actually be causing a problem.

hppa: https://buildd.debian.org/status/fetch.php?pkg=ensmallen&arch=hppa&ver=2.11.1-1&stamp=1577964053&raw=0

I'm not going to copy-paste these errors. I see some things like `_ZN4arma8arma_rng5randnIdE4fillEPdj._omp_fn.0' referenced in section ...... maybe disabling OpenMP here might be helpful?

hurd-i386: https://buildd.debian.org/status/fetch.php?pkg=ensmallen&arch=hurd-i386&ver=2.11.1-1&stamp=1577963591&raw=0

Looks like the same as i386.

barak commented 4 years ago

I'd say not worth worrying about the non-release architectures like HPPA. At least, not until release architectures are working.

My suspicion is that the armel timeout issue is an actual test hang due to a bug related to the enormous spate of alignment warnings and such spewed out by the compiler.

Should I set CTEST_OUTPUT_ON_FAILURE=1 in the build scripts? Unless it's crazy voluminous, I can do that and maybe we'll find it easier to get a handle on things.

rcurtin commented 4 years ago

If you don't use ctest, you can get better output:

$ ./ensmallen_tests -d yes
ensmallen version: 2.10.4 (Fried Chicken)
armadillo version: 9.800.1 (Horizon Scraper)
4.609 s: SimpleAdaDeltaTestFunction
0.060 s: AdaDeltaLogisticRegressionTest
0.683 s: SimpleAdaDeltaTestFunctionFMat
0.068 s: AdaDeltaLogisticRegressionTestFMat
3.612 s: SimpleAdaGradTestFunction
2.491 s: AdaGradLogisticRegressionTest
3.651 s: SimpleAdaGradTestFunctionFMat
1.327 s: AdaGradLogisticRegressionTestFMat
0.000 s: AdamSphereFunctionTest
0.000 s: AdamSphereFunctionTestFMat
0.001 s: AdamSphereFunctionTestSpMat
0.000 s: AdamSphereFunctionTestSpMatDenseGradient
0.000 s: AdamStyblinskiTangFunctionTest
0.000 s: AdamMcCormickFunctionTest
0.000 s: AdamMatyasFunctionTest
0.000 s: AdamEasomFunctionTest
0.001 s: AdamBoothFunctionTest
0.427 s: SimpleAdamTestFunction
0.458 s: SimpleAdaMaxTestFunction
0.495 s: SimpleAMSGradTestFunction
0.055 s: AMSGradSphereFunctionTestFMat
0.562 s: AMSGradSphereFunctionTestSpMat
0.086 s: AMSGradSphereFunctionTestSpMatDenseGradient
0.063 s: AdamLogisticRegressionTest
11.835 s: AdaMaxLogisticRegressionTest
1.297 s: AMSGradLogisticRegressionTest
0.602 s: SimpleNadamTestFunction
0.061 s: NadamLogisticRegressionTest
0.386 s: SimpleNadaMaxTestFunction

The order is deterministic, so if it hangs, we can track down pretty easily which one it is.

barak commented 4 years ago

Okay, just uploaded something with that in debian/rules.

barak commented 4 years ago

Okay, there's an i386 build failure with more info on https://buildd.debian.org/status/package.php?p=ensmallen, enjoy!

...

8.302 s: KatyushaProximalLogisticRegressionSpMatTest
0.000 s: RosenbrockFunctionTest

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ensmallen_tests is a Catch v2.4.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
RosenbrockFunctionFloatTest
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lbfgs_test.cpp:41
...............................................................................

/<<PKGBUILDDIR>>/tests/lbfgs_test.cpp:54: FAILED:
  REQUIRE( coords(1) == Approx(1.0).epsilon(1e-7) )
with expansion:
  1.0f == Approx( 1.0 )

0.000 s: RosenbrockFunctionFloatTest
0.000 s: RosenbrockFunctionSpGradTest
0.000 s: RosenbrockFunctionSpMatTest
...

0.928 s: LookaheadAdamSimpleSGDTestFunctionFloat
2.084 s: Johnson844LovaszThetaSDP
Parameter 4 to routine SLASCL was incorrect
Parameter 5 to routine SLASCL was incorrect
-------------------------------------------------------------------------------
Johnson844LovaszThetaFMatSDP
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:136
...............................................................................

/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:161: FAILED:
  REQUIRE( finalValue == Approx(-14.0).epsilon(0.1) )
with expansion:
  nanf == Approx( -14.0 )

0.000 s: Johnson844LovaszThetaFMatSDP
3.405 s: ErdosRenyiRandomGraphMaxCutSDP
...

0.017 s: WNGradStyblinskiTangFunctionSpMatTest
===============================================================================
test cases:   279 |   277 passed | 2 failed
assertions: 11309 | 11307 passed | 2 failed

rcurtin commented 4 years ago

I can't seem to reproduce the Johnson844LovaszThetaFMatSDP issue on an i386/debian:unstable Docker container---has anyone else been able to reproduce that in an easy-to-work with environment? I wonder if it has to do with versions of dependencies or something, but I would expect that I was doing the same thing as the Debian build.

If we can reproduce it then we can open an easy issue to work on. Actually, I guess we could open an issue even if we can't reproduce it, but if we can reproduce it, we significantly increase the probability that it will be solved. :)

barak commented 4 years ago

The autobuilder blew its cookies in exactly the same way on both linux-i386 and hurd-i386, for whatever that's worth.

rcurtin commented 4 years ago

Is there a way that I can reproduce the autobuilder environment exactly for either of those cases?

barak commented 4 years ago

There's /supposed/ to be sufficient information in the logs to know the precise version (in the debian sense, so that means a unique binary package) of everything, and an exhaustive list of all installed packages. So in theory, you should be able to go into a chroot environment and install exactly the same stuff.

mlpack-bot[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

barak commented 4 years ago

Just to put it out there, this issue is blocking progression of the package in Debian towards the release, which is in turn blocking the mlpack library's progression. Open to any advice: I could disable testing, I could restrict the architecture to a whitelist, I could blacklist some problematic architectures. But the fact that plain old i386 is having problems makes me loath to do any of that, at least without some better justification than "um, it has a mysterious heisen-build-failure".

rcurtin commented 4 years ago

I'd love to fix it but I don't have enough information to reproduce---I don't have time to read through the Debian docs to figure out exactly how to reproduce the environment (I'd love to, really! There is just too much other stuff). And I can't reproduce the issue otherwise on i386 containers, even with different random seeds.

Honestly? I'd consider just disabling that particular test with ensmallen_tests ~Johnson844LovaszThetaFMatSDP (I think that's right).

If you can get me a quick and easy way to reproduce the failure in an interactive environment so I can debug it, I can definitely get more information. :+1:

yurivict commented 4 years ago

Tests currently pass on FreeBSD fine.

mlpack-bot[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any recent activity. It will be closed in 7 days if no further activity occurs. Thank you for your contributions! :+1:

barak commented 4 years ago

Okay, I'll try to upload and see what happens!

barak commented 4 years ago

Gah! Uploaded 2.11.5, see https://buildd.debian.org/status/package.php?p=ensmallen

...
0.000 s: LookaheadAdamSphereFunctionTest
0.434 s: LookaheadAdamSimpleSGDTestFunction
0.000 s: LookaheadAdaGradSphereFunction
0.129 s: LookaheadAdamLogisticRegressionTest
0.712 s: LookaheadAdamSimpleSGDTestFunctionFloat
1.598 s: Johnson844LovaszThetaSDP
Parameter 4 to routine SLASCL was incorrect
Parameter 5 to routine SLASCL was incorrect

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
ensmallen_tests is a Catch v2.4.1 host application.
Run with -? for options

-------------------------------------------------------------------------------
Johnson844LovaszThetaFMatSDP
-------------------------------------------------------------------------------
/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:136
...............................................................................

/<<PKGBUILDDIR>>/tests/lrsdp_test.cpp:161: FAILED:
  REQUIRE( finalValue == Approx(-14.0).epsilon(0.1) )
with expansion:
  nanf == Approx( -14.0 )

0.000 s: Johnson844LovaszThetaFMatSDP
2.692 s: ErdosRenyiRandomGraphMaxCutSDP
15.001 s: GaussianMatrixSensingSDP
4.566 s: MomentumSGDSpeedUpTestFunction
1.016 s: MomentumSGDGeneralizedRosenbrockTest
...
0.012 s: WNGradStyblinskiTangFunctionSpMatTest
===============================================================================
test cases:   283 |   282 passed | 1 failed
assertions: 11139 | 11138 passed | 1 failed

make[1]: *** [debian/rules:24: override_dh_auto_test] Error 1
...

rcurtin commented 4 years ago

So, I get that this test is failing and I would be happy to work on the issue, except I can't reproduce it. My suggestion would be to not hold up the package on it, and either disable the i386 architecture (probably overkill), or simply disable that failing test. I have a strong feeling it's not indicative of anything that's actually wrong. What do you think?

barak commented 4 years ago

I'm going to set it to still run the tests, but to not abort the build if they fail on i386. That way we can still see any test failures, but we'll get a built package and can try to diagnose the issue more directly, like using the package as built in a chroot and such.

rcurtin commented 4 years ago

:+1: that sounds good to me. I'm sorry that I can't be more helpful on this (I hate not being able to solve an issue). And yeah, if for instance I could use the package in a chroot or something, perhaps I could be able to reproduce the failure. At least it is worth a shot. :)

barak commented 4 years ago

https://buildd.debian.org/status/package.php?p=ensmallen shows the hurd-i386 build succeeding even though the log shows a failed test, so soon we'll have a linux-i386 build and can test it in any Debian i386 chroot. If you have a Debian box, I can walk you through setting up an appropriate chroot. It's pretty much just a matter of installing the right packages, one command to create the chroot, and another to jump into it. It would probably also work on Ubuntu.

rcurtin commented 4 years ago

Yeah, I'm a Debian user, so if you have a handful of commands I can try, I'd be happy to give it a shot. :+1:

barak commented 4 years ago

In a Nutshell

sudo apt install cowbuilder git-buildpackage
env ARCH=i386 git-pbuilder create

Now you're set up to log in:

env ARCH=i386 git-pbuilder login

You might want to update periodically if you want the latest compiler, etc.:

env ARCH=i386 git-pbuilder update

For more details, and a laundry list of ways to speed up particular usages, set up caching, etc, see https://wiki.debian.org/git-pbuilder

barak commented 4 years ago

Okay, I hot-wired the build scripts to ignore test errors on i386. Well now they're popping up on armel, armhf, and mipsel as well. This is for 2.12.1, see https://buildd.debian.org/status/package.php?p=ensmallen

rcurtin commented 4 years ago

Hi @barak, thank you so much for those directions. I know it took a long time... but I finally got around to using them! And I was able to (after some fighting) reproduce the issue, and then debug it. I opened #217, and once that is merged, I believe that this issue will be fixed and you can remove the bit of code in packaging that ignores the failed tests. :)

Do you have an easy way to test with the changes from #217 on the various Debian architectures? If you can do that, then we can be sure that the issue is solved before we merge. :+1:

Thanks for the patience! The gears may turn slowly... but they do still turn... :)

barak commented 4 years ago

It's possible to test on architectures w/o uploading a new version, but it's a hassle. I find it easier to just toss it up on the wall and see if it sticks.

rcurtin commented 4 years ago

No worries---I already tested using the environment created by pbuilder, so I think at least the i386 issue should be fixed. It may or may not fix everything else; we'll see. If there are more failures, I'll just address them in follow-up PRs. :+1:

barak commented 4 years ago

I removed the ignore-testing-error stuff and it seems to build okay everywhere!

rcurtin commented 4 years ago

Awesome! I think we can go ahead and close this issue then. If you have further issues, please, feel free to open another issue. :)

rcurtin commented 4 years ago

The thought processes are not moving as fast as the fingers. I hit the button where "close and comment" usually is, without noticing that the issue was already closed. So now I'll re-close it. :smile:

barak commented 4 years ago

Yeah, that's why cars have a single pedal which controls the gas or the brakes on alternate depressions. Also weapons systems have a half red / half green button labeled "initiate / cancel global thermonuclear strike (🙭/🙯)".