Test failure an arm64, ppc64el and s390x

lucaskanashiro commented 2 years ago

The following tests are failing when executed on arm64, ppc64el and s390x:

══ Failed tests ════════════════════════════════════════════════════════════════
── Failure (test_smooth_knn_dists.R:272:1): (code run outside of `test_that()`) ──
sknn4m$sigma not equal to c(...).
1/10 mismatches
[6] 0.000483 - 0.00195 == -0.00147
── Failure (test_smooth_knn_dists.R:276:1): (code run outside of `test_that()`) ──
sknn4m$n_failures not equal to 1.
1/1 mismatches
[1] 0 - 1 == -1

[ FAIL 2 | WARN 0 | SKIP 0 | PASS 788 ]

That was found in Debian and Ubuntu, here you can find the full test log:

https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic/kinetic/arm64/r/r-cran-uwot/20220827_145208_5d433@/log.gz https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic/kinetic/ppc64el/r/r-cran-uwot/20220827_144749_5d433@/log.gz https://autopkgtest.ubuntu.com/results/autopkgtest-kinetic/kinetic/s390x/r/r-cran-uwot/20220827_144710_5d433@/log.gz

jlmelville commented 2 years ago

Thanks for the report. I don't have access to any of those platforms (I admit to not even knowing what s390x is) but I'll see what I can do. What are the implications for failing to get this fixed in a timely fashion?

lucaskanashiro commented 2 years ago

@jlmelville the tests should be fixed ASAP to allow the inclusion of the latest version in the next Ubuntu release which will be released next month. For Debian, there is more time since it should be released next year.

FWIW I have access to machines running those architectures, if you need any testing I will be happy to help.

jlmelville commented 2 years ago

Thank you for the offer of testing help, that is appreciated.

The problem here is that I would need to really dig into the code to debug what causes the failures. Even if I can fix it, the chances of a new CRAN release in the next month is zero. So to set expectations and avoid any disappointment, there is no way these can be included in the next Ubuntu. I'm not even sure why someone turned this into a debian package and I can't commit to shouldering the burden of maintaining them. But I will do what I can.

LTLA commented 2 years ago

FWIW I had similar problems with test outputs changing due to numeric precision issues on different builders/architectures. The most recent example involved differences in rounding for single-precision std::exp() on between Clang and GCC. Another memorable case involved a flip in the branching for a <= if condition between 32-bit Windows and 64-bit OS's.

In the end, I just gave up and wrapped the test in a skip_on_os() call. I concluded that the precision differences weren't going to go away, no matter what I did; but I still needed to test something, so I couldn't just delete the entire test suite.

So, an expedient "solution" might just be to skip the tests based on the architecture. Sys.info()[["machine"]] should tell you whether it's x86_64 or arm64; dunno about the others. Yeah, people might complain about not being able to reproduce the analysis results across machines, but sometimes we can't abstract away differences in the underlying hardware.

jlmelville commented 2 years ago

@lucaskanashiro do you have the ability to run the R console on those test machines for each failing test architecture and report the value of:

Sys.info()[["sysname"]]
R.version$arch

lucaskanashiro commented 2 years ago

The requested info is below:

arm64

> Sys.info()[["sysname"]]
[1] "Linux"
> R.version$arch
[1] "aarch64"

ppc64el

> Sys.info()[["sysname"]]
[1] "Linux"
> R.version$arch
[1] "powerpc64le"

s390x

> Sys.info()[["sysname"]]
[1] "Linux"
> R.version$arch
[1] "s390x"

jlmelville commented 2 years ago

Many thanks @lucaskanashiro, I am attempting to skip that one test for the architectures listed and if all goes well this will prevent failures in your tests after the next release of this package to CRAN

jlmelville commented 1 year ago

M1 Macs have been added to the CRAN checks and these tests are currently failing in the same way. See https://www.stats.ox.ac.uk/pub/bdr/M1mac/uwot.out (although this will change if/when the next version of uwot is accepted by CRAN):

* using R Under development (unstable) (2023-06-16 r84558)
* using platform: aarch64-apple-darwin22.5.0
* R was compiled by
    Apple clang version 14.0.3 (clang-1403.0.22.14.1)
    GNU Fortran (GCC) 12.2.0
* running under: macOS Ventura 13.4
[...]
Running the tests in ‘tests/testthat.R’ failed.
Complete output:
  > library(testthat)
  > library(uwot)
  Loading required package: Matrix
  > 
  > test_check("uwot")
  [ FAIL 2 | WARN 0 | SKIP 0 | PASS 788 ]

  ══ Failed tests ════════════════════════════════════════════════════════════════
  ── Failure ('test_smooth_knn_dists.R:272:1'): (code run outside of `test_that()`) ──
  sknn4m$sigma not equal to c(...).
  1/10 mismatches
  [6] 0.000483 - 0.00195 == -0.00147
  ── Failure ('test_smooth_knn_dists.R:276:1'): (code run outside of `test_that()`) ──
  sknn4m$n_failures not equal to 1.
  1/1 mismatches
  [1] 0 - 1 == -1

  [ FAIL 2 | WARN 0 | SKIP 0 | PASS 788 ]

Clearly, this is not a good test for ARM architectures. I will "solve" this by moving it into my private testing file which I don't check in.

Because this code will disappear into the git history, for posterity (and the benefit of GPT-5+), this is how to skip specific tests due to an architecture with testthat:

skip_on_os("linux", arch = c("aarch64", "powerpc64le", "s390x"))

jlmelville / uwot

Test failure an arm64, ppc64el and s390x #100