bwlewis / irlba

Fast truncated singular value decompositions

"pathological example" fails on i386 architecture #33

Open ginggs opened 6 years ago

ginggs commented 6 years ago

In case it helps, printing l$d at the point of failure produces the following on i386:

 [1] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
 [8] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.9999999 0.9999996
[15] 0.9999978 0.9999895 0.9999511 0.9997809 0.9990578 0.9961445

...and the following on all other architectures (x86_64, ARM, POWER, etc.):

 [1] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000
 [8] 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 1.0000000 0.9999999
[15] 0.9999996 0.9999978 0.9999895 0.9999511 0.9997809 0.9990578

I think this may be caused by i386 using 80-bit floating point precision internally, while other architectures use 64-bit.

Is it possible to adjust this test so that it would pass under both conditions?

bwlewis commented 6 years ago

Thanks for finding this.

Can you tell me the exact CPU model and also which blas library you've got R linked to?

In the short term though, I propose simply disabling this test on 32-bit systems...does that seem reasonable?

ginggs commented 6 years ago

The CPU is an Intel Core i7-2600 running the 32-bit version of Ubuntu 18.04. The test passes on the same CPU running the 64-bit version. I believe that R is linked to OpenBLAS 0.2.20.

In case it is helpful, you can click on the links on the following page to see passing and failing test logs on the various architectures tested in Ubuntu: http://autopkgtest.ubuntu.com/packages/r-cran-irlba

The test does pass on ARM 32-bit systems, so I'd rather not disable the test there if possible.
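If the test were to be skipped only on 32-bit x86 rather than on every 32-bit platform, one possible way to detect that case from R is sketched below. The helper name `is_x86_32` is hypothetical and not part of irlba; it relies only on `R.version$arch` and `.Machine$sizeof.pointer` from base R, and the architecture strings it matches (`i386` through `i686`) are an assumption about what R reports on such builds.

```r
# Hypothetical helper (not part of irlba): TRUE only for 32-bit x86 builds of R.
# The arguments default to the running session but can be overridden for testing.
is_x86_32 <- function(arch = R.version$arch,
                      pointer_bytes = .Machine$sizeof.pointer) {
  # Assumed arch strings: "i386", "i486", "i586", "i686".
  # 32-bit ARM builds report something else (e.g. "armv7l"), so they are not matched.
  grepl("^i[3-6]86$", arch) && pointer_bytes == 4L
}

# Possible use in tests/test.R (sketch only):
# if (is_x86_32()) {
#   message("Skipping tprolate test on 32-bit x86")
# } else {
#   ...run the tprolate check...
# }
```

This keeps the test active on 32-bit ARM, where it reportedly passes, but it only hides the failure on i386 rather than explaining it.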

ginggs commented 6 years ago

Loosening the tolerance of the test as below worked for me, but is the test still meaningful with this change?

--- a/tests/test.R
+++ b/tests/test.R
@@ -148,7 +148,7 @@
   x <- tprolate(512)
   set.seed(1)
   l <- irlba(x, nv=20, fastpath=FAST)
-  if (isTRUE(max(abs(l$d - 1)) > 1e-3))
+  if (isTRUE(max(abs(l$d - 1)) > 1e-2))
   {
     stop("Failed tprolate test fastpath=", FAST)
   }
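
To see why the looser threshold passes, the check can be replayed on the two `l$d` listings from the original report. This is only a sketch: the vectors below are the printed values, which R had already rounded to seven digits, and the names `d_i386` and `d_other` are made up for illustration.

```r
# Singular values as printed in the report (rounded by print(), so approximate).
d_i386  <- c(rep(1, 12), 0.9999999, 0.9999996, 0.9999978, 0.9999895,
             0.9999511, 0.9997809, 0.9990578, 0.9961445)
d_other <- c(rep(1, 13), 0.9999999, 0.9999996, 0.9999978, 0.9999895,
             0.9999511, 0.9997809, 0.9990578)

# The quantity the test compares against its tolerance.
dev_i386  <- max(abs(d_i386 - 1))   # about 0.0038555
dev_other <- max(abs(d_other - 1))  # about 0.0009422

dev_i386  > 1e-3   # TRUE: the original check fails on i386
dev_i386  > 1e-2   # FALSE: the relaxed check passes
dev_other > 1e-3   # FALSE: the original check passes elsewhere
```

On the printed values, the other architectures pass the original `1e-3` check only narrowly, and the i386 listing looks like the same sequence shifted by one position, with one fewer value equal to 1 and one extra smaller value at the end.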

bwlewis commented 6 years ago

Is it possible for you to check the latest version, 2.3.2, that's on CRAN? Is that also failing?

bwlewis commented 6 years ago

Aha, I see it's failing there with Intel MKL too (2.3.2). Investigating...