Closed adampauls closed 2 years ago
Hmm, I suppose this could be caused by cancellation in issues in more complex matrix multiplication algorithms that I've never really understood. Is this just expected behavior of the underlying native libraries?
I should add that this is a regression: it does not happen on 1.2, but happens on 1.3 and 2.0.
I can't repro on linux, and I don't have a macos dev environment these days...
This is what I get:
-2.9802322E-8 0.0 -2.2351742E-8 1.4901161E-8
0.0 1.1920929E-7 0.0 0.0
-2.2351742E-8 0.0 0.0 1.1920929E-7
1.4901161E-8 0.0 1.1920929E-7 0.0
Given that this is in 1.3 and 2.0, I suspect this is an issue with the new netlib library we're using. IIRC basically the only change from 1.2 to 1.3 was the switch to the new netlib library. (Tagging @luhenry so it's on his radar.)
Can you see if you can repro on Linux (with CI or whatever)? If you can't repro on Linux we can proceed to minimizing it to a sequence of BLAS calls...
A few new bits of information: • This bug is probably relevant https://github.com/luhenry/netlib/issues/6. • The native libraries are not loading:
Nov 11, 2021 8:00:10 AM dev.ludovic.netlib.InstanceBuilder$NativeBLAS getInstanceImpl
WARNING: Failed to load implementation from:dev.ludovic.netlib.blas.JNIBLAS
Nov 11, 2021 8:00:10 AM dev.ludovic.netlib.InstanceBuilder$NativeBLAS getInstanceImpl
WARNING: Failed to load implementation from:dev.ludovic.netlib.blas.ForeignLinkerBLAS
Nov 11, 2021 8:00:10 AM dev.ludovic.netlib.InstanceBuilder$BLAS getInstanceImpl
WARNING: Failed to load implementation from:dev.ludovic.netlib.NativeBLAS
Nov 11, 2021 8:00:10 AM dev.ludovic.netlib.InstanceBuilder$JavaBLAS getInstanceImpl
WARNING: Failed to load implementation from:dev.ludovic.netlib.blas.VectorBLAS
• A colleague was able to reproduce on windows, though for him the VectorBLAS warning did not appear. I tried making sure that VectorBLAS was loading and it also did not fix the issue.
So I think this points to a bug in the Java implementations of some of the BLAS routines? I'll post an issue in netlib.
This is definitely a bug in netlib: https://github.com/luhenry/netlib/issues/7
This should be fixed in https://github.com/luhenry/netlib/releases/tag/v2.2.1. I've just pushed the release through Sonatype, so it should show up in Maven repositories in the next hour or so at https://repo1.maven.org/maven2/dev/ludovic/netlib/blas/2.2.1/
thanks @luhenry! i'll push an RC tomorrow then. @adampauls do you have a preference for which version i do first? (which might be the only one I do tomorrow), say nothing and I'll do Breeze 2.0
On Sun, Nov 14, 2021 at 11:08 PM Ludovic Henry @.***> wrote:
This should be fixed in https://github.com/luhenry/netlib/releases/tag/v2.2.1. I've just pushed the release through Sonatype, so it should show up in Maven repositories in the next hour or so.
— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/scalanlp/breeze/issues/823#issuecomment-968596093, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAACLIO3IVS4CTKJOH74EH3UMCWWXANCNFSM5HTEEOGA . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.
2.0.1-RC1 is being pushed to maven now, lemme know if it fixes it
I'm still seeing the first 3 of the warnings above on 2.0.1-RC1 (not sure about the precision loss). I don't see the VectorBLAS warning. I am on Mac OS.
The warning just means you're not getting natives, so degraded performance, but it shouldn't mean anything for the precision thing. Not sure what's going on though
After upgrading to Breeze 2.0, we have found some strange issues where we need to switch from
Float
toDouble
to get the right accuracy where we really shouldn't. Here is a somewhat minimized example:As far as I can tell, entry (2, 2) in
corrMatrixFloat
is just entirely wrong in a way that should not happen because of precision issues.I'm on a 2019 Macbook Pro.