Open vladimir-ch opened 1 year ago
The same problem will likely occur for `NRM2`. As far as I am aware, the assembly implementations in OpenBLAS have not been updated. Changing all of the assembly implementations would be a challenge, so one option may be to provide two symbols: one for the netlib implementation and one for the old optimized implementation.
Pretty much the only somewhat short-term solution I see is to update the "generic" reimplementations and "temporarily" use them for all architectures/CPUs that currently have hand-crafted assembly.
Small clarification: ?ROTG is easy, as it is all done in trivial C code in the interface routine, with no CPU-specific kernels involved. NRM2 will require a bit more work, as there are around 50 assembly kernels in all (and still about 10 if only looking at the most important ones).
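To illustrate why NRM2 is affected at all, here is a minimal sketch in Python (whose floats are IEEE doubles). It is not the reference BLAS algorithm and not OpenBLAS kernel code. It only contrasts a plain sum of squares with a simple two-pass variant that divides by the largest magnitude first. The function names `nrm2_naive` and `nrm2_scaled` are made up for this example, and NaN handling is deliberately omitted.

```python
import math


def nrm2_naive(x):
    """Euclidean norm via an unscaled sum of squares.

    The squares overflow to inf for large entries and underflow to 0
    for tiny entries, even when the norm itself is representable.
    """
    return math.sqrt(sum(v * v for v in x))


def nrm2_scaled(x):
    """Euclidean norm with a simple two-pass scaling (illustrative only).

    Pass 1 finds the largest magnitude; pass 2 sums squares of the
    scaled entries, which all lie in [-1, 1].
    """
    scale = max((abs(v) for v in x), default=0.0)
    if scale == 0.0 or math.isinf(scale):
        # All-zero (or empty) input, or an infinite entry: nothing to scale.
        return scale
    return scale * math.sqrt(sum((v / scale) * (v / scale) for v in x))


if __name__ == "__main__":
    for vec in ([3.0, 4.0], [3e200, 4e200], [3e-200, 4e-200]):
        print(vec, nrm2_naive(vec), nrm2_scaled(vec))
```

A production kernel would likely avoid the second pass over the data, which is part of why updating the hand-written assembly is more work than updating ?ROTG.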
The Reference BLAS changed their `DROTG` implementation in https://github.com/Reference-LAPACK/lapack/pull/527 to use safe scaling. In Gonum we updated our implementation and tests accordingly in https://github.com/gonum/gonum/issues/1623. Unfortunately, the updated tests with extreme values fail in our CBLAS interface package (https://github.com/gonum/netlib/pull/92), where we use OpenBLAS as a reference. It would be nice if OpenBLAS used the same implementation with safe scaling as the reference BLAS.
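For readers unfamiliar with the term, the following Python sketch shows the general idea of safe scaling for a Givens rotation. It is an illustration written for this thread, not a copy of the reference BLAS, Gonum, or OpenBLAS code. The names `rotg_naive` and `rotg_scaled` are hypothetical, the choice of `SAFMIN` as the smallest positive normal double is an assumption, and the `z` reconstruction parameter that DROTG also returns is left out.

```python
import math
import sys

# Assumed scaling bounds: smallest positive normal double and its reciprocal.
SAFMIN = sys.float_info.min
SAFMAX = 1.0 / SAFMIN


def rotg_naive(a, b):
    """Givens rotation (r, c, s) computed with an unscaled sum of squares."""
    r = math.sqrt(a * a + b * b)
    if r == 0.0:
        return 0.0, 1.0, 0.0
    return r, a / r, b / r


def rotg_scaled(a, b):
    """Givens rotation (r, c, s) with the inputs scaled before squaring."""
    anorm, bnorm = abs(a), abs(b)
    if bnorm == 0.0:
        return a, 1.0, 0.0
    if anorm == 0.0:
        return b, 0.0, 1.0
    # Scale by the larger magnitude, clamped to [SAFMIN, SAFMAX],
    # so that the squared terms stay at or below 1.
    scl = min(SAFMAX, max(SAFMIN, anorm, bnorm))
    # r takes the sign of the larger-magnitude input.
    sigma = math.copysign(1.0, a if anorm > bnorm else b)
    ta, tb = a / scl, b / scl
    r = sigma * (scl * math.sqrt(ta * ta + tb * tb))
    return r, a / r, b / r


if __name__ == "__main__":
    for a, b in ((3.0, 4.0), (3e200, 4e200), (3e-200, 4e-200)):
        print((a, b), rotg_naive(a, b), rotg_scaled(a, b))
```

With inputs around 1e200 the unscaled version produces an infinite `r`, and with inputs around 1e-200 it produces a zero `r`, while the scaled version returns a finite nonzero `r` with `c` and `s` close to 0.6 and 0.8. Tests with extreme values of this kind are what separate the two approaches.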