HaoZeke / OpenBLAS

OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
http://www.openblas.net
BSD 3-Clause "New" or "Revised" License

`except` fix #1

Closed: mtsokol closed this 2 months ago

mtsokol commented 2 months ago

Hi @HaoZeke,

This is an attempt to fix an issue described in https://github.com/HaoZeke/openblas_buildsys_snips/pull/7.

For some ssyrk variants, e.g. UT, the returned results are incorrect. One issue I spotted is that driver/level3/meson.build does not handle the except entry properly (fixed in this PR).

As a result, all except entries in the ext_mappings dict were ignored, so the syrk functions used the original flags defined in ext_mappings instead of those in ext_mappings_l3.

This caused lower-nontranspose to pick these _LN flags (again, except was ignored):

'_LN': {'def': ['LEFT'], 'undef': ['TRANSA'],
        'except': ['?syrk', '?syrk_thread',
                   '?syr2k', '?herk', '?herk_kernel',
                   '?trsm_kernel']},

Consequently, ssyrk_LN was built without the -DLOWER flag and with -DLEFT instead.
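To make the effect concrete, here is a minimal Python sketch of the lookup as I understand it. It is not the actual Meson logic: the function flags_for, the honor_except switch, and the contents of the ext_mappings_l3 entry are assumptions made for illustration. Only the '_LN' entry of ext_mappings is copied from the snippet above.

```python
# '_LN' entry copied from the snippet above.
ext_mappings = {
    '_LN': {'def': ['LEFT'], 'undef': ['TRANSA'],
            'except': ['?syrk', '?syrk_thread',
                       '?syr2k', '?herk', '?herk_kernel',
                       '?trsm_kernel']},
}

# Assumed shape of the level-3 override; the thread only tells us that
# the syrk _LN variant should end up with -DLOWER.
ext_mappings_l3 = {
    '_LN': {'def': ['LOWER']},
}


def flags_for(routine, ext, honor_except=True):
    """Return the -D/-U flags for a routine/extension pair.

    Hypothetical helper: when the routine is listed under 'except' and
    honor_except is True, the generic entry is skipped and the level-3
    entry is used instead.
    """
    entry = ext_mappings[ext]
    if honor_except and routine in entry.get('except', []):
        entry = ext_mappings_l3[ext]
    defs = ['-D' + name for name in entry.get('def', [])]
    undefs = ['-U' + name for name in entry.get('undef', [])]
    return defs + undefs


# Behaviour before the fix: 'except' ignored, syrk gets the generic flags.
print(flags_for('?syrk', '_LN', honor_except=False))
# Behaviour after the fix: syrk falls through to the level-3 entry.
print(flags_for('?syrk', '_LN'))
```

With honor_except=False the sketch reproduces the symptom described here: the syrk _LN variant receives -DLEFT and never sees -DLOWER.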

But why did the error occur in ssyrk_UT? I suspect it's because the tests were run with NumPy, which is row-major, whereas BLAS is column-major (Fortran order). So upper-transposed became lower-nontransposed, and the missing -DLOWER flag for ssyrk_LN could therefore be the cause of the issue.
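The layout argument can be checked with a small pure-Python sketch. The function below is a hypothetical stand-in for a column-major lower-nontranspose syrk, not OpenBLAS code. It shows that feeding it a row-major buffer yields the upper triangle of A^T A when the output is read back as row-major.

```python
def syrk_colmajor_lower_notrans(buf, n, k):
    """Compute the lower triangle of C = B @ B^T.

    B is n x k and stored column-major in buf, so B[i][p] == buf[i + p*n].
    The result is an n x n column-major buffer; entries above the
    diagonal are left at zero.
    """
    c = [0] * (n * n)
    for j in range(n):
        for i in range(j, n):  # lower triangle: row index >= column index
            c[i + j * n] = sum(buf[i + p * n] * buf[j + p * n]
                               for p in range(k))
    return c


# A is 2 x 3 in row-major order, as NumPy would store it by default.
a_rows = [[1, 2, 3],
          [4, 5, 6]]
k, n = len(a_rows), len(a_rows[0])
buf = [x for row in a_rows for x in row]

# The same buffer read as column-major is the n x k matrix A^T,
# so the "LN" routine computes A^T @ A, i.e. the "T" variant for A.
c = syrk_colmajor_lower_notrans(buf, n, k)

# Reading the output buffer back as row-major transposes it again,
# so the filled lower triangle shows up as the upper triangle.
c_rows = [c[r * n:(r + 1) * n] for r in range(n)]
for row in c_rows:
    print(row)
```

The nonzero entries of c_rows sit on and above the diagonal, which matches the upper-transposed request on the row-major side being served by the lower-nontransposed kernel on the column-major side.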

Please check if my fix solves it for your setup (I tested it with a .c program instead of NumPy tests).

P.S. I also think that the except entries in ext_mappings_l3 are ignored, so I removed them.

HaoZeke commented 2 months ago

Minor nit, could you prefix the commit with BUG:?