ngnrsaa / qflex

Flexible Quantum Circuit Simulator (qFlex) implements an efficient, CPU-based tensor-network simulator of large quantum circuits.
Apache License 2.0

CBLAS API usage fix #272

Closed: DmitryLyakh closed this pull request 4 years ago

DmitryLyakh commented 4 years ago

Apparently, on a Linux machine one can have multiple header files called cblas.h, coming from different packages. This fix is intended to make sure you are using the cblas.h from OpenBLAS. The reason the code built before is that it was finding a different cblas.h header, not the one from OpenBLAS.
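A hypothetical illustration of the ambiguity (the paths below are common Linux defaults, assumed for illustration, not paths confirmed in this thread):

// Several packages can install a header named cblas.h, for example:
//   /usr/include/cblas.h            (a reference/ATLAS-style CBLAS package)
//   /usr/include/openblas/cblas.h   (OpenBLAS)
// Which file this include resolves to depends on the -I search order, so a
// build can silently compile against prototypes the code was not written for.
#include <cblas.h>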

s-mandra commented 4 years ago

The reason the code built before is that it was finding a different cblas.h header, not the one from OpenBLAS.

It sounds strange, because the code compiles in Docker containers where the only installed cblas.h comes from OpenBLAS. So the code per se does compile against OpenBLAS.

DmitryLyakh commented 4 years ago

For the qFlex code built before this fix, could you take a look at the specific cblas.h header the build is picking up? In particular, the cblas_cgemm function prototype in the OpenBLAS cblas.h header uses float parameters, whereas the arguments in qFlex/src/tensor.cpp are of type s_type*, which is a complex-float pointer. There should be no implicit conversion between the two. Can you share the cblas_cgemm function prototype that you see in your cblas.h (from Docker)?
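For reference, the two prototype flavors at issue look roughly like this (paraphrased declarations with abridged parameter lists, shown for comparison only; not the verbatim OpenBLAS headers):

// Older OpenBLAS cblas.h (typed pointers): std::complex<float>* does not
// implicitly convert to const float*, so passing s_type* directly should
// not even compile against this prototype.
void cblas_cgemm(/* ... */ const float *alpha, const float *A /* ... */);

// Newer OpenBLAS cblas.h (opaque pointers): any object pointer converts
// implicitly to const void*, so s_type* is accepted without a cast.
void cblas_cgemm(/* ... */ const void *alpha, const void *A /* ... */);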

DmitryLyakh commented 4 years ago

It is also possible that OpenBLAS updated the cgemm etc. prototypes in its cblas.h header, so different versions may be incompatible.

DmitryLyakh commented 4 years ago

Yep, the new OpenBLAS version has void*, so it accepts the pointer type implicitly. That's why it works with the newer version but not with an older one.
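The underlying C++ conversion rules, in a minimal self-contained sketch:

#include <complex>

void conversion_demo(std::complex<float>* p) {
  const void* v = p;   // OK: any object pointer converts implicitly to const void*
  // const float* f = p;  // error: no implicit conversion from std::complex<float>*
  const float* f = reinterpret_cast<const float*>(p);  // explicit cast, as in tensor.cpp
  (void)v;
  (void)f;
}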

DmitryLyakh commented 4 years ago

In any case, with this fix it will work for both older and newer versions of OpenBLAS.
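Why the explicit cast satisfies both header generations, as a sketch (assuming s_type is std::complex<float>, per this thread):

#include <complex>

using s_type = std::complex<float>;

void cast_demo(const s_type* A_data) {
  // After the cast the argument is const float* (s_type::value_type is float):
  //  - older OpenBLAS headers declare const float* parameters: exact match;
  //  - newer headers declare const void* parameters: const float* still
  //    converts implicitly to const void*, so the same call compiles there too.
  const float* a = reinterpret_cast<const float*>(A_data);
  (void)a;
}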

s-mandra commented 4 years ago

Yep, the new OpenBLAS version has void*, so it accepts the pointer type implicitly. That's why it works with the newer version but not with an older one.

It makes sense. Thanks for looking into this!

s-mandra commented 4 years ago

Could you apply the following patch to fix the formatting? Thanks!

diff --git a/src/tensor.cpp b/src/tensor.cpp
index d7c70ad..46e17f5 100644
--- a/src/tensor.cpp
+++ b/src/tensor.cpp
@@ -960,12 +960,13 @@ void _multiply_MM(const s_type* A_data, const s_type* B_data, s_type* C_data,
   }
   s_type alpha = 1.0;
   s_type beta = 0.0;
-  cblas_cgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k,
-              reinterpret_cast<const s_type::value_type*>(&alpha),
-              reinterpret_cast<const s_type::value_type*>(A_data), std::max(1ul, k),
-              reinterpret_cast<const s_type::value_type*>(B_data), std::max(1ul, n),
-              reinterpret_cast<const s_type::value_type*>(&beta),
-              reinterpret_cast<s_type::value_type*>(C_data), std::max(1ul, n));
+  cblas_cgemm(
+      CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k,
+      reinterpret_cast<const s_type::value_type*>(&alpha),
+      reinterpret_cast<const s_type::value_type*>(A_data), std::max(1ul, k),
+      reinterpret_cast<const s_type::value_type*>(B_data), std::max(1ul, n),
+      reinterpret_cast<const s_type::value_type*>(&beta),
+      reinterpret_cast<s_type::value_type*>(C_data), std::max(1ul, n));
 }

 void _multiply_Mv(const s_type* A_data, const s_type* B_data, s_type* C_data,
@@ -983,7 +984,8 @@ void _multiply_Mv(const s_type* A_data, const s_type* B_data, s_type* C_data,
   s_type beta = 0.0;
   cblas_cgemv(CblasRowMajor, CblasNoTrans, m, k,
               reinterpret_cast<const s_type::value_type*>(&alpha),
-              reinterpret_cast<const s_type::value_type*>(A_data), std::max(1ul, k),
+              reinterpret_cast<const s_type::value_type*>(A_data),
+              std::max(1ul, k),
               reinterpret_cast<const s_type::value_type*>(B_data), 1,
               reinterpret_cast<const s_type::value_type*>(&beta),
               reinterpret_cast<s_type::value_type*>(C_data), 1);
@@ -1005,7 +1007,8 @@ void _multiply_vM(const s_type* A_data, const s_type* B_data, s_type* C_data,
   s_type beta = 0.0;
   cblas_cgemv(CblasRowMajor, CblasTrans, k, n,
               reinterpret_cast<const s_type::value_type*>(&alpha),
-              reinterpret_cast<const s_type::value_type*>(A_data), std::max(1ul, n),
+              reinterpret_cast<const s_type::value_type*>(A_data),
+              std::max(1ul, n),
               reinterpret_cast<const s_type::value_type*>(B_data), 1,
               reinterpret_cast<const s_type::value_type*>(&beta),
               reinterpret_cast<s_type::value_type*>(C_data), 1);

DmitryLyakh commented 4 years ago

In the current version of the code, regardless of whether float or s_type::value_type is used in the casts, it will only work for s_type = complex float. If s_type changes to complex double, the CBLAS API calls will produce wrong answers, because they are hardcoded as cgemm, cgemv, etc., which assume complex float.
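One possible way to make this generic, as a hedged sketch (the gemm_dispatch helper below is hypothetical and not part of this PR): dispatch on the scalar type at compile time, so that a complex-double s_type would route to cblas_zgemm instead of silently calling the single-precision cgemm.

#include <algorithm>
#include <complex>
#include <type_traits>
#include <cblas.h>

// Hypothetical helper: pick the CBLAS complex GEMM matching the scalar type
// at compile time. The row-major, no-transpose call shape mirrors the one in
// _multiply_MM in tensor.cpp.
template <typename T>
void gemm_dispatch(const int m, const int n, const int k,
                   const std::complex<T>* alpha, const std::complex<T>* A,
                   const std::complex<T>* B, const std::complex<T>* beta,
                   std::complex<T>* C) {
  if constexpr (std::is_same_v<T, float>) {
    cblas_cgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k,
                reinterpret_cast<const float*>(alpha),
                reinterpret_cast<const float*>(A), std::max(1, k),
                reinterpret_cast<const float*>(B), std::max(1, n),
                reinterpret_cast<const float*>(beta),
                reinterpret_cast<float*>(C), std::max(1, n));
  } else {
    static_assert(std::is_same_v<T, double>, "unsupported scalar type");
    cblas_zgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans, m, n, k,
                reinterpret_cast<const double*>(alpha),
                reinterpret_cast<const double*>(A), std::max(1, k),
                reinterpret_cast<const double*>(B), std::max(1, n),
                reinterpret_cast<const double*>(beta),
                reinterpret_cast<double*>(C), std::max(1, n));
  }
}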