lebedov / scikit-cuda

Python interface to GPU-powered libraries
http://scikit-cuda.readthedocs.org/

gpuarray fails to transpose and flatten after cublas.Sgemm #325

Open SuperbTUM opened 2 years ago

SuperbTUM commented 2 years ago

Problem

I am using Colab for GPU programming and ran into a problem with the matrix multiplication function Sgemm. The three inputs to the function are gpuarrays. Once I had the result, I tried to rearrange the result gpuarray (which is a flattened one) by first transposing it (.T) and then flattening it (ravel()). I expected this to give an array whose element order differs from the raw output, since I rearranged the order. In fact, I still got an identical array. How can I rearrange the element order of a gpuarray? Thanks in advance.
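To make the intended reordering concrete, here is a minimal host-side sketch in NumPy. It is an illustration under assumptions, not a confirmed diagnosis: `flat` is a hypothetical stand-in for the flat Sgemm output after copying it to the host, and `m` and `n` are assumed result dimensions. It shows that transposing a 1-D array changes nothing, while reshaping to 2-D before transposing does change the flattened order. PyCUDA's GPUArray may behave differently from NumPy here, so on the device an explicit transpose routine (for example `skcuda.linalg.transpose`) may be needed instead.

```python
import numpy as np

# Hypothetical stand-in for the flat Sgemm output copied to the host;
# m and n are assumed dimensions of the result matrix.
m, n = 2, 3
flat = np.arange(m * n, dtype=np.float32)  # [0, 1, 2, 3, 4, 5]

# Transposing a 1-D array is a no-op, so the element order is unchanged.
same = flat.T.ravel()

# Reinterpret the buffer as an m x n row-major matrix, transpose it,
# then flatten; ravel() has to copy here, which reorders the elements.
reordered = flat.reshape(m, n).T.ravel()

print(same)       # [0. 1. 2. 3. 4. 5.]
print(reordered)  # [0. 3. 1. 4. 2. 5.]
```

Note that cuBLAS stores matrices in column-major order, so which of the two layouts counts as "transposed" depends on how the flat buffer is interpreted.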

Environment

I think the environment is Linux 5.4.104+ with Python 3.7.12. CUDA 11.1 comes preinstalled in Colab. The PyCUDA version is 2021.1, and the scikit-cuda version is 0.5.3.