5enxia / parallel-krylov

Krylov Subspace Method Modules made by Ikuno Laboratory at Tokyo University of Technology
MIT License

IndexError: Index 204 is out of bounds for axis 0 with size 136 #7

Closed 5enxia closed 2 years ago

5enxia commented 2 years ago
# Matrix-vector product using multiple GPUs
    @classmethod
    def dot(cls, A, x):
        # Copy the vector data to all devices
        for i in range(MultiGpu.end, MultiGpu.begin-1, -1):
            Device(i).use()
            cp.cuda.runtime.memcpyPeer(MultiGpu.x[i].data.ptr, i, x.data.ptr, 0, MultiGpu.nbytes)
        # Compute each device's block of the matrix-vector product
        for i in range(MultiGpu.end, MultiGpu.begin-1, -1):
            Device(i).use()
            cp.dot(MultiGpu.A[i-MultiGpu.begin], MultiGpu.x[i], out=MultiGpu.y[i-MultiGpu.begin])
        # Gather the calculated elements from all devices
        for i in range(MultiGpu.end, MultiGpu.begin-1, -1):
            Device(i).synchronize()
            cp.cuda.runtime.memcpyPeer(MultiGpu.out[MultiGpu.local_N*i].data.ptr, 0, MultiGpu.y[i-MultiGpu.begin].data.ptr, i, MultiGpu.local_nbytes)
        # return
        return MultiGpu.out
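
One reading consistent with the numbers in the title: in the gather loop, `MultiGpu.A` and `MultiGpu.y` are indexed with the relative position `i-MultiGpu.begin`, but the offset into `MultiGpu.out` is computed from the absolute device ID as `MultiGpu.local_N*i`. If the devices do not start at 0, that offset runs past the end of `out`. The sketch below reproduces only the offset arithmetic in plain Python, without CuPy or any GPU. The values `N = 136`, `BEGIN = 2`, `END = 3` and the helper `gather_offsets` are assumptions chosen to match the error message, not values taken from the repository.

```python
def gather_offsets(n, begin, end, relative):
    """Return (local_n, {device_id: start offset in out}) for the gather step.

    relative=False mimics `local_N * i` (absolute device ID).
    relative=True  mimics `local_N * (i - begin)` (position within the group).
    """
    num_devices = end - begin + 1
    local_n = n // num_devices          # rows handled by each device
    offsets = {}
    for i in range(end, begin - 1, -1): # same loop order as in dot()
        index = i - begin if relative else i
        offsets[i] = local_n * index
    return local_n, offsets


# Hypothetical configuration: 136 rows split over devices 2 and 3.
N, BEGIN, END = 136, 2, 3
out = [0.0] * N                          # stand-in for MultiGpu.out

local_n, absolute = gather_offsets(N, BEGIN, END, relative=False)
_, relative = gather_offsets(N, BEGIN, END, relative=True)

for device, offset in absolute.items():
    try:
        out[offset]                      # same kind of access as out[local_N*i]
    except IndexError:
        print(f"device {device}: offset {offset} is out of bounds for size {N}")

for device, offset in relative.items():
    out[offset]                          # stays inside out
    print(f"device {device}: offset {offset} covers rows {offset}..{offset + local_n - 1}")
```

Under these assumptions the first iteration (device 3) computes offset 204 against an array of size 136, which is the index reported in the title, while the relative form yields offsets 68 and 0. Whether `MultiGpu.x[i]` and the hard-coded source device `0` in the first `memcpyPeer` call need the same treatment depends on how those buffers are allocated, which this snippet does not show.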