NVIDIA / warp

A Python framework for high performance GPU simulation and graphics
https://nvidia.github.io/warp/

Inconsistent matrix indexing inside/outside a kernel #278

Open Luke-Skycrawler opened 3 months ago

Luke-Skycrawler commented 3 months ago

Bug Description

Constructing a matrix from vectors seems to give a different result depending on whether the code runs inside a kernel or outside one.

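import warp as wp

wp.init()

# Python scope: build a 3x3 matrix from three vectors and print a[0].
a = wp.mat33(wp.vec3(1.0), wp.vec3(2.0), wp.vec3(3.0))
print(a[0])

@wp.kernel
def test():
    # Kernel scope: same constructor call, same index.
    A = wp.mat33(wp.vec3(1.0), wp.vec3(2.0), wp.vec3(3.0))
    print(A[0])

wp.launch(test, dim=1)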

The first print gives (1, 1, 1) but the second gives (1, 2, 3). Outside a kernel the vectors become the rows of the matrix, while inside a kernel they become the columns. It would be better if both followed the same rule.

Warp 1.3.0 initialized:
   CUDA Toolkit 12.5, Driver 12.2
   Devices:
     "cpu"      : "Intel64 Family 6 Model 151 Stepping 2, GenuineIntel"
     "cuda:0"   : "NVIDIA GeForce RTX 3060" (12 GiB, sm_86, mempool enabled)
[1.0, 1.0, 1.0]
Module __main__ c767a68 load on device 'cuda:0' took 0.96 ms  (cached)
1 2 3
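
For anyone trying to picture the difference, here is a minimal pure-Python sketch of the two conventions. It does not call Warp at all; `mat_from_rows` and `mat_from_cols` are hypothetical helper names made up for this illustration, and nested lists stand in for `wp.mat33`. It only shows why indexing `[0]` gives different values depending on whether the input vectors are treated as rows or as columns.

```python
def mat_from_rows(*vecs):
    """Treat each input vector as one row of the matrix."""
    return [list(v) for v in vecs]


def mat_from_cols(*vecs):
    """Treat each input vector as one column of the matrix."""
    # zip(*vecs) groups the i-th component of every vector into row i.
    return [list(row) for row in zip(*vecs)]


# Stand-ins for wp.vec3(1.0), wp.vec3(2.0), wp.vec3(3.0):
# each vector has all three components set to the same value.
vecs = ([1.0, 1.0, 1.0], [2.0, 2.0, 2.0], [3.0, 3.0, 3.0])

by_rows = mat_from_rows(*vecs)
by_cols = mat_from_cols(*vecs)

print(by_rows[0])  # [1.0, 1.0, 1.0]
print(by_cols[0])  # [1.0, 2.0, 3.0]
```

The two results are transposes of each other, which matches the two outputs in the log above: the Python-scope print looks like the row-based result and the kernel print looks like the column-based one.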

System Information

No response

christophercrouzet commented 3 months ago

Hi @Luke-Skycrawler, thanks for reporting this issue and apologies for any inconvenience caused!

The intention is for matrices built from vectors to be row-major, with each vector becoming a row, so the kernel constructor is incorrect.

I'll discuss how to best tackle this issue with the team since addressing it would cause a breaking change.