taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0

Structs and StructFields: passing them into and returning them from kernels #8441

Open Datamance opened 10 months ago

Datamance commented 10 months ago

Hi,

Let's say I create a StructField like so:

import taichi as ti
import torch

ti.init(arch=ti.gpu)

@ti.dataclass
class Sphere:
    center: ti.math.vec3
    radius: float

sphere_field = Sphere.field(shape=(20, 20), name="<SphereField>")

sphere_field.from_torch({"center": torch.rand(20, 20, 3), "radius": torch.rand(20, 20)})

With the latest release of taichi (1.7.0), I still cannot pass this object (by reference or value/copy) into a kernel directly, as a parameter. Indeed, there seems to be nothing in taichi.types in the way of annotating StructFields. Instead, I have to treat it as a global (compile-time) variable and rely on a closure to bring it into the body of the kernel. The other thing we can do is initialize a @taichi.data_oriented class with a StructField member, and have the kernel method pull that field from self.

While the second option is acceptable for an object-oriented style of programming, the inability to decouple a top-level kernel from the global scope renders explicitly functional/pure programming pretty much impossible. This is not ideal from a clean coding/architecture standpoint.

This seems to be something that a lot of people want (see here, also the project backlog with "struct" as a query yields a bunch of issues that are variations on this request). Is this the sort of thing we can expect in the next version (1.8.0) of taichi? If so, where can we track progress?

Thanks again for making such a great library!

oliver-batchelor commented 10 months ago

I definitely agree, this would be awesome. What I currently do is pack my structures into vectors and write some little helper functions to convert between the two.

So for a sphere I'd do something like this:

@ti.func
def unpack_sphere(v: ti.math.vec4):
    # Components 0..2 hold the center, component 3 holds the radius.
    return Sphere(center=ti.math.vec3(v[0], v[1], v[2]), radius=v[3])

@ti.func
def pack_sphere(s: Sphere):
    return ti.math.vec4(s.center[0], s.center[1], s.center[2], s.radius)

# taichi structs don't let you have static members, but you can just add them afterwards!
Sphere.pack = pack_sphere
Sphere.unpack = unpack_sphere

@ti.kernel
def translate_spheres(spheres: ti.types.ndarray(dtype=ti.math.vec4, ndim=1), translation: ti.math.vec3):
    for i in range(spheres.shape[0]):
        s = Sphere.unpack(spheres[i])
        s.center += translation
        spheres[i] = Sphere.pack(s)

For a trivial example like this it seems like unnecessary boilerplate - but used in larger pieces of code it seems a lot more reasonable.

Datamance commented 10 months ago

@oliver-batchelor interesting workaround, but would there be any performance impact from marshaling bits and deserializing structs like this inside the kernel, or would that get optimized away by the compiler? If not, I think you're better off (performance-wise) with a static StructField.

Also, I'm seeing now from your example that taichi dataclasses (which just provide a thin wrapper around structs, from my understanding) don't support things like @staticmethod or @classmethod decorators. These would be nice to have for sure, but it's not a "show-stopper" for me at the moment.

Datamance commented 10 months ago

The other thing I considered was argpack types, but given the inability to pass a field/vector of argpacks, you end up in a situation that just recreates passing disjoint fields into kernels. This pretty much defeats the purpose of argpacks and makes it more cumbersome to write "fat kernels", which is a major selling point for Taichi otherwise. Maybe I'm just missing something obvious, but I can't think of a clean way to deal with semantically heterogeneous data that doesn't require use of globally-scoped data containers. Even compiler support for destructured assignment would be nice. Without that (unless you want to sacrifice performance) it seems that you're limited to index-fiddling and/or excessive boilerplate.

bobcao3 commented 10 months ago

Fields are global and just can't be "passed" into the kernel... The only way to do it is to annotate the argument with ti.template(). On the other hand, I'm not sure whether a struct ndarray is a thing or not; if so, that would be the solution.

oliver-batchelor commented 10 months ago

> @oliver-batchelor interesting workaround, but would there be any performance impact from marshaling bits and deserializing structs like this inside the kernel, or would that get optimized away by the compiler? If not, I think you're better off (performance-wise) with a static StructField.
>
> Also, I'm seeing now from your example that taichi dataclasses (which just provide a thin wrapper around structs, from my understanding) don't support things like @staticmethod or @classmethod decorators. These would be nice to have for sure, but it's not a "show-stopper" for me at the moment.

I haven't seen any bad performance from doing this. If anything, it seems better than the alternative, which is to pass data "Struct of Arrays" style in ndarrays, where you pass in one array for every field. Clearly it doesn't work so well with structs that have a lot of mixed data types, but most of mine are simple geometric types, which are OK.

I've been refactoring taichi_3d_gaussian_splatting into this style a bit here if you want to see what it looks like, though some other applications will have more demanding usage.

Datamance commented 10 months ago

> Fields are global and just can't be "passed" into the kernel... The only way to do it is to annotate the argument with ti.template(). On the other hand, I'm not sure whether a struct ndarray is a thing or not; if so, that would be the solution.

You're totally right - I've been relying on ndarray so heavily that I somehow completely missed the fact that fields can only be treated as global.

Out of curiosity: why are the two constructs treated with this distinction? Why allow users to pass in ndarrays by reference, but require them to treat fields as global?

bobcao3 commented 10 months ago

> > Fields are global and just can't be "passed" into the kernel... The only way to do it is to annotate the argument with ti.template(). On the other hand, I'm not sure whether a struct ndarray is a thing or not; if so, that would be the solution.
>
> You're totally right - I've been relying on ndarray so heavily that I somehow completely missed the fact that fields can only be treated as global.
>
> Out of curiosity: why are the two constructs treated with this distinction? Why allow users to pass in ndarrays by reference, but require them to treat fields as global?

Fields are an older design. The advanced layouts they allow require complex GPU runtime and memory management that is non-transparent and always present, which makes it hard to make a reference out of a field. For the CUDA backend especially, the host practically doesn't know the address of a field other than by asking the GPU runtime itself.