taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0
25.48k stars 2.28k forks

stmt 881 cannot have operand 872. in pbf2d update_grid #1359

Open ben441318936 opened 4 years ago

ben441318936 commented 4 years ago

Describe the bug
I am trying to create a differentiable PBF simulation based on the pbf2d example. I encountered a runtime error caused by the update_grid kernel when trying to take gradients.

Log

[Taichi] mode=release
[Taichi] version 0.6.14, llvm 8.0.1, commit c83a7e3d, python 3.7.7
[Taichi] Starting on arch=cuda
[E 06/29/20 13:06:53.855] [verify.cpp:taichi::lang::IRVerifier::basic_verify@40] stmt 881 cannot have operand 872.


To Reproduce

import taichi as ti

ti.init(arch=ti.gpu)

# params
num_particles = 10
grid_size = 3
max_num_particles_per_cell = 10

cell_size = 2.51
cell_recpr = 1.0 / cell_size

# variables
positions = ti.Vector(3, ti.f32)
grid_num_particles = ti.var(ti.i32)
grid2particles = ti.var(ti.i32)
loss = ti.var(ti.f32)

# layout
ti.root.dense(ti.i, num_particles).place(positions)
grid_snode = ti.root.dense(ti.ijk, grid_size)
grid_snode.place(grid_num_particles)
grid_snode.dense(ti.l, max_num_particles_per_cell).place(grid2particles)
ti.root.place(loss)

ti.root.lazy_grad()

@ti.func
def get_cell(pos):
    return ti.cast(pos * cell_recpr, ti.i32)

@ti.kernel
def update_grid():
    for i in range(num_particles):
        cell = get_cell(positions[i])
        # ti.Vector doesn't seem to support unpacking yet
        # but we can directly use int Vectors as indices

        # This does not work (from pbf2d example)
        offs = grid_num_particles[cell].atomic_add(1)
        grid2particles[cell, offs] = i

@ti.kernel
def init():
    for i in range(num_particles):
        for k in ti.static(range(3)):
            positions[i][k] = 1

def forward():
    grid_num_particles.fill(0)
    grid2particles.fill(0)
    update_grid()

@ti.kernel
def compute_loss():
    for i in range(num_particles):
        for k in ti.static(range(3)):
            loss += positions[i][k]

init()
loss[None] = 0

with ti.Tape(loss=loss):
    forward()
    compute_loss()
print("Loss:", loss[None])
print("Grad to initial pos: ", positions.grad[0][0], positions.grad[0][1], positions.grad[0][2])
xumingkuan commented 4 years ago

I found that this issue is caused by https://github.com/taichi-dev/taichi/blob/3c633f70d5d65bacc27895630284457f8d1a7609/taichi/transforms/auto_diff.cpp#L688 @yuanming-hu Shall we have a brief chat about this?

yuanming-hu commented 4 years ago

Sorry, I'm too swamped these days. I'll take a look at this sometime in the week of July 13. Sorry about the delay on my end.

xumingkuan commented 4 years ago

OK, I’ll try to figure out a hotfix then.

yuanming-hu commented 4 years ago

@ben441318936 The bad news is that offs = grid_num_particles[cell].atomic_add(1) is not supported in autodiff; the good news is that you don't need to differentiate update_grid at all, since it consists only of integer operations. Please consider moving update_grid outside ti.Tape().

ben441318936 commented 4 years ago

As a follow-up question, how should I structure the code if I want to do an end-to-end gradient in a PBF simulation? The forward operation in PBF proceeds as follows: find neighbors using a background grid, solve density constraints based on those neighbors, then update particle positions. This is repeated every step. My understanding is that I should wrap ti.Tape() around the entire forward operation. Is this not correct?

xumingkuan commented 4 years ago

Please check out https://taichi.readthedocs.io/en/stable/differentiable_programming.html (Global Data Access Rules). I only know that kernels violating such rules cannot be inside ti.Tape()...