taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0
25.54k stars, 2.29k forks

cache loop invariant global vars pass induces incorrect results in serial mode #8578

Open erizmr opened 3 months ago

erizmr commented 3 months ago

Related issue: #8576. Thanks to @jim19930609 for #8577. With that PR applied, there is still an issue in serial mode, as shown below:

import taichi as ti

ti.init(arch=ti.gpu, print_ir=True, print_ir_dbg_info=False, offline_cache=False, cache_loop_invariant_global_vars=True)

x = ti.field(float, shape=(3, 5))

@ti.kernel
def repro():
    ti.loop_config(serialize=True)
    for i in range(5):
        x[2, i] = x[2, i] + 1.0
        for j in range(1):
            x[2, i] = x[2, i] - 5.0
            print("x value ", x[2, i])
            for z in range(1):
                idx = 0
                if z == 0:
                    idx = 2
                x_print = x[idx, i]
                print("x value inside ", x_print)
                print("x value inside direct access", x[2, i])

repro()

Output:

x value  -4.000000
x value inside  1.000000
x value inside direct access -4.000000
x value  -4.000000
x value inside  1.000000
x value inside direct access -4.000000
x value  -4.000000
x value inside  1.000000
x value inside direct access -4.000000
x value  -4.000000
x value inside  1.000000
x value inside direct access -4.000000
x value  -4.000000
x value inside  1.000000
x value inside direct access -4.000000
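
For comparison, here is a plain-Python model of the same loop nest, with no Taichi involved. It is only an illustrative sketch and not part of the original report: the function name `expected_repro_output` is hypothetical, and it assumes the field starts zero-initialized and that every read sees the latest write. Under those assumptions all three prints in each iteration should show -4.000000, so the 1.000000 read through `x[idx, i]` looks like a stale value cached before the inner subtraction.

```python
def expected_repro_output(rows=3, cols=5):
    """Model the kernel with a nested list instead of a ti.field.

    Hypothetical helper for illustration; assumes x is all zeros at start.
    """
    x = [[0.0] * cols for _ in range(rows)]
    lines = []
    for i in range(cols):
        x[2][i] = x[2][i] + 1.0
        for j in range(1):
            x[2][i] = x[2][i] - 5.0
            lines.append(f"x value  {x[2][i]:.6f}")
            for z in range(1):
                idx = 0
                if z == 0:
                    idx = 2
                # Dynamic-index read: must observe the subtraction above.
                x_print = x[idx][i]
                lines.append(f"x value inside  {x_print:.6f}")
                lines.append(f"x value inside direct access {x[2][i]:.6f}")
    return lines


if __name__ == "__main__":
    print("\n".join(expected_repro_output()))
```

Comparing this model's lines against the output above isolates the mismatch to the "x value inside" line of each iteration.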