taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0

[BUG] Unexpected add behavior. #8529

Open Dinngger opened 4 months ago

Dinngger commented 4 months ago

Describe the bug: In-place addition behaves differently in ti.func and ti.kernel

To Reproduce

import taichi as ti
ti.init(ti.cuda)
v6 = ti.types.vector(6, ti.f32)

@ti.func
def vector_add(a, b):
    for i in ti.static(range(6)):
        a[i] += b[i]
    return a

@ti.kernel
def test_kernel(a: v6, b: v6):
    print(vector_add(a, b))
    print(a)
    for i in ti.static(range(6)):
        a[i] += b[i]
    print(a)

x = ti.Vector([1, 2, 3, 4, 5, 6], ti.f32)
y = ti.Vector([1, 1, 1, 1, 1, 1], ti.f32)
test_kernel(x, y)

Log/Screenshots

$ python3 ti_vector_add.py 
[Taichi] version 1.7.1, llvm 15.0.4, commit 0f143b2f, linux, python 3.8.10
[Taichi] Starting on arch=cuda
[2.000000, 3.000000, 4.000000, 5.000000, 6.000000, 7.000000]
[1.000000, 2.000000, 3.000000, 4.000000, 5.000000, 6.000000]
[1.000000, 2.000000, 3.000000, 4.000000, 5.000000, 6.000000]

Additional comments None

Dinngger commented 4 months ago

Another bug found.

import taichi as ti
ti.init(ti.cuda)

@ti.kernel
def mytest():
    for _ in range(1):
        x = ti.Vector.zero(ti.f32, 1)
        x[0] = 1
        res = ti.Vector.zero(ti.f32, 1)
        for i in range(1):
            res[i] = 2
        x = res
        print(x)
        print(x[0])

mytest()

The output is

$ python3 ti_vec_func2.py
[Taichi] version 1.8.0, llvm 15.0.4, commit 0da68467, linux, python 3.8.10
[Taichi] Starting on arch=cuda
[2.000000]
1.000000

17794 commented 4 months ago

I also found some puzzling behavior in Taichi:

import taichi as ti
import numpy as np

ti.init(arch=ti.gpu)
num = int(1e9)
@ti.kernel
def speed_test()->ti.float32:
   ti.loop_config(serialize=True)
   s2 = 0.0
   for i in range(num):
      s2+=1.0
   return s2
print(speed_test())
a = np.ones(num,dtype=np.float32)
res = np.array(0.0,dtype=np.float32)
a.sum(out=res)
print(res)

The output is:

[Taichi] version 1.7.1, llvm 15.0.4, commit 0f143b2f, linux, python 3.11.9
[Taichi] Starting on arch=cuda
536870912.0
1000000000.0
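
A note on the numbers above: the Taichi result, 536870912, is exactly 2^29. float32 has a 24-bit significand, so once a running sum is large enough, a small increment is rounded away entirely. At 2^24 adding 1.0 no longer changes the value, and at 2^29 (where neighbouring float32 values are 64 apart) any increment of 32 or less is lost. The sketch below emulates float32 addition with the standard library only. The helper names `to_f32`, `f32_add`, and `accumulate` are made up for this illustration. It shows the rounding effect, but it does not establish how the compiled Taichi loop reaches 2^29 rather than 2^24; nothing in this thread confirms that.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def f32_add(a: float, b: float) -> float:
    """Emulate one float32 addition: add, then round the result to float32."""
    return to_f32(to_f32(a) + to_f32(b))


def accumulate(start: float, step: float, n: int) -> float:
    """Add `step` to `start` n times, rounding to float32 after every add."""
    s = to_f32(start)
    for _ in range(n):
        s = f32_add(s, step)
    return s


# At 2**24 the spacing between float32 values is 2, so +1.0 is rounded away.
print(f32_add(2.0**24, 1.0) == 2.0**24)    # True

# At 2**29 the spacing is 64, so an increment of 32 is also rounded away ...
print(f32_add(2.0**29, 32.0) == 2.0**29)   # True
# ... while an increment of 64 still registers.
print(f32_add(2.0**29, 64.0) == 2.0**29)   # False

# A sequential sum of 1.0 gets stuck once it reaches 2**24.
print(accumulate(2.0**24 - 4, 1.0, 10))    # 16777216.0
```

NumPy's `sum` gives the exact answer here because it uses pairwise summation on contiguous arrays, so its partial sums stay small relative to each other. Accumulating into an `f64` variable inside the kernel would be one way to check whether this rounding is what the Taichi loop is hitting.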

bobcao3 commented 3 months ago

@Dinngger for your original post, that case is not a bug. Kernel parameters are constant. We support the assignment syntax because some people rely on the C-like behavior, but the results only persist within a single offload (e.g. within a single parallel for loop).

We should probably make writing to a kernel parameter a warning.

The second case is weird indeed. My guess is a bug in store-to-load forwarding that failed to detect a modification.