taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0

Unwanted value while passing values between sparse field #8519

Open Ruoyu66666 opened 5 months ago

Ruoyu66666 commented 5 months ago

Hello,

In this kernel, particles stream between each local lattice site and its neighboring sites:

import taichi as ti
ti.init(arch=ti.cpu,random_seed=84,default_fp=ti.f64,device_memory_GB=10)
q=19
@ti.kernel
def Streaming():
    for i,j,k in ti.ndrange((4, lx-4),(4, ly-4),(4, lz-4)):
        solid_loc = solid_map[i,j,k]
        if (solid_loc == 0):
            vct_loc = ti.Vector([i,j,k])
            for s in range(1,q):
                ii,jj,kk = periodic_index2(vct_loc+c[s])
                solid_shift = solid_map[ii,jj,kk]
                if (solid_shift==0):
                    f_r_1[s,ii,jj,kk] = f_r[s,i,j,k]
                    f_b_1[s,ii,jj,kk] = f_b[s,i,j,k]
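
For cross-checking, the same push-streaming step can be sketched in plain Python on a tiny grid and compared against the Taichi output for a single step. This is only a sketch under assumptions: `stream_reference` is a hypothetical helper, the populations are stored in a dict keyed by `(s, i, j, k)`, and the three-entry `c` below is a toy velocity set rather than the real 19-direction one. Within the interior range used by the kernel no index wraps, so the modulo here never changes the result.

```python
# Pure-Python reference for one push-streaming step (no Taichi required).
# Assumptions: c is a list of integer lattice velocities, a cell is fluid
# when solid.get((i, j, k), 0) == 0, and the grid is small enough to loop over.

def stream_reference(f, solid, c, shape, margin=4):
    """Return a dict mapping (s, i, j, k) to the streamed value."""
    lx, ly, lz = shape
    out = {}
    for i in range(margin, lx - margin):
        for j in range(margin, ly - margin):
            for k in range(margin, lz - margin):
                if solid.get((i, j, k), 0) != 0:
                    continue  # skip solid source cells
                for s in range(1, len(c)):
                    ii = (i + c[s][0]) % lx
                    jj = (j + c[s][1]) % ly
                    kk = (k + c[s][2]) % lz
                    if solid.get((ii, jj, kk), 0) == 0:
                        # push the local population to the neighbor
                        out[(s, ii, jj, kk)] = f.get((s, i, j, k), 0.0)
    return out


# Toy usage: one population moving in +x on a 10x10x10 grid.
c_toy = [(0, 0, 0), (1, 0, 0), (-1, 0, 0)]
f_toy = {(1, 4, 4, 4): 2.5}
streamed = stream_reference(f_toy, {}, c_toy, (10, 10, 10))
```

Any entry of `f_r_1` that differs from this reference after one step would point to the cell where the unexpected value first shows up.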

All fields are defined in a sparse structure:

solid_map = ti.field(ti.i8)
f_r = ti.field(ti.f64)
f_b = ti.field(ti.f64)
f_r_1 = ti.field(ti.f64)
f_b_1 = ti.field(ti.f64)

n_mem_partition = 2

cell1 = ti.root.pointer(ti.ijk, (lx//n_mem_partition+1,ly//n_mem_partition+1,lz//n_mem_partition+1))
cell1.dense(ti.ijk,(n_mem_partition,n_mem_partition,n_mem_partition)).place(solid_map)

cell3 = ti.root.pointer(ti.ijkl, (1,lx//n_mem_partition+1,ly//n_mem_partition+1,lz//n_mem_partition+1))
cell3.dense(ti.ijkl,(32,n_mem_partition,n_mem_partition,n_mem_partition)).place(f_r)
cell3.dense(ti.ijkl,(32,n_mem_partition,n_mem_partition,n_mem_partition)).place(f_b)
cell3.dense(ti.ijkl,(32,n_mem_partition,n_mem_partition,n_mem_partition)).place(f_r_1)
cell3.dense(ti.ijkl,(32,n_mem_partition,n_mem_partition,n_mem_partition)).place(f_b_1)
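
As a sanity check on the layout above, the index range covered along each axis should be the number of pointer cells times the dense block size. The arithmetic can be sketched as follows; `covered_extent` is a hypothetical helper and `lx = 100` is an assumed grid size used only for illustration.

```python
def covered_extent(n_pointer_cells, dense_block):
    # Indices addressable along one axis of a pointer -> dense hierarchy.
    return n_pointer_cells * dense_block


lx = 100             # assumed value, not taken from the original post
n_mem_partition = 2
q = 19

# Spatial axis: (lx // n_mem_partition + 1) pointer cells of size n_mem_partition.
spatial = covered_extent(lx // n_mem_partition + 1, n_mem_partition)

# Direction axis: 1 pointer cell holding a dense block of 32 entries.
direction = covered_extent(1, 32)

print(spatial >= lx, direction >= q)
```

With these numbers both axes are covered, so under this assumption the layout itself should not push any index out of range.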

Also the function:

@ti.func
def periodic_index2(vct_loc):
    loc_out = vct_loc
    if vct_loc[0]<0:     loc_out[0] = lx-1
    if vct_loc[0]>lx-1:  loc_out[0] = 0
    if vct_loc[1]<0:     loc_out[1] = ly-1
    if vct_loc[1]>ly-1:  loc_out[1] = 0
    if vct_loc[2]<0:     loc_out[2] = lz-1
    if vct_loc[2]>lz-1:  loc_out[2] = 0

    return loc_out[0],loc_out[1],loc_out[2] 
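
For comparison, the wrapping rule above can be written in plain Python next to a modulo version. This is a sketch with hypothetical helper names (`wrap_branchy`, `wrap_modulo`); the two agree only when an index leaves the domain by a single cell, which is the case for unit lattice velocities.

```python
def wrap_branchy(v, shape):
    # Same rule as periodic_index2: clamp to the opposite face.
    out = list(v)
    for axis, n in enumerate(shape):
        if v[axis] < 0:
            out[axis] = n - 1
        if v[axis] > n - 1:
            out[axis] = 0
    return tuple(out)


def wrap_modulo(v, shape):
    # True periodic wrap for any offset.
    return tuple(x % n for x, n in zip(v, shape))


shape = (10, 10, 10)
one_step = (-1, 5, 10)    # one cell outside on two axes
two_steps = (-2, 5, 11)   # two cells outside on two axes
```

For `one_step` both helpers return the same indices, while for `two_steps` they differ, so the branch-based rule is only safe as long as every component of `c[s]` is -1, 0, or 1.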

The simulation crashes after tens of steps. At one specific step, f_r_1 receives a large value that does not come from f_r (and likewise f_b_1 from f_b); this is where the error first appears.

It looks like an overflow error, but it does not occur during the first several steps. Do you have any suggestions? Thank you in advance.