taichi-dev / taichi

Productive, portable, and performant GPU programming in Python.
https://taichi-lang.org
Apache License 2.0

A bug related to blur that appears in vulkan backend only #5312

Closed theAfish closed 2 years ago

theAfish commented 2 years ago

I am writing a simulation based on blurring a Gaussian noise field and found that it behaves abnormally on the Vulkan backend but works fine on the GPU and CPU backends (see the screenshots below).

To Reproduce

Here is a sample extracted from my code:

import time
import taichi as ti

PI = 3.14159265
ti.init(arch=ti.vulkan, random_seed=int(time.time()))   # cpu/gpu ok, vulkan not ok
N=(256,256)
phi = ti.Vector.field(2, float, shape=N)
theta = ti.field(float, shape=N)
noise = ti.Vector.field(2, float, shape=N)
noise_ = ti.Vector.field(2, float, shape=N)
img = ti.Vector.field(3, float, shape=N)
window = ti.ui.Window("test", res=(512, 512))
canvas = window.get_canvas()

@ti.kernel
def add_noise():
    for x, y in noise:
        noise[x, y].x = ti._funcs.randn()
        noise[x, y].y = ti._funcs.randn()

@ti.kernel
def blur():
    for x, y in noise:
        noise_[x, y] += noise[x,y+1] + noise[x+1,y] + noise[x,y-1] + noise[x-1, y]
        noise_[x, y] /= 5.00

@ti.kernel
def set_noise():
    for x, y in phi:
        phi[x, y] = noise[x, y] * (1.-phi[x,y])
        theta[x, y] = ti.atan2(phi[x, y].y, phi[x, y].x)

@ti.kernel
def draw():
    for x, y in img:
        img[x, y] = ti.Vector([ti.sin(theta[x, y])+1., ti.sin(theta[x, y]+2*PI/3)+1., ti.sin(theta[x, y]+4*PI/3)+1.])/2 * phi[x, y].norm()

while window.running:
    noise.fill(0)
    noise_.fill(0)
    add_noise()
    for _ in range(5):
        blur()
        noise.copy_from(noise_)
    set_noise()
    draw()
    canvas.set_image(img)
    window.show()

Log/Screenshots

After the code runs for several seconds, the output differs between backends.

Running with GPU/CPU: image

Running with vulkan: image

The logs for Vulkan, GPU, and CPU are all the same:

$ python my_sample_code.py
[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9
[Taichi] Starting on arch=vulkan
[I 07/02/22 11:10:02.180 56108] [D:/a/taichi/taichi/taichi/backends/vulkan/vulkan_device_creator.cpp:pick_physical_device@364] Found Vulkan Device 0 (NVIDIA GeForce RTX 2060)
[I 07/02/22 11:10:02.180 56108] [D:/a/taichi/taichi/taichi/backends/vulkan/vulkan_device_creator.cpp:find_queue_families@142] Async compute queue 2, graphics queue 0
[I 07/02/22 11:10:02.180 56108] [D:/a/taichi/taichi/taichi/backends/vulkan/vulkan_device_creator.cpp:find_queue_families@142] Async compute queue 2, graphics queue 0
[I 07/02/22 11:10:02.180 56108] [D:/a/taichi/taichi/taichi/backends/vulkan/vulkan_device_creator.cpp:create_logical_device@432] Vulkan Device "NVIDIA GeForce RTX 2060" supports Vulkan 0 version 1.3.194
GLFW Error 65537: The GLFW library is not initialized
...

Additional comments

PS F:\OneDrive\PythonProjects> ti diagnose
[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9

*******************************************
**      Taichi Programming Language      **
*******************************************

Docs:   https://docs.taichi-lang.org/
GitHub: https://github.com/taichi-dev/taichi/
Forum:  https://forum.taichi.graphics/

Taichi system diagnose:

python: 3.7.9 (default, Aug 31 2020, 17:10:11) [MSC v.1916 64 bit (AMD64)]
system: win32
executable: e:\pymol\python.exe
platform: Windows-10-10.0.22581-SP0
architecture: 64bit WindowsPE
uname: uname_result(system='Windows', node='LAPTOP-F8O37MQF', release='10', version='10.0.22581', machine='AMD64', processor='Intel64 Family 6 Model 165 Stepping 2, GenuineIntel')
locale: zh_CN.cp936
PATH: E:\Pymol;E:\Pymol\Library\mingw-w64\bin;E:\Pymol\Library\usr\bin;E:\Pymol\Library\bin;E:\Pymol\Scripts;E:\Pymol\bin;E:\Pymol\condabin;C:\Program Files (x86)\Common Files\Intel\Share
d Libraries\redist\intel64_win\compiler;D:\Vulkan\Bin;C:\Program Files\Common Files\Oracle\Java\javapath;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.4\bin;C:\Program Files\NVID
IA GPU Computing Toolkit\CUDA\v11.4\libnvvp;C:\Program Files (x86)\Common Files\Intel\Shared Libraries\redist\intel64_win\compiler;C:\Program Files (x86)\Common Files\Raytrix Shared\Bin;C
:\Program Files\PlasticSCM5\server;C:\Program Files\PlasticSCM5\client;C:\Program Files\Java\jdk1.8.0_281\bin;C:\Program Files (x86)\Common Files\Oracle\Java\javapath;C:\windows\system32;
C:\windows;C:\windows\System32\Wbem;C:\windows\System32\WindowsPowerShell\v1.0;C:\windows\System32\OpenSSH;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;C:\Program Files\NVIDIA C
orporation\NVIDIA NvDLISR;D:\Git\cmd;C:\Program Files\Wolfram Research\WolframScript;E:\Anaconda3;E:\Anaconda3\Scripts;E:\Anaconda3\Library\bin;E:\Atompsk\Atomsk;C:\Program Files\NVIDIA C
orporation\Nsight Compute 2021.2.0;C:\Program Files\Java\jdk-16.0.2;.;C:\Program Files\Docker\Docker\resources\bin;C:\ProgramData\DockerDesktop\version-bin;C:\WINDOWS\system32;C:\WINDOWS;
C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0;C:\WINDOWS\System32\OpenSSH;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShe
ll\v1.0;C:\WINDOWS\System32\OpenSSH;"C:\Program Files\Java\jdk-16.0.2\bin;C:\Users\taoyu\AppData\Local\Microsoft\WindowsApps";.;C:\Users\taoyu\.babun;C:\Users\taoyu\AppData\Local\Microsof
t\WindowsApps;D:\Microsoft VS Code\bin;e:\pymol\lib\site-packages\taichi\_lib
PYTHONPATH: ['E:\\Pymol\\Scripts\\ti.exe', 'e:\\pymol\\python37.zip', 'e:\\pymol\\DLLs', 'e:\\pymol\\lib', 'e:\\pymol', 'e:\\pymol\\lib\\site-packages', 'e:\\pymol\\lib\\site-packages\\wi
n32', 'e:\\pymol\\lib\\site-packages\\win32\\lib', 'e:\\pymol\\lib\\site-packages\\Pythonwin']

`lsb_release` not available: [WinError 2] The system cannot find the file specified.

import: <module 'taichi' from 'e:\\pymol\\lib\\site-packages\\taichi\\__init__.py'>

cc: False
cpu: True
metal: False
opengl: True
cuda: True
vulkan: True

Sat Jul  2 11:15:19 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 512.78       Driver Version: 512.78       CUDA Version: 11.6     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:01:00.0  On |                  N/A |
| N/A   52C    P0    43W /  N/A |    675MiB /  6144MiB |     47%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2180    C+G   ...d\runtime\WeChatAppEx.exe    N/A      |
|    0   N/A  N/A     10904    C+G   ...bbwe\Microsoft.Photos.exe    N/A      |
|    0   N/A  N/A     11396    C+G   ...perience\NVIDIA Share.exe    N/A      |
|    0   N/A  N/A     22596    C+G   ...e\Current\LogiOverlay.exe    N/A      |
|    0   N/A  N/A     30800    C+G   ...artMenuExperienceHost.exe    N/A      |
|    0   N/A  N/A     37596    C+G   ...urrent\LogiOptionsMgr.exe    N/A      |
|    0   N/A  N/A     40164    C+G   ...n1h2txyewy\SearchHost.exe    N/A      |
|    0   N/A  N/A     40456    C+G   D:\Ovito\ovito.exe              N/A      |
|    0   N/A  N/A     41604    C+G   ...ge\Application\msedge.exe    N/A      |
|    0   N/A  N/A     41644    C+G   ...mmandCenterBackground.exe    N/A      |
|    0   N/A  N/A     42908    C+G   ...lPanel\SystemSettings.exe    N/A      |
|    0   N/A  N/A     47456    C+G   ...vhw5zqvyzm\DsecForUWP.exe    N/A      |
|    0   N/A  N/A     49188    C+G   ...8bbwe\WindowsTerminal.exe    N/A      |
|    0   N/A  N/A     50204    C+G   ...y\ShellExperienceHost.exe    N/A      |
|    0   N/A  N/A     53556    C+G   ...tracted\WechatBrowser.exe    N/A      |
|    0   N/A  N/A     54884    C+G   C:\Windows\explorer.exe         N/A      |
|    0   N/A  N/A     55084    C+G   ...jag6ke6\HP.JumpStarts.exe    N/A      |
|    0   N/A  N/A     55240    C+G   ...kzcwy\mcafee-security.exe    N/A      |
|    0   N/A  N/A     56120    C+G   ...ystemEventUtilityHost.exe    N/A      |
|    0   N/A  N/A     56244    C+G   ...ekyb3d8bbwe\HxOutlook.exe    N/A      |
|    0   N/A  N/A     56344    C+G   ...ightStudio-background.exe    N/A      |
|    0   N/A  N/A     56632    C+G   ...2txyewy\TextInputHost.exe    N/A      |
+-----------------------------------------------------------------------------+

[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9

[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9
[Taichi] Starting on arch=x64

[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9
[Taichi] Starting on arch=opengl

[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9
[Taichi] Starting on arch=cuda

[Taichi] version 1.0.3, llvm 10.0.0, commit fae94a21, win, python 3.7.9

*******************************************
**      Taichi Programming Language      **
*******************************************

Docs:   https://docs.taichi-lang.org/
GitHub: https://github.com/taichi-dev/taichi/
Forum:  https://forum.taichi.graphics/

Running example minimal ...
[Taichi] Starting on arch=x64
42.0
>>> Running time: 0.35s
42

Consider attaching this log when maintainers ask about system information.
>>> Running time: 10.11s

bobcao3 commented 2 years ago

It could be that out-of-range color values are handled differently. Can you try clamping the values between 0 and 1 and check whether the backends then produce the same output?
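
A minimal sketch of the clamping suggested here, written in plain Python rather than as a Taichi kernel so that it runs anywhere. The helper names `clamp01` and `shade` are hypothetical; the color formula mirrors the one in `draw()` from the sample code, and nothing here is taken from the Taichi API.

```python
import math

def clamp01(v):
    """Limit a single color channel to the displayable range [0, 1]."""
    return max(0.0, min(1.0, v))

def shade(theta, phi_norm):
    """Per-pixel color as in draw(): three phase-shifted sines scaled by |phi|.

    Without the clamp, any |phi| > 1 can push a channel above 1.
    """
    raw = [
        (math.sin(theta + k * 2.0 * math.pi / 3.0) + 1.0) / 2.0 * phi_norm
        for k in range(3)
    ]
    return [clamp01(c) for c in raw]
```

In the original kernel the same idea would presumably be applied to each component of `img[x, y]` before it is passed to `canvas.set_image`.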

theAfish commented 2 years ago

That does not work. I believe this is not caused by the coloring of the img field, since once the "white square" appears, it does not disappear in the following frames. This also makes my noise-based simulation unstable on the Vulkan backend but not on the GPU/CPU backends.

neozhaoliang commented 2 years ago

I think it is caused by floating-point overflow on the Vulkan backend: the randn function can return arbitrarily large floats within the 32-bit range, and multiplying them together may overflow. The squares disappear if one clamps phi to a small range such as [-10, 10].
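
A rough illustration of why clamping helps, assuming the overflow explanation above is right. It is plain Python, not Taichi, and `clamp`, `step`, and `run` are hypothetical names. The update `phi = noise * (1 - phi)` is the one from `set_noise()`; the sketch shows that once `phi` becomes infinite it never recovers on its own, which would match the persistent "white square", while a clamp to [-10, 10] brings it back into a finite range.

```python
def clamp(v, lo, hi):
    """Restrict v to the closed interval [lo, hi]."""
    return max(lo, min(hi, v))

def step(phi, noise, limit=None):
    """One update phi = noise * (1 - phi), optionally clamped to [-limit, limit]."""
    phi = noise * (1.0 - phi)
    if limit is not None:
        phi = clamp(phi, -limit, limit)
    return phi

def run(phi, noises, limit=None):
    """Apply step() once per noise sample and return the final phi."""
    for n in noises:
        phi = step(phi, n, limit)
    return phi

# Starting from an overflowed value, the unclamped recurrence stays infinite,
# while the clamped one returns to a bounded value.
stuck = run(float("inf"), [0.5, -0.3, 1.2])
recovered = run(float("inf"), [0.5, -0.3, 1.2], limit=10.0)
```

This only models a single cell with hand-picked noise values; it does not reproduce the backend difference itself.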

theAfish commented 2 years ago

Thank you very much!