google / forma

An efficient vector-graphics renderer
Apache License 2.0

Bug showing up in the circles demo #28

Open ckaran opened 1 year ago

ckaran commented 1 year ago

I just ran across forma and decided to see how well it performs by running the circles demo with larger and larger numbers of circles. At 100,000 circles, things get weird. The CPU renderer worked perfectly, but the GPU (both high and low) had weird tearing artifacts (if you have someplace I can upload a video to, I can work to capture one for you so you can see what I'm seeing).

The commands I used were the following:

cargo run --release -p demo -- c circles 100000
cargo run --release -p demo -- l circles 100000
cargo run --release -p demo -- h circles 100000

I'm not a graphics person, so I'm not sure if I'm doing something wrong; I'm just trying to use forma as an easy-to-use 2D vector drawing library. If I'm supposed to be doing something different, please let me know.

Meta

lsb_release -a:
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 22.04.1 LTS
Release:    22.04
Codename:   jammy

uname -a:
Linux Anvil 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

rustc -vV:
rustc 1.66.0 (69f9c33d7 2022-12-12)
binary: rustc
commit-hash: 69f9c33d71c871fc16ac445211281c6e7a340943
commit-date: 2022-12-12
host: x86_64-unknown-linux-gnu
release: 1.66.0
LLVM version: 15.0.2

cargo -vV:
cargo 1.66.0 (d65d197ad 2022-11-15)
release: 1.66.0
commit-hash: d65d197ad5c6c09234369f219f943e291d4f04b9
commit-date: 2022-11-15
host: x86_64-unknown-linux-gnu
libgit2: 1.5.0 (sys:0.15.0 vendored)
libcurl: 7.83.1-DEV (sys:0.4.55+curl-7.83.1 vendored ssl:OpenSSL/1.1.1q)
os: Ubuntu 22.04 (jammy) [64-bit]

rustc +beta -vV:
rustc 1.67.0-beta.2 (352eb59a4 2022-12-13)
binary: rustc
commit-hash: 352eb59a4c33abf739914422f2ad975925750146
commit-date: 2022-12-13
host: x86_64-unknown-linux-gnu
release: 1.67.0-beta.2
LLVM version: 15.0.6

cargo +beta -vV:
cargo 1.67.0-beta.2 (f6e737b1e 2022-12-02)
release: 1.67.0-beta.2
commit-hash: f6e737b1e3386adb89333bf06a01f68a91ac5306
commit-date: 2022-12-02
host: x86_64-unknown-linux-gnu
libgit2: 1.5.0 (sys:0.15.0 vendored)
libcurl: 7.86.0-DEV (sys:0.4.59+curl-7.86.0 vendored ssl:OpenSSL/1.1.1q)
os: Ubuntu 22.04 (jammy) [64-bit]

rustc +nightly -vV:
rustc 1.68.0-nightly (935dc0721 2022-12-19)
binary: rustc
commit-hash: 935dc07218b4bf6e20231e44eb9263b612fd649b
commit-date: 2022-12-19
host: x86_64-unknown-linux-gnu
release: 1.68.0-nightly
LLVM version: 15.0.6

cargo +nightly -vV:
cargo 1.68.0-nightly (c994a4a63 2022-12-18)
release: 1.68.0-nightly
commit-hash: c994a4a638370bc7e0ffcbb0e2865afdfa7d4415
commit-date: 2022-12-18
host: x86_64-unknown-linux-gnu
libgit2: 1.5.0 (sys:0.15.0 vendored)
libcurl: 7.86.0-DEV (sys:0.4.59+curl-7.86.0 vendored ssl:OpenSSL/1.1.1q)
os: Ubuntu 22.04 (jammy) [64-bit]

gcc -v:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.3.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04) 

g++ -v:
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/11/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 11.3.0-1ubuntu1~22.04' --with-bugurl=file:///usr/share/doc/gcc-11/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-11 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --enable-cet --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-11-xKiWfi/gcc-11-11.3.0/debian/tmp-gcn/usr --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-serialization=2
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 11.3.0 (Ubuntu 11.3.0-1ubuntu1~22.04) 

clang -v:
Ubuntu clang version 14.0.0-1ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64

clang++ -v:
Ubuntu clang version 14.0.0-1ubuntu1
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/11
Found candidate GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Selected GCC installation: /usr/bin/../lib/gcc/x86_64-linux-gnu/12
Candidate multilib: .;@m64
Selected multilib: .;@m64

$ nvidia-smi
Tue Dec 20 16:09:55 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.60.11    Driver Version: 525.60.11    CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A400...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   49C    P8    16W /  N/A |    388MiB /  8192MiB |     10%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      2553      G   /usr/lib/xorg/Xorg                192MiB |
|    0   N/A  N/A      2687      G   /usr/bin/gnome-shell               51MiB |
|    0   N/A  N/A    198605      G   ...1/usr/lib/firefox/firefox      118MiB |
+-----------------------------------------------------------------------------+

dragostis commented 1 year ago

Thanks for reporting this. This is a known limitation on GPU: if too many layers overlap in one pixel, it will create artefacts. Right now, we don't have a way to record this, but we should add a flag and simply error out when it happens.

ckaran commented 1 year ago

How many layers can overlap?

dragostis commented 1 year ago

The number of layers that can overlap inside of a tile is not limited, only the layers that cross tile borders. These are limited by the queue that's passed from one layer to another. This is currently 128, but I think it makes sense to try and spill to global memory as well, e.g. 4096. Even in that case, we might still have cases where that limit is reached, so we will still need a way to report back that the render is incorrect.
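
To make the idea concrete, here is a minimal CPU-side sketch of a bounded queue that spills and then flags an overflow. It is not forma's implementation: the type `LayerQueue`, the enum `PushOutcome`, and both constants are hypothetical names, and treating 4096 as the total capacity (fast slots plus spill) is an assumption. Only the numbers 128 and 4096 come from the comment above.

```rust
// Hypothetical model of the queue described above; not forma's actual code.
// 128 is the current limit mentioned in the thread; 4096 is the suggested
// size once spilling is in place, ASSUMED here to be the total capacity.
const LOCAL_CAPACITY: usize = 128;
const TOTAL_CAPACITY: usize = 4096;

#[derive(Debug, PartialEq)]
enum PushOutcome {
    /// Stored in the small, fast queue.
    Local,
    /// Stored in the larger spill area (stands in for GPU global memory).
    Spilled,
    /// Dropped: the render can no longer be trusted.
    Overflow,
}

struct LayerQueue {
    local: Vec<u32>,
    spill: Vec<u32>,
    overflowed: bool,
}

impl LayerQueue {
    fn new() -> Self {
        LayerQueue {
            local: Vec::with_capacity(LOCAL_CAPACITY),
            spill: Vec::new(),
            overflowed: false,
        }
    }

    /// Queue a layer id, spilling when the fast queue is full and
    /// recording an overflow when the spill area is full as well.
    fn push(&mut self, layer_id: u32) -> PushOutcome {
        if self.local.len() < LOCAL_CAPACITY {
            self.local.push(layer_id);
            PushOutcome::Local
        } else if self.len() < TOTAL_CAPACITY {
            self.spill.push(layer_id);
            PushOutcome::Spilled
        } else {
            self.overflowed = true;
            PushOutcome::Overflow
        }
    }

    /// Number of layers currently queued (fast queue plus spill area).
    fn len(&self) -> usize {
        self.local.len() + self.spill.len()
    }

    /// True if at least one layer was dropped, i.e. the result is incorrect.
    fn overflowed(&self) -> bool {
        self.overflowed
    }
}
```

In this sketch, a caller would check `overflowed()` after rendering and treat the frame as invalid if it returns `true`, which is the kind of report-back discussed above.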

ckaran commented 1 year ago

Thank you for the explanation. I agree that it would make sense to report back a rendering error as it may not always be obvious when it could occur (e.g., if stuff is moving around the screen randomly). Do you know what the maximum queue length can be? Or if there is a way of dynamically checking and updating it? My thought is that if Forma can probe the GPU to determine what the maximum queue length is at program startup, then we can adjust our code using info that Forma provides (if that makes sense).

dragostis commented 1 year ago

For now, the queue will always be 128. Even when not enough memory is available, the shader should be able to simply spill this to global memory. Once the error reporting is put in place, that should report back the maximum number. However, keep in mind that this number is quite hard to reach: even with very transparent objects, not much is comprehensible after so many blends.

ckaran commented 1 year ago

You're right that 128 layers is a bit much. My concern was just about the artifacts that are produced when things are moving around randomly, like if you're doing something like this, or if you're rendering particles for a physics simulation of ideal gas particles in a box (simulation is 3D, rendering uses sprites or forma and orthographic projection as a quick & dirty visualizer). Not exactly a major concern of yours, I know, but that is the kind of stuff I was thinking about.
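
For a scene like the particle visualizer, one rough way to anticipate the limit is a CPU-side check that counts how many circles straddle the border of a given tile. The sketch below is only an illustration under loud assumptions: `Circle`, `Tile`, and `crossing_count` are hypothetical names, the tile size is whatever the caller passes in, and counting one circle as one border-crossing layer may not match how forma actually accounts for its queue.

```rust
// Hypothetical preflight check; none of these types come from forma.
#[derive(Clone, Copy)]
struct Circle {
    cx: f32,
    cy: f32,
    r: f32,
}

/// Axis-aligned square tile with its top-left corner at (x, y).
struct Tile {
    x: f32,
    y: f32,
    size: f32,
}

/// True if the disc touches the tile at all: clamp the center to the tile
/// and compare the distance to the nearest tile point with the radius.
fn intersects(c: &Circle, t: &Tile) -> bool {
    let nx = c.cx.clamp(t.x, t.x + t.size);
    let ny = c.cy.clamp(t.y, t.y + t.size);
    let (dx, dy) = (c.cx - nx, c.cy - ny);
    dx * dx + dy * dy <= c.r * c.r
}

/// True if the disc lies entirely inside the tile.
fn contained(c: &Circle, t: &Tile) -> bool {
    c.cx - c.r >= t.x
        && c.cx + c.r <= t.x + t.size
        && c.cy - c.r >= t.y
        && c.cy + c.r <= t.y + t.size
}

/// Count circles that touch the tile but also extend beyond it,
/// i.e. circles that cross the tile's border.
fn crossing_count(circles: &[Circle], t: &Tile) -> usize {
    circles
        .iter()
        .filter(|c| intersects(c, t) && !contained(c, t))
        .count()
}
```

Comparing `crossing_count` for the busiest tiles against 128 would give an early warning, though only forma itself can say for certain when the queue overflows.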

dragostis commented 1 year ago

That makes sense. This is definitely a use case we care about. I think the global memory spill approach would basically solve this issue almost completely and we should focus on it.

ckaran commented 1 year ago

Sounds good to me; I don't have a graphics background, so I have to trust your judgement on this.