Describe the bug
I am trying to use GGUI inside an NVIDIA Docker container, but it fails. The container runs CUDA applications without problems; only GGUI fails. With GGUI disabled, everything works.
With the CPU backend for Taichi, everything also works.
Log/Screenshots
Please post the full log of the program (instead of just a few lines around the error message, unless the log is > 1000 lines). This will help us diagnose what's happening. For example:
$ roslaunch taichislam taichislam-d435.launch show:=true output_map:=true
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
[Taichi] Starting on arch=cuda
[I 10/28/22 22:51:47.740 267] [vulkan_device_creator.cpp:pick_physical_device@394] Found Vulkan Device 0 (llvmpipe (LLVM 12.0.0, 256 bits))
[I 10/28/22 22:51:47.740 267] [vulkan_device_creator.cpp:create_logical_device@462] Vulkan Device "llvmpipe (LLVM 12.0.0, 256 bits)" supports Vulkan 0 version 1.1.182
WARNING: lavapipe is not a conformant vulkan implementation, testing use only.
[I 10/28/22 22:51:47.745 267] [vulkan_device.cpp:create_swap_chain@2416] Creating suface of 1920x1080
Initializing submap with tsdf...
[SyncBagPlayer] Ready.
[SyncBagPlayer] Updated SYS_T0 1666997506.8756475 BAG_T0 1666617549.55239 Start bag 0.0
TSDF map initialized blocks 63x63x7
TSDF map initialized blocks 63x63x7
TaichiSLAMNode initialized
[E 10/28/22 22:51:50.562 267] [vulkan_cuda_interop.cpp:get_device_mem_handle@64] vkGetMemoryFdKHR is nullptr
Traceback (most recent call last):
File "/home/xuhao/d2slam_ws/src/TaichiSLAM/scripts/taichislam_node.py", line 384, in <module>
taichislamnode.rendering()
File "/home/xuhao/d2slam_ws/src/TaichiSLAM/scripts/taichislam_node.py", line 256, in rendering
self.render.rendering()
File "/home/xuhao/d2slam_ws/src/TaichiSLAM/scripts/../taichi_slam/utils/visualization.py", line 212, in rendering
self.canvas.scene(scene)
File "/usr/local/lib/python3.8/dist-packages/taichi/ui/canvas.py", line 136, in scene
self.canvas.scene(scene.scene)
RuntimeError: [vulkan_cuda_interop.cpp:get_device_mem_handle@64] vkGetMemoryFdKHR is nullptr
...
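One thing that stands out in the log above is that the only Vulkan device found is llvmpipe, a CPU rasterizer, rather than the NVIDIA GPU. Whether that is what makes vkGetMemoryFdKHR come back as nullptr is an assumption on my part, not something the log proves. As a minimal sketch for checking other logs for the same pattern, the helper below scans Taichi log text for "Found Vulkan Device" lines. The function names and the list of software-renderer names are my own and are not part of Taichi.

```python
import re

# Device-name substrings treated as CPU (software) Vulkan implementations.
# Assumption: this short list is illustrative, taken from the names in the log.
SOFTWARE_RENDERERS = ("llvmpipe", "lavapipe")

# Matches lines such as:
#   ... Found Vulkan Device 0 (llvmpipe (LLVM 12.0.0, 256 bits))
DEVICE_RE = re.compile(r"Found Vulkan Device (\d+) \((.*)\)\s*$")


def vulkan_devices(log_text):
    """Return (index, name) for every Vulkan device reported in the log text."""
    devices = []
    for line in log_text.splitlines():
        match = DEVICE_RE.search(line)
        if match:
            devices.append((int(match.group(1)), match.group(2)))
    return devices


def only_software_vulkan(log_text):
    """True if the log reports Vulkan devices and all of them are CPU rasterizers."""
    devices = vulkan_devices(log_text)
    return bool(devices) and all(
        any(tag in name.lower() for tag in SOFTWARE_RENDERERS)
        for _, name in devices
    )
```

Passing the log from this report to only_software_vulkan() flags it, since device 0 is the only one listed and its name contains "llvmpipe". A log that lists the NVIDIA GPU as a Vulkan device would not be flagged.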
Additional comments
If possible, please also consider attaching the output of command ti diagnose. This produces the detailed environment information and hopefully helps us diagnose faster.
root@2921bc077313:~/swarm_ws# ti diagnose
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
*******************************************
** Taichi Programming Language **
*******************************************
Docs: https://docs.taichi-lang.org/
GitHub: https://github.com/taichi-dev/taichi/
Forum: https://forum.taichi.graphics/
Taichi system diagnose:
python: 3.8.10 (default, Jun 22 2022, 20:18:18)
[GCC 9.4.0]
system: linux
executable: /usr/bin/python
platform: Linux-5.4.0-128-generic-x86_64-with-glibc2.29
architecture: 64bit ELF
uname: uname_result(system='Linux', node='2921bc077313', release='5.4.0-128-generic', version='#144-Ubuntu SMP Tue Sep 20 11:00:04 UTC 2022', machine='x86_64', processor='x86_64')
locale: en_US.UTF-8
PATH: /opt/tensorrt/bin:/usr/local/mpi/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/local/ucx/bin
PYTHONPATH: ['/usr/local/bin', '/usr/lib/python38.zip', '/usr/lib/python3.8', '/usr/lib/python3.8/lib-dynload', '/usr/local/lib/python3.8/dist-packages', '/usr/lib/python3/dist-packages']
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
import: <module 'taichi' from '/usr/local/lib/python3.8/dist-packages/taichi/__init__.py'>
cc: False
cpu: True
metal: False
opengl: False
cuda: True
vulkan: True
`glewinfo` not available: [Errno 2] No such file or directory: 'glewinfo'
Sat Oct 29 01:26:57 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05 Driver Version: 520.61.05 CUDA Version: 11.8 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:2B:00.0 On | N/A |
| 0% 31C P8 17W / 320W | 935MiB / 10240MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
[Taichi] Starting on arch=x64
[W 10/29/22 01:26:58.112 379] [opengl_api.cpp:initialize_opengl@162] Can not create OpenGL context
[W 10/29/22 01:26:58.115 379] [misc.py:adaptive_arch_select@755] Arch=[<Arch.opengl: 7>] is not supported, falling back to CPU
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
[Taichi] Starting on arch=x64
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
[Taichi] Starting on arch=cuda
[Taichi] version 1.2.0, llvm 10.0.0, commit f189fd79, linux, python 3.8.10
*******************************************
** Taichi Programming Language **
*******************************************
Docs: https://docs.taichi-lang.org/
GitHub: https://github.com/taichi-dev/taichi/
Forum: https://forum.taichi.graphics/
TAICHI EXAMPLES
──────────────────────────────────────────────────────────────────────────────
0: ad_gravity 23: keyboard 46: patterns
1: comet 24: laplace 47: pbf2d
2: cornell_box 25: laplace_equation 48: physarum
3: diff_sph 26: mandelbrot_zoom 49: print_offset
4: euler 27: marching_squares 50: rasterizer
5: explicit_activation 28: mass_spring_3d_ggui 51: regression
6: export_mesh 29: mass_spring_game 52: sdf_renderer
7: export_ply 30: mass_spring_game_ggui 53: simple_derivative
8: export_videos 31: mciso_advanced 54: simple_texture
9: fem128 32: mgpcg 55: simple_uv
10: fem128_ggui 33: mgpcg_advanced 56: stable_fluid
11: fem99 34: minimal 57: stable_fluid_ggui
12: fractal 35: minimization 58: stable_fluid_graph
13: fractal3d_ggui 36: mpm128 59: taichi_bitmasked
14: fullscreen 37: mpm128_ggui 60: taichi_dynamic
15: game_of_life 38: mpm3d 61: taichi_logo
16: gui_image_io 39: mpm3d_ggui 62: taichi_sparse
17: gui_widgets 40: mpm88 63: texture_graph
18: implicit_fem 41: mpm88_graph 64: tutorial
19: implicit_mass_spring 42: mpm99 65: two_stream_instability
20: initial_value_problem 43: mpm_lagrangian_forces 66: vortex_rings
21: jacobian 44: nbody 67: waterwave
22: karman_vortex_street 45: odop_solar
──────────────────────────────────────────────────────────────────────────────
42
Running example minimal ...
[Taichi] Starting on arch=x64
42.0
>>> Running time: 0.20s
Consider attaching this log when maintainers ask about system information.
>>> Running time: 4.72s