Autodesk / XLB

XLB: Accelerated Lattice Boltzmann (XLB) for Physics-based ML

[major-refactoring] Need help launching a run #66

Closed: wangguan1995 closed this issue 1 month ago

wangguan1995 commented 1 month ago
python          3.9
warp-lang           1.0.2
jax                 0.4.20
jaxlib              0.4.20
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2023 NVIDIA Corporation
Built on Fri_Nov__3_17:16:49_PDT_2023
Cuda compilation tools, release 12.3, V12.3.103
Build cuda_12.3.r12.3/compiler.33492891_0
python examples/cfd/windtunnel_3d.py 

==================================================

Simulation Configuration:
Grid size: 512 x 128 x 128
Backend: ComputeBackend.WARP
Velocity set: D3Q27
Precision policy: PrecisionPolicy.FP32FP32
Prescribed velocity: 0.02
Reynolds number: 50000.0
Max iterations: 100000

==================================================

Warp 1.0.2 initialized:
   CUDA Toolkit 11.5, Driver 11.4
   Devices:
     "cpu"      : "x86_64"
     "cuda:0"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:1"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:2"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:3"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:4"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:5"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:6"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
     "cuda:7"   : "Tesla V100-SXM2-32GB" (32 GiB, sm_70, mempool enabled)
   CUDA peer access:
     Supported fully (all-directional)
   Kernel cache:
     /root/.cache/warp/1.0.2
Module xlb.operator.boundary_masker.indices_boundary_masker load on device 'cuda:0' took 26.24 ms
Module xlb.operator.boundary_masker.mesh_boundary_masker load on device 'cuda:0' took 7.66 ms
Module xlb.operator.equilibrium.quadratic_equilibrium load on device 'cuda:0' took 245.91 ms
Warp NVRTC compilation error 6: NVRTC_ERROR_COMPILATION (/buildAgent/work/a9ae500d09a78409/warp/native/warp.cu:2514)
default_program(145): warning #550-D: variable "var_21" was set but never used

default_program(149): warning #550-D: variable "var_25" was set but never used

default_program(701): warning #550-D: variable "adj_14" was set but never used

default_program(714): warning #550-D: variable "adj_27" was set but never used

default_program(723): warning #550-D: variable "adj_36" was set but never used

default_program(735): warning #550-D: variable "adj_48" was set but never used

default_program(744): warning #550-D: variable "adj_57" was set but never used

default_program(756): warning #550-D: variable "adj_69" was set but never used

default_program(1195): warning #550-D: variable "adj_6" was set but never used

default_program(1207): warning #550-D: variable "adj_18" was set but never used

default_program(1208): warning #550-D: variable "adj_19" was set but never used

default_program(1489): warning #177-D: variable "adj_2" was declared but never referenced

default_program(1498): warning #550-D: variable "adj_11" was set but never used

default_program(2464): warning #550-D: variable "adj_7" was set but never used

default_program(2611): warning #550-D: variable "adj_7" was set but never used

default_program(3467): warning #550-D: variable "adj_9" was set but never used

default_program(3474): warning #550-D: variable "adj_16" was set but never used

default_program(3482): warning #550-D: variable "adj_24" was set but never used

default_program(3489): warning #550-D: variable "adj_31" was set but never used

default_program(3497): warning #550-D: variable "adj_39" was set but never used

default_program(3504): warning #550-D: variable "adj_46" was set but never used

default_program(3809): error: function "QuadraticEquilibrium___construct_warp__locals__functional" has already been defined

default_program(3998): error: function "adj_QuadraticEquilibrium___construct_warp__locals__functional" has already been defined

default_program(6322): warning #550-D: variable "adj_7" was set but never used

default_program(68): warning #177-D: function "adj_IncompressibleNavierStokesStepper___construct_warp__locals__BoundaryConditionIDStruct" was declared but never referenced

2 errors detected in the compilation of "default_program".
Module xlb.operator.stepper.nse_stepper load on device 'cuda:0' took 480.77 ms
Traceback (most recent call last):
  File "/workspace/XLB/examples/cfd/windtunnel_3d.py", line 248, in <module>
    simulation.run(num_steps, print_interval, post_process_interval=1000)
  File "/workspace/XLB/examples/cfd/windtunnel_3d.py", line 138, in run
    self.f_1 = self.stepper(self.f_0, self.f_1, self.bc_mask, self.missing_mask, i)
  File "/workspace/XLB/xlb/operator/operator.py", line 74, in __call__
    raise Exception(f"Error captured for backend with key {key} for operator {self.__class__.__name__}: {error}\n {traceback_str}")
Exception: Error captured for backend with key ('IncompressibleNavierStokesStepper', <ComputeBackend.WARP: 2>, '(self, f_0, f_1, bc_mask, missing_mask, timestep)') for operator IncompressibleNavierStokesStepper: CUDA kernel build failed with error code 6
 Traceback (most recent call last):
  File "/workspace/XLB/xlb/operator/operator.py", line 64, in __call__
    result = backend_method(self, *args, **kwargs)
  File "/workspace/XLB/xlb/operator/stepper/nse_stepper.py", line 404, in warp_implementation
    wp.launch(
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/warp/context.py", line 4234, in launch
    if not module.load(device):
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/warp/context.py", line 1691, in load
    raise (e)
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/warp/context.py", line 1669, in load
    warp.build.build_cuda(
  File "/root/miniconda3/envs/py3.9/lib/python3.9/site-packages/warp/build.py", line 30, in build_cuda
    raise Exception(f"CUDA kernel build failed with error code {err}")
Exception: CUDA kernel build failed with error code 6
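
Most of the NVRTC output above is warnings; the build fails only on the two `has already been defined` errors. Below is a minimal sketch, assuming the log has been saved as text, for pulling out just the error lines. The function name `extract_errors`, the regular expression, and the `SAMPLE` excerpt are illustrative helpers and are not part of Warp or XLB.

```python
import re

# Matches NVRTC diagnostics of the form seen in the log above, e.g.
#   default_program(3809): error: function "..." has already been defined
#   default_program(145): warning #550-D: variable "var_21" was set but never used
DIAGNOSTIC = re.compile(
    r'^(?P<unit>[^(\s]+)\((?P<line>\d+)\): '
    r'(?P<level>error|warning)\b[^:]*: (?P<message>.*)$'
)

# Short excerpt copied from the log in this report (hypothetical variable name).
SAMPLE = '''\
default_program(3504): warning #550-D: variable "adj_46" was set but never used

default_program(3809): error: function "QuadraticEquilibrium___construct_warp__locals__functional" has already been defined

default_program(3998): error: function "adj_QuadraticEquilibrium___construct_warp__locals__functional" has already been defined

default_program(6322): warning #550-D: variable "adj_7" was set but never used
'''


def extract_errors(log_text):
    """Return (line_number, message) for every error-level diagnostic."""
    errors = []
    for raw in log_text.splitlines():
        match = DIAGNOSTIC.match(raw.strip())
        if match and match.group("level") == "error":
            errors.append((int(match.group("line")), match.group("message")))
    return errors


if __name__ == "__main__":
    for line_number, message in extract_errors(SAMPLE):
        print(f"{line_number}: {message}")
```

Run on the full log, this should leave only the two duplicate-definition errors for `QuadraticEquilibrium` and its adjoint, which are the lines worth quoting when searching for related issues.
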
wangguan1995 commented 1 month ago

I've seen this one: https://github.com/Autodesk/XLB/issues/64

mehdiataei commented 1 month ago

Please install Warp-lang from source to mitigate this issue. The next release of Warp will have this problem fixed.
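
To check which Warp build an environment is actually using, a rough sketch like the following may help. It reads the installed `warp-lang` version through the standard library's `importlib.metadata` and compares it with the 1.0.2 release from this report. The helper names are hypothetical, the thread does not state which release contains the fix, and a source build may report the same version number, so treat the result as a hint only.

```python
import re
from importlib import metadata

# warp-lang version shown in the report above.
REPORTED_VERSION = "1.0.2"


def parse_release(version):
    """Turn '1.0.2' or '1.1.0.dev0' into a tuple of its leading integers."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if not match:
            break
        parts.append(int(match.group()))
        if match.end() != len(piece):
            # Stop at the first suffix such as '0-beta' or '2+local'.
            break
    return tuple(parts)


def is_newer_than_reported(installed, reported=REPORTED_VERSION):
    """True if `installed` has a higher release number than `reported`."""
    return parse_release(installed) > parse_release(reported)


def installed_warp_version():
    """Return the installed warp-lang version string, or None if absent."""
    try:
        return metadata.version("warp-lang")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_warp_version()
    if version is None:
        print("warp-lang is not installed in this environment")
    elif is_newer_than_reported(version):
        print(f"warp-lang {version} is newer than {REPORTED_VERSION}")
    else:
        print(f"warp-lang {version} is not newer than {REPORTED_VERSION}; "
              "the duplicate-definition error may still occur")
```
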