tensorflow / mlir

"Multi-Level Intermediate Representation" Compiler Infrastructure

Three unexpected test failures: mlir-cuda-runner #323

Open MingliSun opened 4 years ago

MingliSun commented 4 years ago
[1732/1733] Running the MLIR regression tests
FAIL: MLIR :: mlir-cuda-runner/all-reduce-op.mlir (362 of 364)
******************** TEST 'MLIR :: mlir-cuda-runner/all-reduce-op.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1';   mlir-cuda-runner /home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/all-reduce-op.mlir --shared-libs=/home/sun/llvm-project/build/lib/libcuda-runtime-wrappers.so,/home/sun/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/sun/llvm-project/build/bin/FileCheck /home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/all-reduce-op.mlir
--
Exit Code: 1

Command Output (stderr):
--
CUDA failed with 700 in StreamSync
/home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/all-reduce-op.mlir:3:19: error: CHECK-COUNT: expected string not found in input (1 out of 8)
// CHECK-COUNT-8: [{{(5356, ){12}5356}}]
                  ^
<stdin>:1:1: note: scanning from here
Unranked Memref rank = 3 descriptor@ = 0x7ffc5073d440 Memref base@ = 0x556b5c6878b0 rank = 3 offset = 0 sizes = [2, 4, 13] strides = [52, 13, 1] data = 
^
<stdin>:9:67: note: possible intended match here
 [1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23]]]
                                                                  ^

--

********************
FAIL: MLIR :: mlir-cuda-runner/all-reduce-region.mlir (363 of 364)
******************** TEST 'MLIR :: mlir-cuda-runner/all-reduce-region.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1';   mlir-cuda-runner /home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/all-reduce-region.mlir --shared-libs=/home/sun/llvm-project/build/lib/libcuda-runtime-wrappers.so,/home/sun/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/sun/llvm-project/build/bin/FileCheck /home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/all-reduce-region.mlir
--
Exit Code: 1

Command Output (stderr):
--
CUDA failed with 700 in StreamSync
/home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/all-reduce-region.mlir:3:11: error: CHECK: expected string not found in input
// CHECK: [{{(35, ){34}35}}]
          ^
<stdin>:1:1: note: scanning from here
Unranked Memref rank = 1 descriptor@ = 0x7ffcde435930 Memref base@ = 0x55f97fdfd980 rank = 1 offset = 0 sizes = [35] strides = [1] data = 
^
<stdin>:2:200: note: possible intended match here
[1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23, 1.23]
                                                                                                                                                                                                       ^

--

********************
FAIL: MLIR :: mlir-cuda-runner/gpu-to-cubin.mlir (364 of 364)
******************** TEST 'MLIR :: mlir-cuda-runner/gpu-to-cubin.mlir' FAILED ********************
Script:
--
: 'RUN: at line 1';   mlir-cuda-runner /home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/gpu-to-cubin.mlir --shared-libs=/home/sun/llvm-project/build/lib/libcuda-runtime-wrappers.so,/home/sun/llvm-project/build/lib/libmlir_runner_utils.so --entry-point-result=void | /home/sun/llvm-project/build/bin/FileCheck /home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/gpu-to-cubin.mlir
--
Exit Code: 1

Command Output (stderr):
--
CUDA failed with 700 in StreamSync
/home/sun/llvm-project/llvm/projects/mlir/test/mlir-cuda-runner/gpu-to-cubin.mlir:15:11: error: CHECK: expected string not found in input
// CHECK: [1, 1, 1, 1, 1]
          ^
<stdin>:1:1: note: scanning from here
Memref base@ = 0x5556e405f280 rank = 1 offset = 0 sizes = [5] strides = [1] data = 
^
<stdin>:2:1: note: possible intended match here
[1.23, 1.23, 1.23, 1.23, 1.23]
^

--

********************

Testing Time: 5.57s
********************
Failing Tests (3):
    MLIR :: mlir-cuda-runner/all-reduce-op.mlir
    MLIR :: mlir-cuda-runner/all-reduce-region.mlir
    MLIR :: mlir-cuda-runner/gpu-to-cubin.mlir

  Expected Passes    : 361
  Unexpected Failures: 3
FAILED: projects/mlir/test/CMakeFiles/check-mlir 
cd /home/sun/llvm-project/build/projects/mlir/test && /usr/bin/python /home/sun/llvm-project/build/./bin/llvm-lit -sv /home/sun/llvm-project/build/projects/mlir/test
ninja: build stopped: subcommand failed.
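
For readers unfamiliar with FileCheck: in a CHECK line, text inside `{{...}}` is a regular expression and everything else is matched literally. The sketch below is a simplified Python approximation of that rule (it ignores FileCheck's whitespace canonicalization and the CHECK-COUNT repetition), and `filecheck_to_regex` is a hypothetical helper name, not part of LLVM. It shows why a row of 1.23 values cannot satisfy the pattern from all-reduce-op.mlir.

```python
import re


def filecheck_to_regex(pattern):
    """Approximate a FileCheck pattern as a Python regex (simplified sketch).

    Text inside {{...}} is kept as a regex; all other text is matched literally.
    """
    # With one capture group, re.split alternates: literal, regex, literal, ...
    parts = re.split(r"\{\{(.*?)\}\}", pattern)
    pieces = [part if i % 2 else re.escape(part) for i, part in enumerate(parts)]
    return re.compile("".join(pieces))


# Pattern taken from the failing CHECK-COUNT-8 line in the log above.
check = filecheck_to_regex("[{{(5356, ){12}5356}}]")

# A row the test expects: thirteen copies of 5356.
expected_row = "[" + ", ".join(["5356"] * 13) + "]"
# A row as it actually appears in the log: thirteen copies of 1.23.
actual_row = " [" + ", ".join(["1.23"] * 13) + "]]]"

print(bool(check.search(expected_row)))
print(bool(check.search(actual_row)))
```

The same reasoning applies to the other two tests: the printed buffers contain only 1.23, so none of the expected result values appear in the output.
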
joker-eph commented 4 years ago

Can you provide some information on your environment? Cuda version, driver version, which GPU are you using, OS, etc.

Thanks

MingliSun commented 4 years ago

nvidia-smi shows:

Mon Dec 16 10:17:56 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 430.50       Driver Version: 430.50       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce 920M        Off  | 00000000:01:00.0 N/A |                  N/A |
| N/A   45C    P5    N/A /  N/A |    350MiB /  2004MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0                    Not Supported                                       |
+-----------------------------------------------------------------------------+

OS: Ubuntu 18.04

ftynse commented 4 years ago

CUDA failed with 700 in StreamSync

700 is an illegal memory access...
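
As a rough illustration of reading that log line, the sketch below pulls the numeric code and the failing call out of the message and looks the code up in a small table. It assumes the number is a `CUresult` value from the CUDA driver API; the table is only a hand-picked subset of that enum, and `decode_cuda_failure` is a hypothetical helper, not something provided by MLIR or CUDA.

```python
import re

# Hand-picked subset of CUresult values from the CUDA driver API (cuda.h).
CU_RESULT_NAMES = {
    0: "CUDA_SUCCESS",
    2: "CUDA_ERROR_OUT_OF_MEMORY",
    209: "CUDA_ERROR_NO_BINARY_FOR_GPU",
    218: "CUDA_ERROR_INVALID_PTX",
    700: "CUDA_ERROR_ILLEGAL_ADDRESS",
    719: "CUDA_ERROR_LAUNCH_FAILED",
}


def decode_cuda_failure(line):
    """Return (code, name, where) for a 'CUDA failed with N in X' line, else None."""
    match = re.search(r"CUDA failed with (\d+) in (\w+)", line)
    if match is None:
        return None
    code = int(match.group(1))
    # Codes outside the subset above are reported as unknown rather than guessed.
    return code, CU_RESULT_NAMES.get(code, "unknown (not in this subset)"), match.group(2)


print(decode_cuda_failure("CUDA failed with 700 in StreamSync"))
```

Because kernel launches are asynchronous, an illegal access inside a kernel is typically reported at the next synchronization point, which is consistent with the message naming StreamSync rather than the launch itself.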

xuxbuptisc commented 4 years ago

Same error here, +1.