NVIDIA / cutlass

CUDA Templates for Linear Algebra Subroutines

test failed on jetson tx2 #14

Closed wtiandong closed 6 years ago

wtiandong commented 6 years ago

Hi, I ran the CUTLASS unit tests on a Jetson TX2 (JetPack 3.2), but some of them failed. Here is the output:

nvidia@tegra-ubuntu:~/Documents/cutlass/build$ ./tools/test/unit/cutlass_unit_test
Note: Google Test filter = -mma
[==========] Running 684 tests from 33 test cases.
[----------] Global test environment set-up.
[----------] 1 test from HostTensor
[ RUN ] HostTensor.gemm
[ OK ] HostTensor.gemm (1 ms)
[----------] 1 test from HostTensor (1 ms total)

[----------] 2 tests from Layout
[ RUN ] Layout.igemm
[ OK ] Layout.igemm (0 ms)
[ RUN ] Layout.sgemm_accum
[ OK ] Layout.sgemm_accum (0 ms)
[----------] 2 tests from Layout (0 ms total)

[----------] 1 test from PredicateVector
[ RUN ] PredicateVector.Basic
[ OK ] PredicateVector.Basic (94 ms)
[----------] 1 test from PredicateVector (94 ms total)

[----------] 2 tests from TileIterator
[ RUN ] TileIterator.tile_128x8_contiguous
[ OK ] TileIterator.tile_128x8_contiguous (1 ms)
[ RUN ] TileIterator.tile_128x8_rake
[ OK ] TileIterator.tile_128x8_rake (1 ms)
[----------] 2 tests from TileIterator (3 ms total)

[----------] 8 tests from Dgemm_64x32x8
[ RUN ] Dgemm_64x32x8.dgemm_64x32x8_nt
[ OK ] Dgemm_64x32x8.dgemm_64x32x8_nt (497 ms)
[ RUN ] Dgemm_64x32x8.dgemm_256x128x64_nt
[ OK ] Dgemm_64x32x8.dgemm_256x128x64_nt (29 ms)
[ RUN ] Dgemm_64x32x8.dgemm_64x32x8_nn
[ OK ] Dgemm_64x32x8.dgemm_64x32x8_nn (5 ms)
[ RUN ] Dgemm_64x32x8.dgemm_256x128x64_nn
[ OK ] Dgemm_64x32x8.dgemm_256x128x64_nn (22 ms)
[ RUN ] Dgemm_64x32x8.dgemm_64x32x8_tn
[ OK ] Dgemm_64x32x8.dgemm_64x32x8_tn (4 ms)
[ RUN ] Dgemm_64x32x8.dgemm_256x128x64_tn
[ OK ] Dgemm_64x32x8.dgemm_256x128x64_tn (21 ms)
[ RUN ] Dgemm_64x32x8.dgemm_64x32x8_tt
[ OK ] Dgemm_64x32x8.dgemm_64x32x8_tt (3 ms)
[ RUN ] Dgemm_64x32x8.dgemm_256x128x64_tt
[ OK ] Dgemm_64x32x8.dgemm_256x128x64_tt (20 ms)
[----------] 8 tests from Dgemm_64x32x8 (601 ms total)

[----------] 8 tests from Dgemm_64x64x8
[ RUN ] Dgemm_64x64x8.dgemm_64x64x8_nt
[ OK ] Dgemm_64x64x8.dgemm_64x64x8_nt (3 ms)
[ RUN ] Dgemm_64x64x8.dgemm_256x128x64_nt
[ OK ] Dgemm_64x64x8.dgemm_256x128x64_nt (21 ms)
[ RUN ] Dgemm_64x64x8.dgemm_64x64x8_nn
[ OK ] Dgemm_64x64x8.dgemm_64x64x8_nn (4 ms)
[ RUN ] Dgemm_64x64x8.dgemm_256x128x64_nn
[ OK ] Dgemm_64x64x8.dgemm_256x128x64_nn (20 ms)
[ RUN ] Dgemm_64x64x8.dgemm_64x64x8_tn
[ OK ] Dgemm_64x64x8.dgemm_64x64x8_tn (4 ms)
[ RUN ] Dgemm_64x64x8.dgemm_256x128x64_tn
[ OK ] Dgemm_64x64x8.dgemm_256x128x64_tn (20 ms)
[ RUN ] Dgemm_64x64x8.dgemm_64x64x8_tt
[ OK ] Dgemm_64x64x8.dgemm_64x64x8_tt (3 ms)
[ RUN ] Dgemm_64x64x8.dgemm_256x128x64_tt
[ OK ] Dgemm_64x64x8.dgemm_256x128x64_tt (20 ms)
[----------] 8 tests from Dgemm_64x64x8 (95 ms total)

[----------] 8 tests from Dgemm_128x32x8
[ RUN ] Dgemm_128x32x8.dgemm_128x32x8_nt
[ OK ] Dgemm_128x32x8.dgemm_128x32x8_nt (4 ms)
[ RUN ] Dgemm_128x32x8.dgemm_256x64x64_nt
[ OK ] Dgemm_128x32x8.dgemm_256x64x64_nt (12 ms)
[ RUN ] Dgemm_128x32x8.dgemm_128x32x8_nn
[ OK ] Dgemm_128x32x8.dgemm_128x32x8_nn (4 ms)
[ RUN ] Dgemm_128x32x8.dgemm_256x64x64_nn
[ OK ] Dgemm_128x32x8.dgemm_256x64x64_nn (12 ms)
[ RUN ] Dgemm_128x32x8.dgemm_128x32x8_tn
[ OK ] Dgemm_128x32x8.dgemm_128x32x8_tn (3 ms)
[ RUN ] Dgemm_128x32x8.dgemm_256x64x64_tn
[ OK ] Dgemm_128x32x8.dgemm_256x64x64_tn (12 ms)
[ RUN ] Dgemm_128x32x8.dgemm_128x32x8_tt
[ OK ] Dgemm_128x32x8.dgemm_128x32x8_tt (3 ms)
[ RUN ] Dgemm_128x32x8.dgemm_256x64x64_tt
[ OK ] Dgemm_128x32x8.dgemm_256x64x64_tt (11 ms)
[----------] 8 tests from Dgemm_128x32x8 (62 ms total)

[----------] 8 tests from Dgemm_128x128x8
[ RUN ] Dgemm_128x128x8.dgemm_128x128x8_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_nt (162 ms)
[ RUN ] Dgemm_128x128x8.dgemm_512x256x64_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_nt (1094 ms)
[ RUN ] Dgemm_128x128x8.dgemm_128x128x8_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_nn (118 ms)
[ RUN ] Dgemm_128x128x8.dgemm_512x256x64_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_nn (1062 ms)
[ RUN ] Dgemm_128x128x8.dgemm_128x128x8_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_tn (115 ms)
[ RUN ] Dgemm_128x128x8.dgemm_512x256x64_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_tn (993 ms)
[ RUN ] Dgemm_128x128x8.dgemm_128x128x8_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_tt (109 ms)
[ RUN ] Dgemm_128x128x8.dgemm_512x256x64_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_tt (956 ms)
[----------] 8 tests from Dgemm_128x128x8 (4610 ms total)

[----------] 8 tests from Dgemm_64x32x16
[ RUN ] Dgemm_64x32x16.dgemm_64x32x16_nt
[ OK ] Dgemm_64x32x16.dgemm_64x32x16_nt (3 ms)
[ RUN ] Dgemm_64x32x16.dgemm_256x128x64_nt
[ OK ] Dgemm_64x32x16.dgemm_256x128x64_nt (20 ms)
[ RUN ] Dgemm_64x32x16.dgemm_64x32x16_nn
[ OK ] Dgemm_64x32x16.dgemm_64x32x16_nn (3 ms)
[ RUN ] Dgemm_64x32x16.dgemm_256x128x64_nn
[ OK ] Dgemm_64x32x16.dgemm_256x128x64_nn (20 ms)
[ RUN ] Dgemm_64x32x16.dgemm_64x32x16_tn
[ OK ] Dgemm_64x32x16.dgemm_64x32x16_tn (3 ms)
[ RUN ] Dgemm_64x32x16.dgemm_256x128x64_tn
[ OK ] Dgemm_64x32x16.dgemm_256x128x64_tn (20 ms)
[ RUN ] Dgemm_64x32x16.dgemm_64x32x16_tt
[ OK ] Dgemm_64x32x16.dgemm_64x32x16_tt (4 ms)
[ RUN ] Dgemm_64x32x16.dgemm_256x128x64_tt
[ OK ] Dgemm_64x32x16.dgemm_256x128x64_tt (20 ms)
[----------] 8 tests from Dgemm_64x32x16 (94 ms total)

[----------] 8 tests from Dgemm_64x64x16
[ RUN ] Dgemm_64x64x16.dgemm_64x64x16_nt
[ OK ] Dgemm_64x64x16.dgemm_64x64x16_nt (3 ms)
[ RUN ] Dgemm_64x64x16.dgemm_256x128x64_nt
[ OK ] Dgemm_64x64x16.dgemm_256x128x64_nt (20 ms)
[ RUN ] Dgemm_64x64x16.dgemm_64x64x16_nn
[ OK ] Dgemm_64x64x16.dgemm_64x64x16_nn (3 ms)
[ RUN ] Dgemm_64x64x16.dgemm_256x128x64_nn
[ OK ] Dgemm_64x64x16.dgemm_256x128x64_nn (20 ms)
[ RUN ] Dgemm_64x64x16.dgemm_64x64x16_tn
[ OK ] Dgemm_64x64x16.dgemm_64x64x16_tn (4 ms)
[ RUN ] Dgemm_64x64x16.dgemm_256x128x64_tn
[ OK ] Dgemm_64x64x16.dgemm_256x128x64_tn (19 ms)
[ RUN ] Dgemm_64x64x16.dgemm_64x64x16_tt
[ OK ] Dgemm_64x64x16.dgemm_64x64x16_tt (4 ms)
[ RUN ] Dgemm_64x64x16.dgemm_256x128x64_tt
[ OK ] Dgemm_64x64x16.dgemm_256x128x64_tt (20 ms)
[----------] 8 tests from Dgemm_64x64x16 (94 ms total)

[----------] 8 tests from Dgemm_128x32x16
[ RUN ] Dgemm_128x32x16.dgemm_128x32x8_nt
[ OK ] Dgemm_128x32x16.dgemm_128x32x8_nt (4 ms)
[ RUN ] Dgemm_128x32x16.dgemm_256x64x64_nt
[ OK ] Dgemm_128x32x16.dgemm_256x64x64_nt (12 ms)
[ RUN ] Dgemm_128x32x16.dgemm_128x32x16_nn
[ OK ] Dgemm_128x32x16.dgemm_128x32x16_nn (4 ms)
[ RUN ] Dgemm_128x32x16.dgemm_256x64x64_nn
[ OK ] Dgemm_128x32x16.dgemm_256x64x64_nn (12 ms)
[ RUN ] Dgemm_128x32x16.dgemm_128x32x8_tn
[ OK ] Dgemm_128x32x16.dgemm_128x32x8_tn (4 ms)
[ RUN ] Dgemm_128x32x16.dgemm_256x64x64_tn
[ OK ] Dgemm_128x32x16.dgemm_256x64x64_tn (11 ms)
[ RUN ] Dgemm_128x32x16.dgemm_128x32x8_tt
[ OK ] Dgemm_128x32x16.dgemm_128x32x8_tt (4 ms)
[ RUN ] Dgemm_128x32x16.dgemm_256x64x64_tt
[ OK ] Dgemm_128x32x16.dgemm_256x64x64_tt (12 ms)
[----------] 8 tests from Dgemm_128x32x16 (63 ms total)

[----------] 37 tests from Hgemm_128x128x8
[ RUN ] Hgemm_128x128x8.hgemm_128x128x1_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x1_nt (12 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x8_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x8_nt (16 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x9_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x9_nt (16 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_nt (24 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x64_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x64_nt (65 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x128x16_nt
[ OK ] Hgemm_128x128x8.hgemm_256x128x16_nt (42 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x256x16_nt
[ OK ] Hgemm_128x128x8.hgemm_128x256x16_nt (40 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x256x16_nt
[ OK ] Hgemm_128x128x8.hgemm_256x256x16_nt (74 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x2_nn
[ OK ] Hgemm_128x128x8.hgemm_128x128x2_nn (8 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x8_nn
[ OK ] Hgemm_128x128x8.hgemm_128x128x8_nn (13 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x10_nn
[ OK ] Hgemm_128x128x8.hgemm_128x128x10_nn (14 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_nn
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_nn (19 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x64_nn
[ OK ] Hgemm_128x128x8.hgemm_128x128x64_nn (50 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x128x16_nn
[ OK ] Hgemm_128x128x8.hgemm_256x128x16_nn (32 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x256x16_nn
[ OK ] Hgemm_128x128x8.hgemm_128x256x16_nn (32 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x256x16_nn
[ OK ] Hgemm_128x128x8.hgemm_256x256x16_nn (61 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x8_tn
[ OK ] Hgemm_128x128x8.hgemm_128x128x8_tn (12 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x10_tn
[ OK ] Hgemm_128x128x8.hgemm_128x128x10_tn (14 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_tn
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_tn (17 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x64_tn
[ OK ] Hgemm_128x128x8.hgemm_128x128x64_tn (48 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x128x16_tn
[ OK ] Hgemm_128x128x8.hgemm_256x128x16_tn (32 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x256x16_tn
[ OK ] Hgemm_128x128x8.hgemm_128x256x16_tn (31 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x256x16_tn
[ OK ] Hgemm_128x128x8.hgemm_256x256x16_tn (61 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x8_tt
[ OK ] Hgemm_128x128x8.hgemm_128x128x8_tt (12 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x10_tt
[ OK ] Hgemm_128x128x8.hgemm_128x128x10_tt (12 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_tt
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_tt (16 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x64_tt
[ OK ] Hgemm_128x128x8.hgemm_128x128x64_tt (47 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x128x16_tt
[ OK ] Hgemm_128x128x8.hgemm_256x128x16_tt (31 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x256x16_tt
[ OK ] Hgemm_128x128x8.hgemm_128x256x16_tt (30 ms)
[ RUN ] Hgemm_128x128x8.hgemm_256x256x16_tt
[ OK ] Hgemm_128x128x8.hgemm_256x256x16_tt (57 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_alpha2_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_alpha2_nt (15 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_beta1_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_beta1_nt (15 ms)
[ RUN ] Hgemm_128x128x8.hgemm_128x128x16_alpha2_beta1_nt
[ OK ] Hgemm_128x128x8.hgemm_128x128x16_alpha2_beta1_nt (15 ms)
[ RUN ] Hgemm_128x128x8.hgemm_120x112x64_ldg8_nt
[ OK ] Hgemm_128x128x8.hgemm_120x112x64_ldg8_nt (38 ms)
[ RUN ] Hgemm_128x128x8.hgemm_508x252x120_ragged_nt
[ OK ] Hgemm_128x128x8.hgemm_508x252x120_ragged_nt (565 ms)
[ RUN ] Hgemm_128x128x8.hgemm_124x126x32_ragged_nt
[ OK ] Hgemm_128x128x8.hgemm_124x126x32_ragged_nt (23 ms)
[ RUN ] Hgemm_128x128x8.hgemm_124x126x32_ragged_alpha2_beta1_nt
[ OK ] Hgemm_128x128x8.hgemm_124x126x32_ragged_alpha2_beta1_nt (24 ms)
[----------] 37 tests from Hgemm_128x128x8 (1637 ms total)

[----------] 33 tests from Hgemm_128x128x16
[ RUN ] Hgemm_128x128x16.hgemm_2x2x2_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_2x2x2_nt (2 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x8_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x8_nt (110 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_nt (114 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x17_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x17_nt (115 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x64_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_nt (151 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x128x16_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_nt (222 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x256x16_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_nt (225 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x256x16_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_nt (446 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_nn (114 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x18_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x18_nn (116 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x64_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_nn (151 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x128x16_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_nn (227 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x256x16_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_nn (224 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x256x16_nn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_nn (446 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_tn (114 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x18_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x18_tn (116 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x64_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_tn (150 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x128x16_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_tn (224 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x256x16_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_tn (224 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x256x16_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_tn (448 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_tt (112 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x18_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x18_tt (113 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x64_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_tt (149 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x128x16_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_tt (221 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x256x16_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_tt (222 ms)
[ RUN ] Hgemm_128x128x16.hgemm_256x256x16_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_tt (436 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_alpha2_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_alpha2_nt (116 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_beta1_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_beta1_nt (115 ms)
[ RUN ] Hgemm_128x128x16.hgemm_128x128x16_alpha2_beta1_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_alpha2_beta1_nt (123 ms)
[ RUN ] Hgemm_128x128x16.hgemm_120x112x64_ldg8_nt
[ OK ] Hgemm_128x128x16.hgemm_120x112x64_ldg8_nt (36 ms)
[ RUN ] Hgemm_128x128x16.hgemm_508x252x120_ragged_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_508x252x120_ragged_nt (1362 ms)
[ RUN ] Hgemm_128x128x16.hgemm_124x126x32_ragged_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_124x126x32_ragged_nt (119 ms)
[ RUN ] Hgemm_128x128x16.hgemm_124x126x32_ragged_alpha2_beta1_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Hgemm_128x128x16.hgemm_124x126x32_ragged_alpha2_beta1_nt (122 ms)
[----------] 33 tests from Hgemm_128x128x16 (7187 ms total)

[----------] 30 tests from Hgemm_128x32x8
[ RUN ] Hgemm_128x32x8.hgemm_128x32x1_nt
[ OK ] Hgemm_128x32x8.hgemm_128x32x1_nt (3 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x8_nt
[ OK ] Hgemm_128x32x8.hgemm_128x32x8_nt (4 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x9_nt
[ OK ] Hgemm_128x32x8.hgemm_128x32x9_nt (5 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x16_nt
[ OK ] Hgemm_128x32x8.hgemm_128x32x16_nt (6 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x32_nt
[ OK ] Hgemm_128x32x8.hgemm_128x32x32_nt (8 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x32x16_nt
[ OK ] Hgemm_128x32x8.hgemm_256x32x16_nt (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x64x16_nt
[ OK ] Hgemm_128x32x8.hgemm_128x64x16_nt (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x64x16_nt
[ OK ] Hgemm_128x32x8.hgemm_256x64x16_nt (16 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x2_nn
[ OK ] Hgemm_128x32x8.hgemm_128x32x2_nn (3 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x8_nn
[ OK ] Hgemm_128x32x8.hgemm_128x32x8_nn (5 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x10_nn
[ OK ] Hgemm_128x32x8.hgemm_128x32x10_nn (5 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x16_nn
[ OK ] Hgemm_128x32x8.hgemm_128x32x16_nn (6 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x32_nn
[ OK ] Hgemm_128x32x8.hgemm_128x32x32_nn (8 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x32x16_nn
[ OK ] Hgemm_128x32x8.hgemm_256x32x16_nn (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x64x16_nn
[ OK ] Hgemm_128x32x8.hgemm_128x64x16_nn (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x64x16_nn
[ OK ] Hgemm_128x32x8.hgemm_256x64x16_nn (16 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x8_tn
[ OK ] Hgemm_128x32x8.hgemm_128x32x8_tn (4 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x10_tn
[ OK ] Hgemm_128x32x8.hgemm_128x32x10_tn (5 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x16_tn
[ OK ] Hgemm_128x32x8.hgemm_128x32x16_tn (6 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x32_tn
[ OK ] Hgemm_128x32x8.hgemm_128x32x32_tn (8 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x32x16_tn
[ OK ] Hgemm_128x32x8.hgemm_256x32x16_tn (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x64x16_tn
[ OK ] Hgemm_128x32x8.hgemm_128x64x16_tn (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x64x16_tn
[ OK ] Hgemm_128x32x8.hgemm_256x64x16_tn (16 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x8_tt
[ OK ] Hgemm_128x32x8.hgemm_128x32x8_tt (5 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x10_tt
[ OK ] Hgemm_128x32x8.hgemm_128x32x10_tt (5 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x16_tt
[ OK ] Hgemm_128x32x8.hgemm_128x32x16_tt (6 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x32x32_tt
[ OK ] Hgemm_128x32x8.hgemm_128x32x32_tt (8 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x32x16_tt
[ OK ] Hgemm_128x32x8.hgemm_256x32x16_tt (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_128x64x16_tt
[ OK ] Hgemm_128x32x8.hgemm_128x64x16_tt (9 ms)
[ RUN ] Hgemm_128x32x8.hgemm_256x64x16_tt
[ OK ] Hgemm_128x32x8.hgemm_256x64x16_tt (16 ms)
[----------] 30 tests from Hgemm_128x32x8 (236 ms total)

[----------] 30 tests from Hgemm_128x64x8
[ RUN ] Hgemm_128x64x8.hgemm_128x64x1_nt
[ OK ] Hgemm_128x64x8.hgemm_128x64x1_nt (4 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x8_nt
[ OK ] Hgemm_128x64x8.hgemm_128x64x8_nt (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x9_nt
[ OK ] Hgemm_128x64x8.hgemm_128x64x9_nt (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x16_nt
[ OK ] Hgemm_128x64x8.hgemm_128x64x16_nt (9 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x64_nt
[ OK ] Hgemm_128x64x8.hgemm_128x64x64_nt (23 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x64x16_nt
[ OK ] Hgemm_128x64x8.hgemm_256x64x16_nt (16 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x128x16_nt
[ OK ] Hgemm_128x64x8.hgemm_128x128x16_nt (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x128x16_nt
[ OK ] Hgemm_128x64x8.hgemm_256x128x16_nt (29 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x2_nn
[ OK ] Hgemm_128x64x8.hgemm_128x64x2_nn (5 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x8_nn
[ OK ] Hgemm_128x64x8.hgemm_128x64x8_nn (6 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x10_nn
[ OK ] Hgemm_128x64x8.hgemm_128x64x10_nn (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x16_nn
[ OK ] Hgemm_128x64x8.hgemm_128x64x16_nn (8 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x64_nn
[ OK ] Hgemm_128x64x8.hgemm_128x64x64_nn (23 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x64x16_nn
[ OK ] Hgemm_128x64x8.hgemm_256x64x16_nn (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x128x16_nn
[ OK ] Hgemm_128x64x8.hgemm_128x128x16_nn (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x128x16_nn
[ OK ] Hgemm_128x64x8.hgemm_256x128x16_nn (28 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x8_tn
[ OK ] Hgemm_128x64x8.hgemm_128x64x8_tn (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x10_tn
[ OK ] Hgemm_128x64x8.hgemm_128x64x10_tn (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x16_tn
[ OK ] Hgemm_128x64x8.hgemm_128x64x16_tn (9 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x64_tn
[ OK ] Hgemm_128x64x8.hgemm_128x64x64_tn (21 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x64x16_tn
[ OK ] Hgemm_128x64x8.hgemm_256x64x16_tn (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x128x16_tn
[ OK ] Hgemm_128x64x8.hgemm_128x128x16_tn (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x128x16_tn
[ OK ] Hgemm_128x64x8.hgemm_256x128x16_tn (28 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x8_tt
[ OK ] Hgemm_128x64x8.hgemm_128x64x8_tt (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x10_tt
[ OK ] Hgemm_128x64x8.hgemm_128x64x10_tt (7 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x16_tt
[ OK ] Hgemm_128x64x8.hgemm_128x64x16_tt (9 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x64x64_tt
[ OK ] Hgemm_128x64x8.hgemm_128x64x64_tt (21 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x64x16_tt
[ OK ] Hgemm_128x64x8.hgemm_256x64x16_tt (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_128x128x16_tt
[ OK ] Hgemm_128x64x8.hgemm_128x128x16_tt (15 ms)
[ RUN ] Hgemm_128x64x8.hgemm_256x128x16_tt
[ OK ] Hgemm_128x64x8.hgemm_256x128x16_tt (28 ms)
[----------] 30 tests from Hgemm_128x64x8 (422 ms total)

[----------] 32 tests from Igemm_128x128x32
[ RUN ] Igemm_128x128x32.igemm_128x128x4_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x4_nt (7 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x32_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x32_nt (10 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x36_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x36_nt (7 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x64_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x64_nt (9 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x256_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x256_nt (21 ms)
[ RUN ] Igemm_128x128x32.igemm_256x128x64_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x128x64_nt (16 ms)
[ RUN ] Igemm_128x128x32.igemm_128x256x64_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x256x64_nt (14 ms)
[ RUN ] Igemm_128x128x32.igemm_256x256x64_nt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x256x64_nt (30 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x4_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x4_nn (3 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x32_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x32_nn (6 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x36_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x36_nn (7 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x64_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x64_nn (8 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x256_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x256_nn (21 ms)
[ RUN ] Igemm_128x128x32.igemm_256x128x64_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x128x64_nn (14 ms)
[ RUN ] Igemm_128x128x32.igemm_128x256x64_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x256x64_nn (14 ms)
[ RUN ] Igemm_128x128x32.igemm_256x256x64_nn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x256x64_nn (29 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x4_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x4_tn (3 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x32_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x32_tn (7 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x36_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x36_tn (6 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x64_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x64_tn (8 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x256_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x256_tn (21 ms)
[ RUN ] Igemm_128x128x32.igemm_256x128x64_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x128x64_tn (15 ms)
[ RUN ] Igemm_128x128x32.igemm_128x256x64_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x256x64_tn (13 ms)
[ RUN ] Igemm_128x128x32.igemm_256x256x64_tn
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x256x64_tn (28 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x4_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x4_tt (3 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x32_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x32_tt (5 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x36_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x36_tt (6 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x64_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x64_tt (8 ms)
[ RUN ] Igemm_128x128x32.igemm_128x128x256_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x128x256_tt (21 ms)
[ RUN ] Igemm_128x128x32.igemm_256x128x64_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x128x64_tt (14 ms)
[ RUN ] Igemm_128x128x32.igemm_128x256x64_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_128x256x64_tt (14 ms)
[ RUN ] Igemm_128x128x32.igemm_256x256x64_tt
unknown file: Failure
C++ exception with description "compute_cublas() failed" thrown in the test body.
[ FAILED ] Igemm_128x128x32.igemm_256x256x64_tt (32 ms)
[----------] 32 tests from Igemm_128x128x32 (423 ms total)

[----------] 32 tests from Igemm_128x64x32 [ RUN ] Igemm_128x64x32.Igemm_128x64x4_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.Igemm_128x64x4_nt (2 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x32_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x32_nt (4 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x36_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x36_nt (4 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x64_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x64_nt (5 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x256_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x256_nt (12 ms) [ RUN ] Igemm_128x64x32.igemm_256x64x64_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x64x64_nt (9 ms) [ RUN ] Igemm_128x64x32.igemm_128x128x64_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x128x64_nt (8 ms) [ RUN ] Igemm_128x64x32.igemm_256x128x64_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x128x64_nt (14 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x4_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x4_nn (2 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x32_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_128x64x32.igemm_128x64x32_nn (4 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x36_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x36_nn (3 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x64_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x64_nn (6 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x256_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x256_nn (12 ms) [ RUN ] Igemm_128x64x32.igemm_256x64x64_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x64x64_nn (8 ms) [ RUN ] Igemm_128x64x32.igemm_128x128x64_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x128x64_nn (8 ms) [ RUN ] Igemm_128x64x32.igemm_256x128x64_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x128x64_nn (14 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x4_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x4_tn (2 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x32_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x32_tn (4 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x36_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_128x64x32.igemm_128x64x36_tn (4 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x64_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x64_tn (5 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x256_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x256_tn (12 ms) [ RUN ] Igemm_128x64x32.igemm_256x64x64_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x64x64_tn (8 ms) [ RUN ] Igemm_128x64x32.igemm_128x128x64_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x128x64_tn (8 ms) [ RUN ] Igemm_128x64x32.igemm_256x128x64_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x128x64_tn (14 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x4_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x4_tt (3 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x32_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x32_tt (3 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x36_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x36_tt (4 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x64_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_128x64x32.igemm_128x64x64_tt (5 ms) [ RUN ] Igemm_128x64x32.igemm_128x64x256_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x64x256_tt (12 ms) [ RUN ] Igemm_128x64x32.igemm_256x64x64_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x64x64_tt (9 ms) [ RUN ] Igemm_128x64x32.igemm_128x128x64_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_128x128x64_tt (8 ms) [ RUN ] Igemm_128x64x32.igemm_256x128x64_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x64x32.igemm_256x128x64_tt (14 ms) [----------] 32 tests from Igemm_128x64x32 (231 ms total)

[----------] 32 tests from Igemm_128x32x32 [ RUN ] Igemm_128x32x32.igemm_128x32x32x4_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x32x4_nt (2 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x32_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x32_nt (2 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x36_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x36_nt (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x64_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x64_nt (4 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x256_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x256_nt (7 ms) [ RUN ] Igemm_128x32x32.igemm_256x32x64_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x32x64_nt (5 ms) [ RUN ] Igemm_128x32x32.igemm_128x128x32_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x128x32_nt (6 ms) [ RUN ] Igemm_128x32x32.igemm_256x128x32_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x128x32_nt (10 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x4_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x4_nn (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x32_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_128x32x32.igemm_128x32x32_nn (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x36_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x36_nn (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x64_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x64_nn (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x256_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x256_nn (8 ms) [ RUN ] Igemm_128x32x32.igemm_256x32x64_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x32x64_nn (5 ms) [ RUN ] Igemm_128x32x32.igemm_128x128x32_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x128x32_nn (6 ms) [ RUN ] Igemm_128x32x32.igemm_256x128x32_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x128x32_nn (10 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x4_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x4_tn (2 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x32_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x32_tn (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x36_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_128x32x32.igemm_128x32x36_tn (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x64_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x64_tn (4 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x256_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x256_tn (7 ms) [ RUN ] Igemm_128x32x32.igemm_256x32x64_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x32x64_tn (5 ms) [ RUN ] Igemm_128x32x32.igemm_128x128x32_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x128x32_tn (6 ms) [ RUN ] Igemm_128x32x32.igemm_256x128x32_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x128x32_tn (10 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x4_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x4_tt (2 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x32_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x32_tt (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x36_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x36_tt (3 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x64_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_128x32x32.igemm_128x32x64_tt (4 ms) [ RUN ] Igemm_128x32x32.igemm_128x32x256_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x32x256_tt (7 ms) [ RUN ] Igemm_128x32x32.igemm_256x32x64_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x32x64_tt (5 ms) [ RUN ] Igemm_128x32x32.igemm_128x128x32_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_128x128x32_tt (6 ms) [ RUN ] Igemm_128x32x32.igemm_256x128x32_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_128x32x32.igemm_256x128x32_tt (10 ms) [----------] 32 tests from Igemm_128x32x32 (160 ms total)

[----------] 32 tests from Igemm_128x128x32_float [ RUN ] Igemm_128x128x32_float.igemm_128x128x4_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_nt (119 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x32_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_nt (115 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x36_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_nt (113 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_nt (116 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x256_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_nt (137 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x128x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_nt (228 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x256x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_nt (225 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x256x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ 
FAILED ] Igemm_128x128x32_float.igemm_256x256x64_nt (450 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x4_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_nn (105 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x32_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_nn (110 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x36_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_nn (111 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_nn (114 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x256_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_nn (138 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x128x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_nn (226 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x256x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_nn (224 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x256x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false 
Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_nn (446 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x4_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_tn (104 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x32_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_tn (109 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x36_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_tn (110 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_tn (113 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x256_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_tn (135 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x128x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_tn (224 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x256x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_tn (222 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x256x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() 
Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_tn (446 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x4_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_tt (104 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x32_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_tt (109 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x36_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_tt (110 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_tt (113 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x128x256_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_tt (135 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x128x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_tt (226 ms) [ RUN ] Igemm_128x128x32_float.igemm_128x256x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_tt (222 ms) [ RUN ] Igemm_128x128x32_float.igemm_256x256x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: 
testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_tt (444 ms) [----------] 32 tests from Igemm_128x128x32_float (5904 ms total)

[----------] 32 tests from Igemm_128x128x32_int8 [ RUN ] Igemm_128x128x32_int8.igemm_128x128x4_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_nt (56 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x32_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_nt (58 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x36_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_nt (54 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_nt (58 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x256_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_nt (78 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x128x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_nt (110 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x256x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_nt (110 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x256x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] 
Igemm_128x128x32_int8.igemm_256x256x64_nt (214 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x4_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_nn (50 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x32_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_nn (54 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x36_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_nn (54 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_nn (57 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x256_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_nn (79 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x128x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_nn (110 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x256x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_nn (110 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x256x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] 
Igemm_128x128x32_int8.igemm_256x256x64_nn (214 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x4_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_tn (50 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x32_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_tn (55 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x36_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_tn (53 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_tn (56 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x256_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_tn (77 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x128x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_tn (108 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x256x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_tn (108 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x256x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] 
Igemm_128x128x32_int8.igemm_256x256x64_tn (209 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x4_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_tt (51 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x32_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_tt (56 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x36_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_tt (53 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_tt (55 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x128x256_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_tt (76 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x128x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_tt (108 ms) [ RUN ] Igemm_128x128x32_int8.igemm_128x256x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_tt (107 ms) [ RUN ] Igemm_128x128x32_int8.igemm_256x256x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:97: Failure Value of: testbed.verify_with_host() Actual: false Expected: true [ FAILED ] 
Igemm_128x128x32_int8.igemm_256x256x64_tt (209 ms) [----------] 32 tests from Igemm_128x128x32_int8 (2898 ms total)

[----------] 16 tests from Igemm_32x32x128 [ RUN ] Igemm_32x32x128.igemm_32x32x4_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x4_nt (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x8_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x8_nt (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x32_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x32_nt (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x128_nt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x128_nt (3 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x4_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x4_nn (1 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x8_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x8_nn (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x32_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x32_nn (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x128_nn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x128_nn (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x4_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x4_tn (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x8_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. 
[ FAILED ] Igemm_32x32x128.igemm_32x32x8_tn (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x15_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x15_tn (1 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x32_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x32_tn (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x128_tn unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x128_tn (3 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x8_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x8_tt (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x32_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x32_tt (2 ms) [ RUN ] Igemm_32x32x128.igemm_32x32x128_tt unknown file: Failure C++ exception with description "compute_cublas() failed" thrown in the test body. [ FAILED ] Igemm_32x32x128.igemm_32x32x128_tt (4 ms) [----------] 16 tests from Igemm_32x32x128 (34 ms total)

[----------] 36 tests from Sgemm_128x128x8 [ RUN ] Sgemm_128x128x8.sgemm_128x81x1_nt [ OK ] Sgemm_128x128x8.sgemm_128x81x1_nt (4 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x8_nt [ OK ] Sgemm_128x128x8.sgemm_128x112x8_nt (7 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x9_nt [ OK ] Sgemm_128x128x8.sgemm_128x112x9_nt (7 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x73x16_nt [ OK ] Sgemm_128x128x8.sgemm_128x73x16_nt (6 ms) [ RUN ] Sgemm_128x128x8.sgemm_97x112x64_nt [ OK ] Sgemm_128x128x8.sgemm_97x112x64_nt (12 ms) [ RUN ] Sgemm_128x128x8.sgemm_256x112x16_nt [ OK ] Sgemm_128x128x8.sgemm_256x112x16_nt (13 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x240x16_nt [ OK ] Sgemm_128x128x8.sgemm_128x240x16_nt (11 ms) [ RUN ] Sgemm_128x128x8.sgemm_256x240x16_nt [ OK ] Sgemm_128x128x8.sgemm_256x240x16_nt (24 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x1_nn [ OK ] Sgemm_128x128x8.sgemm_128x112x1_nn (4 ms) [ RUN ] Sgemm_128x128x8.sgemm_79x112x8_nn [ OK ] Sgemm_128x128x8.sgemm_79x112x8_nn (4 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x81x9_nn [ OK ] Sgemm_128x128x8.sgemm_128x81x9_nn (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x16_nn [ OK ] Sgemm_128x128x8.sgemm_128x112x16_nn (6 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x73x64_nn [ OK ] Sgemm_128x128x8.sgemm_128x73x64_nn (9 ms) [ RUN ] Sgemm_128x128x8.sgemm_256x112x16_nn [ OK ] Sgemm_128x128x8.sgemm_256x112x16_nn (12 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x256x16_nn [ OK ] Sgemm_128x128x8.sgemm_128x256x16_nn (11 ms) [ RUN ] Sgemm_128x128x8.sgemm_256x256x16_nn [ OK ] Sgemm_128x128x8.sgemm_256x256x16_nn (24 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x128x1_tn [ OK ] Sgemm_128x128x8.sgemm_128x128x1_tn (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_127x112x8_tn [ OK ] Sgemm_128x128x8.sgemm_127x112x8_tn (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_21x112x9_tn [ OK ] Sgemm_128x128x8.sgemm_21x112x9_tn (3 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x73x16_tn [ OK ] Sgemm_128x128x8.sgemm_128x73x16_tn (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x81x64_tn [ OK ] Sgemm_128x128x8.sgemm_128x81x64_tn (9 ms) [ RUN ] 
Sgemm_128x128x8.sgemm_256x112x16_tn [ OK ] Sgemm_128x128x8.sgemm_256x112x16_tn (12 ms) [ RUN ] Sgemm_128x128x8.sgemm_47x256x16_tn [ OK ] Sgemm_128x128x8.sgemm_47x256x16_tn (6 ms) [ RUN ] Sgemm_128x128x8.sgemm_211x256x16_tn [ OK ] Sgemm_128x128x8.sgemm_211x256x16_tn (16 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x128x1_tt [ OK ] Sgemm_128x128x8.sgemm_128x128x1_tt (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_109x112x8_tt [ OK ] Sgemm_128x128x8.sgemm_109x112x8_tt (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x9_tt [ OK ] Sgemm_128x128x8.sgemm_128x112x9_tt (5 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x16_tt [ OK ] Sgemm_128x128x8.sgemm_128x112x16_tt (6 ms) [ RUN ] Sgemm_128x128x8.sgemm_123x112x64_tt [ OK ] Sgemm_128x128x8.sgemm_123x112x64_tt (9 ms) [ RUN ] Sgemm_128x128x8.sgemm_256x112x16_tt [ OK ] Sgemm_128x128x8.sgemm_256x112x16_tt (11 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x256x16_tt [ OK ] Sgemm_128x128x8.sgemm_128x256x16_tt (10 ms) [ RUN ] Sgemm_128x128x8.sgemm_256x256x16_tt [ OK ] Sgemm_128x128x8.sgemm_256x256x16_tt (23 ms) [ RUN ] Sgemm_128x128x8.sgemm_120x112x64_ldg4_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x8.sgemm_120x112x64_ldg4_nt (99 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x128x16_alpha2_nt [ OK ] Sgemm_128x128x8.sgemm_128x128x16_alpha2_nt (6 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x16_beta1_nt [ OK ] Sgemm_128x128x8.sgemm_128x112x16_beta1_nt (6 ms) [ RUN ] Sgemm_128x128x8.sgemm_128x112x16_alpha2_beta1_nt [ OK ] Sgemm_128x128x8.sgemm_128x112x16_alpha2_beta1_nt (6 ms) [----------] 36 tests from Sgemm_128x128x8 (414 ms total)

[----------] 40 tests from Sgemm_128x128x16 [ RUN ] Sgemm_128x128x16.sgemm_128x128x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_nt (102 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x81x1_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x81x1_nt (58 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_nt (91 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x17_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x17_nt (90 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x73x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x73x16_nt (60 ms) [ RUN ] Sgemm_128x128x16.sgemm_97x112x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_97x112x64_nt (81 ms) [ RUN ] Sgemm_128x128x16.sgemm_256x112x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_nt (181 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x240x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x240x16_nt (189 ms) [ RUN ] 
Sgemm_128x128x16.sgemm_256x240x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x240x16_nt (380 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x128x16_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_nn (101 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x1_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x1_nn (79 ms) [ RUN ] Sgemm_128x128x16.sgemm_79x112x16_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_79x112x16_nn (57 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x81x17_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x81x17_nn (66 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x16_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_nn (90 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x73x64_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x73x64_nn (69 ms) [ RUN ] Sgemm_128x128x16.sgemm_256x112x16_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_nn (178 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x256x16_nn 
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x256x16_nn (201 ms) [ RUN ] Sgemm_128x128x16.sgemm_256x256x16_nn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x256x16_nn (402 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x128x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_tn (102 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x128x1_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x1_tn (90 ms) [ RUN ] Sgemm_128x128x16.sgemm_127x112x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_127x112x16_tn (88 ms) [ RUN ] Sgemm_128x128x16.sgemm_21x112x17_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_21x112x17_tn (18 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x73x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x73x16_tn (60 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x81x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x81x64_tn (76 ms) [ RUN ] Sgemm_128x128x16.sgemm_256x112x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: 
testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_tn (179 ms) [ RUN ] Sgemm_128x128x16.sgemm_47x256x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_47x256x16_tn (76 ms) [ RUN ] Sgemm_128x128x16.sgemm_211x256x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_211x256x16_tn (326 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x128x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_tt (102 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x128x1_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x1_tt (89 ms) [ RUN ] Sgemm_128x128x16.sgemm_109x112x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_109x112x16_tt (77 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x17_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x17_tt (90 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_tt (90 ms) [ RUN ] Sgemm_128x128x16.sgemm_123x112x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] 
Sgemm_128x128x16.sgemm_123x112x64_tt (97 ms) [ RUN ] Sgemm_128x128x16.sgemm_256x112x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_tt (179 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x256x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x256x16_tt (199 ms) [ RUN ] Sgemm_128x128x16.sgemm_256x256x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_256x256x16_tt (401 ms) [ RUN ] Sgemm_128x128x16.sgemm_120x112x64_ldg4_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_120x112x64_ldg4_nt (96 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x128x16_alpha2_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_alpha2_nt (105 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x16_beta1_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_beta1_nt (91 ms) [ RUN ] Sgemm_128x128x16.sgemm_128x112x16_alpha2_beta1_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_alpha2_beta1_nt (110 ms) [----------] 40 tests from Sgemm_128x128x16 (5219 ms total)

[----------] 34 tests from Sgemm_128x64x8 [ RUN ] Sgemm_128x64x8.sgemm_128x64x1_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x1_nt (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x8_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x8_nt (3 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x9_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x9_nt (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x16_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x16_nt (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x64_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x64_nt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x64x16_nt [ OK ] Sgemm_128x64x8.sgemm_256x64x16_nt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x128x16_nt [ OK ] Sgemm_128x64x8.sgemm_128x128x16_nt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x128x16_nt [ OK ] Sgemm_128x64x8.sgemm_256x128x16_nt (14 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x1_nn [ OK ] Sgemm_128x64x8.sgemm_128x64x1_nn (3 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x8_nn [ OK ] Sgemm_128x64x8.sgemm_128x64x8_nn (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x9_nn [ OK ] Sgemm_128x64x8.sgemm_128x64x9_nn (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x16_nn [ OK ] Sgemm_128x64x8.sgemm_128x64x16_nn (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x64_nn [ OK ] Sgemm_128x64x8.sgemm_128x64x64_nn (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x64x16_nn [ OK ] Sgemm_128x64x8.sgemm_256x64x16_nn (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x128x16_nn [ OK ] Sgemm_128x64x8.sgemm_128x128x16_nn (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x128x16_nn [ OK ] Sgemm_128x64x8.sgemm_256x128x16_nn (14 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x1_tn [ OK ] Sgemm_128x64x8.sgemm_128x64x1_tn (5 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x8_tn [ OK ] Sgemm_128x64x8.sgemm_128x64x8_tn (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x9_tn [ OK ] Sgemm_128x64x8.sgemm_128x64x9_tn (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x16_tn [ OK ] Sgemm_128x64x8.sgemm_128x64x16_tn (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x64_tn [ OK ] Sgemm_128x64x8.sgemm_128x64x64_tn (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x64x16_tn [ OK ] 
Sgemm_128x64x8.sgemm_256x64x16_tn (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x128x16_tn [ OK ] Sgemm_128x64x8.sgemm_128x128x16_tn (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x128x16_tn [ OK ] Sgemm_128x64x8.sgemm_256x128x16_tn (14 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x1_tt [ OK ] Sgemm_128x64x8.sgemm_128x64x1_tt (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x8_tt [ OK ] Sgemm_128x64x8.sgemm_128x64x8_tt (3 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x9_tt [ OK ] Sgemm_128x64x8.sgemm_128x64x9_tt (4 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x16_tt [ OK ] Sgemm_128x64x8.sgemm_128x64x16_tt (5 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x64_tt [ OK ] Sgemm_128x64x8.sgemm_128x64x64_tt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x64x16_tt [ OK ] Sgemm_128x64x8.sgemm_256x64x16_tt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x128x16_tt [ OK ] Sgemm_128x64x8.sgemm_128x128x16_tt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_256x128x16_tt [ OK ] Sgemm_128x64x8.sgemm_256x128x16_tt (14 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x64_8x4_accumulators_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x64_8x4_accumulators_nt (7 ms) [ RUN ] Sgemm_128x64x8.sgemm_128x64x64_4x8_accumulators_nt [ OK ] Sgemm_128x64x8.sgemm_128x64x64_4x8_accumulators_nt (7 ms) [----------] 34 tests from Sgemm_128x64x8 (219 ms total)

[----------] 27 tests from Sgemm_128x64x16 [ RUN ] Sgemm_128x64x16.sgemm_128x64x1_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x1_nt (45 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x16_nt (53 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x17_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x17_nt (53 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x64_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x64_nt (61 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x64x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_256x64x16_nt (104 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x128x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x128x16_nt (108 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x128x16_nt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_256x128x16_nt (203 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x1_nn [ OK ] Sgemm_128x64x16.sgemm_128x64x1_nn (3 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x8_nn [ OK ] Sgemm_128x64x16.sgemm_128x64x8_nn (5 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x17_nn [ OK ] Sgemm_128x64x16.sgemm_128x64x17_nn (4 ms) [ RUN ] 
Sgemm_128x64x16.sgemm_128x64x64_nn [ OK ] Sgemm_128x64x16.sgemm_128x64x64_nn (6 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x64x16_nn [ OK ] Sgemm_128x64x16.sgemm_256x64x16_nn (7 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x128x16_nn [ OK ] Sgemm_128x64x16.sgemm_128x128x16_nn (6 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x128x16_nn [ OK ] Sgemm_128x64x16.sgemm_256x128x16_nn (14 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x1_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x1_tn (91 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x16_tn (52 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x17_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x17_tn (53 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x64_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x64_tn (60 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x64x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_256x64x16_tn (102 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x128x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x128x16_tn (103 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x128x16_tn /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] 
Sgemm_128x64x16.sgemm_256x128x16_tn (203 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x1_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x1_tt (90 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x16_tt (53 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x17_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x17_tt (53 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x64x64_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x64x64_tt (60 ms) [ RUN ] Sgemm_128x64x16.sgemm_128x128x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_128x128x16_tt (103 ms) [ RUN ] Sgemm_128x64x16.sgemm_256x128x16_tt /home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure Value of: testbed.verify_with_cublas() Actual: false Expected: true [ FAILED ] Sgemm_128x64x16.sgemm_256x128x16_tt (203 ms) [----------] 27 tests from Sgemm_128x64x16 (1902 ms total)

[----------] 32 tests from Sgemm_128x32x8 [ RUN ] Sgemm_128x32x8.sgemm_128x32x1_nt [ OK ] Sgemm_128x32x8.sgemm_128x32x1_nt (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x8_nt [ OK ] Sgemm_128x32x8.sgemm_128x32x8_nt (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x9_nt [ OK ] Sgemm_128x32x8.sgemm_128x32x9_nt (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x16_nt [ OK ] Sgemm_128x32x8.sgemm_128x32x16_nt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x32_nt [ OK ] Sgemm_128x32x8.sgemm_128x32x32_nt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x32x16_nt [ OK ] Sgemm_128x32x8.sgemm_256x32x16_nt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x64x16_nt [ OK ] Sgemm_128x32x8.sgemm_128x64x16_nt (5 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x64x16_nt [ OK ] Sgemm_128x32x8.sgemm_256x64x16_nt (6 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x1_nn [ OK ] Sgemm_128x32x8.sgemm_128x32x1_nn (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x8_nn [ OK ] Sgemm_128x32x8.sgemm_128x32x8_nn (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x9_nn [ OK ] Sgemm_128x32x8.sgemm_128x32x9_nn (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x16_nn [ OK ] Sgemm_128x32x8.sgemm_128x32x16_nn (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x32_nn [ OK ] Sgemm_128x32x8.sgemm_128x32x32_nn (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x32x16_nn [ OK ] Sgemm_128x32x8.sgemm_256x32x16_nn (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x64x16_nn [ OK ] Sgemm_128x32x8.sgemm_128x64x16_nn (5 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x64x16_nn [ OK ] Sgemm_128x32x8.sgemm_256x64x16_nn (7 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x1_tn [ OK ] Sgemm_128x32x8.sgemm_128x32x1_tn (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x8_tn [ OK ] Sgemm_128x32x8.sgemm_128x32x8_tn (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x9_tn [ OK ] Sgemm_128x32x8.sgemm_128x32x9_tn (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x16_tn [ OK ] Sgemm_128x32x8.sgemm_128x32x16_tn (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x32_tn [ OK ] Sgemm_128x32x8.sgemm_128x32x32_tn (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x32x16_tn [ OK ] Sgemm_128x32x8.sgemm_256x32x16_tn (4 
ms) [ RUN ] Sgemm_128x32x8.sgemm_128x64x16_tn [ OK ] Sgemm_128x32x8.sgemm_128x64x16_tn (5 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x64x16_tn [ OK ] Sgemm_128x32x8.sgemm_256x64x16_tn (7 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x1_tt [ OK ] Sgemm_128x32x8.sgemm_128x32x1_tt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x8_tt [ OK ] Sgemm_128x32x8.sgemm_128x32x8_tt (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x9_tt [ OK ] Sgemm_128x32x8.sgemm_128x32x9_tt (3 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x16_tt [ OK ] Sgemm_128x32x8.sgemm_128x32x16_tt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x32x32_tt [ OK ] Sgemm_128x32x8.sgemm_128x32x32_tt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x32x16_tt [ OK ] Sgemm_128x32x8.sgemm_256x32x16_tt (4 ms) [ RUN ] Sgemm_128x32x8.sgemm_128x64x16_tt [ OK ] Sgemm_128x32x8.sgemm_128x64x16_tt (5 ms) [ RUN ] Sgemm_128x32x8.sgemm_256x64x16_tt [ OK ] Sgemm_128x32x8.sgemm_256x64x16_tt (7 ms) [----------] 32 tests from Sgemm_128x32x8 (133 ms total)

[----------] 28 tests from Sgemm_128x32x16
[ RUN ] Sgemm_128x32x16.sgemm_128x32x1_nt
[ OK ] Sgemm_128x32x16.sgemm_128x32x1_nt (2 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x16_nt
[ OK ] Sgemm_128x32x16.sgemm_128x32x16_nt (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x17_nt
[ OK ] Sgemm_128x32x16.sgemm_128x32x17_nt (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x32_nt
[ OK ] Sgemm_128x32x16.sgemm_128x32x32_nt (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x32x16_nt
[ OK ] Sgemm_128x32x16.sgemm_256x32x16_nt (5 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x64x16_nt
[ OK ] Sgemm_128x32x16.sgemm_128x64x16_nt (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x64x16_nt
[ OK ] Sgemm_128x32x16.sgemm_256x64x16_nt (7 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x1_nn
[ OK ] Sgemm_128x32x16.sgemm_128x32x1_nn (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x16_nn
[ OK ] Sgemm_128x32x16.sgemm_128x32x16_nn (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x17_nn
[ OK ] Sgemm_128x32x16.sgemm_128x32x17_nn (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x32_nn
[ OK ] Sgemm_128x32x16.sgemm_128x32x32_nn (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x32x16_nn
[ OK ] Sgemm_128x32x16.sgemm_256x32x16_nn (5 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x64x16_nn
[ OK ] Sgemm_128x32x16.sgemm_128x64x16_nn (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x64x16_nn
[ OK ] Sgemm_128x32x16.sgemm_256x64x16_nn (7 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x1_tn
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Sgemm_128x32x16.sgemm_128x32x1_tn (90 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x16_tn
[ OK ] Sgemm_128x32x16.sgemm_128x32x16_tn (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x17_tn
[ OK ] Sgemm_128x32x16.sgemm_128x32x17_tn (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x32_tn
[ OK ] Sgemm_128x32x16.sgemm_128x32x32_tn (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x32x16_tn
[ OK ] Sgemm_128x32x16.sgemm_256x32x16_tn (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x64x16_tn
[ OK ] Sgemm_128x32x16.sgemm_128x64x16_tn (5 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x64x16_tn
[ OK ] Sgemm_128x32x16.sgemm_256x64x16_tn (6 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x1_tt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Sgemm_128x32x16.sgemm_128x32x1_tt (90 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x16_tt
[ OK ] Sgemm_128x32x16.sgemm_128x32x16_tt (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x17_tt
[ OK ] Sgemm_128x32x16.sgemm_128x32x17_tt (3 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x32x32_tt
[ OK ] Sgemm_128x32x16.sgemm_128x32x32_tt (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x32x16_tt
[ OK ] Sgemm_128x32x16.sgemm_256x32x16_tt (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_128x64x16_tt
[ OK ] Sgemm_128x32x16.sgemm_128x64x16_tt (4 ms)
[ RUN ] Sgemm_128x32x16.sgemm_256x64x16_tt
[ OK ] Sgemm_128x32x16.sgemm_256x64x16_tt (7 ms)
[----------] 28 tests from Sgemm_128x32x16 (290 ms total)

[----------] 1 test from Sgemm_64x128x8
[ RUN ] Sgemm_64x128x8.sgemm_64x128x64_4x8_accumulators_nt
[ OK ] Sgemm_64x128x8.sgemm_64x128x64_4x8_accumulators_nt (10 ms)
[----------] 1 test from Sgemm_64x128x8 (10 ms total)

[----------] 1 test from Sgemm_64x128x16
[ RUN ] Sgemm_64x128x16.sgemm_64x128x64_4x8_accumulators_nt
/home/nvidia/Documents/cutlass/tools/test/unit/gemm/gemm.h:95: Failure
Value of: testbed.verify_with_cublas()
Actual: false
Expected: true
[ FAILED ] Sgemm_64x128x16.sgemm_64x128x64_4x8_accumulators_nt (64 ms)
[----------] 1 test from Sgemm_64x128x16 (64 ms total)

[----------] 32 tests from Sgemm_64x64x8 [ RUN ] Sgemm_64x64x8.sgemm_64x64x1_nt [ OK ] Sgemm_64x64x8.sgemm_64x64x1_nt (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x8_nt [ OK ] Sgemm_64x64x8.sgemm_64x64x8_nt (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x9_nt [ OK ] Sgemm_64x64x8.sgemm_64x64x9_nt (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x16_nt [ OK ] Sgemm_64x64x8.sgemm_64x64x16_nt (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x64_nt [ OK ] Sgemm_64x64x8.sgemm_64x64x64_nt (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x64x16_nt [ OK ] Sgemm_64x64x8.sgemm_128x64x16_nt (5 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x128x16_nt [ OK ] Sgemm_64x64x8.sgemm_64x128x16_nt (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x128x16_nt [ OK ] Sgemm_64x64x8.sgemm_128x128x16_nt (6 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x1_nn [ OK ] Sgemm_64x64x8.sgemm_64x64x1_nn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x8_nn [ OK ] Sgemm_64x64x8.sgemm_64x64x8_nn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x9_nn [ OK ] Sgemm_64x64x8.sgemm_64x64x9_nn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x16_nn [ OK ] Sgemm_64x64x8.sgemm_64x64x16_nn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x64_nn [ OK ] Sgemm_64x64x8.sgemm_64x64x64_nn (5 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x64x16_nn [ OK ] Sgemm_64x64x8.sgemm_128x64x16_nn (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x128x16_nn [ OK ] Sgemm_64x64x8.sgemm_64x128x16_nn (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x128x16_nn [ OK ] Sgemm_64x64x8.sgemm_128x128x16_nn (6 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x1_tn [ OK ] Sgemm_64x64x8.sgemm_64x64x1_tn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x8_tn [ OK ] Sgemm_64x64x8.sgemm_64x64x8_tn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x9_tn [ OK ] Sgemm_64x64x8.sgemm_64x64x9_tn (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x16_tn [ OK ] Sgemm_64x64x8.sgemm_64x64x16_tn (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x64_tn [ OK ] Sgemm_64x64x8.sgemm_64x64x64_tn (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x64x16_tn [ OK ] Sgemm_64x64x8.sgemm_128x64x16_tn (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x128x16_tn [ OK ] 
Sgemm_64x64x8.sgemm_64x128x16_tn (5 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x128x16_tn [ OK ] Sgemm_64x64x8.sgemm_128x128x16_tn (6 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x1_tt [ OK ] Sgemm_64x64x8.sgemm_64x64x1_tt (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x8_tt [ OK ] Sgemm_64x64x8.sgemm_64x64x8_tt (2 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x9_tt [ OK ] Sgemm_64x64x8.sgemm_64x64x9_tt (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x16_tt [ OK ] Sgemm_64x64x8.sgemm_64x64x16_tt (3 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x64x64_tt [ OK ] Sgemm_64x64x8.sgemm_64x64x64_tt (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x64x16_tt [ OK ] Sgemm_64x64x8.sgemm_128x64x16_tt (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_64x128x16_tt [ OK ] Sgemm_64x64x8.sgemm_64x128x16_tt (4 ms) [ RUN ] Sgemm_64x64x8.sgemm_128x128x16_tt [ OK ] Sgemm_64x64x8.sgemm_128x128x16_tt (6 ms) [----------] 32 tests from Sgemm_64x64x8 (125 ms total)

[----------] 28 tests from Sgemm_64x64x16 [ RUN ] Sgemm_64x64x16.sgemm_64x64x1_nt [ OK ] Sgemm_64x64x16.sgemm_64x64x1_nt (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x16_nt [ OK ] Sgemm_64x64x16.sgemm_64x64x16_nt (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x17_nt [ OK ] Sgemm_64x64x16.sgemm_64x64x17_nt (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x64_nt [ OK ] Sgemm_64x64x16.sgemm_64x64x64_nt (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x64x16_nt [ OK ] Sgemm_64x64x16.sgemm_128x64x16_nt (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x128x16_nt [ OK ] Sgemm_64x64x16.sgemm_64x128x16_nt (5 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x128x16_nt [ OK ] Sgemm_64x64x16.sgemm_128x128x16_nt (6 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x1_nn [ OK ] Sgemm_64x64x16.sgemm_64x64x1_nn (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x16_nn [ OK ] Sgemm_64x64x16.sgemm_64x64x16_nn (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x17_nn [ OK ] Sgemm_64x64x16.sgemm_64x64x17_nn (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x64_nn [ OK ] Sgemm_64x64x16.sgemm_64x64x64_nn (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x64x16_nn [ OK ] Sgemm_64x64x16.sgemm_128x64x16_nn (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x128x16_nn [ OK ] Sgemm_64x64x16.sgemm_64x128x16_nn (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x128x16_nn [ OK ] Sgemm_64x64x16.sgemm_128x128x16_nn (6 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x1_tn [ OK ] Sgemm_64x64x16.sgemm_64x64x1_tn (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x16_tn [ OK ] Sgemm_64x64x16.sgemm_64x64x16_tn (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x17_tn [ OK ] Sgemm_64x64x16.sgemm_64x64x17_tn (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x64_tn [ OK ] Sgemm_64x64x16.sgemm_64x64x64_tn (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x64x16_tn [ OK ] Sgemm_64x64x16.sgemm_128x64x16_tn (5 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x128x16_tn [ OK ] Sgemm_64x64x16.sgemm_64x128x16_tn (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x128x16_tn [ OK ] Sgemm_64x64x16.sgemm_128x128x16_tn (7 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x1_tt [ OK ] Sgemm_64x64x16.sgemm_64x64x1_tt (3 ms) [ RUN 
] Sgemm_64x64x16.sgemm_64x64x16_tt [ OK ] Sgemm_64x64x16.sgemm_64x64x16_tt (4 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x17_tt [ OK ] Sgemm_64x64x16.sgemm_64x64x17_tt (3 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x64x64_tt [ OK ] Sgemm_64x64x16.sgemm_64x64x64_tt (6 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x64x16_tt [ OK ] Sgemm_64x64x16.sgemm_128x64x16_tt (5 ms) [ RUN ] Sgemm_64x64x16.sgemm_64x128x16_tt [ OK ] Sgemm_64x64x16.sgemm_64x128x16_tt (5 ms) [ RUN ] Sgemm_64x64x16.sgemm_128x128x16_tt [ OK ] Sgemm_64x64x16.sgemm_128x128x16_tt (5 ms) [----------] 28 tests from Sgemm_64x64x16 (117 ms total)

[----------] 31 tests from Sgemm_64x32x8 [ RUN ] Sgemm_64x32x8.sgemm_64x32x1_nt [ OK ] Sgemm_64x32x8.sgemm_64x32x1_nt (2 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x8_nt [ OK ] Sgemm_64x32x8.sgemm_64x32x8_nt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x9_nt [ OK ] Sgemm_64x32x8.sgemm_64x32x9_nt (2 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x16_nt [ OK ] Sgemm_64x32x8.sgemm_64x32x16_nt (2 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x64_nt [ OK ] Sgemm_64x32x8.sgemm_64x32x64_nt (4 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x32x16_nt [ OK ] Sgemm_64x32x8.sgemm_128x32x16_nt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x64x16_nt [ OK ] Sgemm_64x32x8.sgemm_64x64x16_nt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x64x16_nt [ OK ] Sgemm_64x32x8.sgemm_128x64x16_nt (4 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x1_nn [ OK ] Sgemm_64x32x8.sgemm_64x32x1_nn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x8_nn [ OK ] Sgemm_64x32x8.sgemm_64x32x8_nn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x9_nn [ OK ] Sgemm_64x32x8.sgemm_64x32x9_nn (2 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x16_nn [ OK ] Sgemm_64x32x8.sgemm_64x32x16_nn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x64_nn [ OK ] Sgemm_64x32x8.sgemm_64x32x64_nn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x32x16_nn [ OK ] Sgemm_64x32x8.sgemm_128x32x16_nn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x64x16_nn [ OK ] Sgemm_64x32x8.sgemm_64x64x16_nn (4 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x64x16_nn [ OK ] Sgemm_64x32x8.sgemm_128x64x16_nn (4 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x8_tn [ OK ] Sgemm_64x32x8.sgemm_64x32x8_tn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x9_tn [ OK ] Sgemm_64x32x8.sgemm_64x32x9_tn (2 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x16_tn [ OK ] Sgemm_64x32x8.sgemm_64x32x16_tn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x64_tn [ OK ] Sgemm_64x32x8.sgemm_64x32x64_tn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x32x16_tn [ OK ] Sgemm_64x32x8.sgemm_128x32x16_tn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x64x16_tn [ OK ] Sgemm_64x32x8.sgemm_64x64x16_tn (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x64x16_tn [ OK ] 
Sgemm_64x32x8.sgemm_128x64x16_tn (4 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x64x1_tt [ OK ] Sgemm_64x32x8.sgemm_64x64x1_tt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x8_tt [ OK ] Sgemm_64x32x8.sgemm_64x32x8_tt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x9_tt [ OK ] Sgemm_64x32x8.sgemm_64x32x9_tt (2 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x16_tt [ OK ] Sgemm_64x32x8.sgemm_64x32x16_tt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x32x64_tt [ OK ] Sgemm_64x32x8.sgemm_64x32x64_tt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x32x16_tt [ OK ] Sgemm_64x32x8.sgemm_128x32x16_tt (4 ms) [ RUN ] Sgemm_64x32x8.sgemm_64x64x16_tt [ OK ] Sgemm_64x32x8.sgemm_64x64x16_tt (3 ms) [ RUN ] Sgemm_64x32x8.sgemm_128x64x16_tt [ OK ] Sgemm_64x32x8.sgemm_128x64x16_tt (4 ms) [----------] 31 tests from Sgemm_64x32x8 (96 ms total)

[----------] 26 tests from Sgemm_64x32x16 [ RUN ] Sgemm_64x32x16.sgemm_64x32x1_nt [ OK ] Sgemm_64x32x16.sgemm_64x32x1_nt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x16_nt [ OK ] Sgemm_64x32x16.sgemm_64x32x16_nt (2 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x17_nt [ OK ] Sgemm_64x32x16.sgemm_64x32x17_nt (2 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x64_nt [ OK ] Sgemm_64x32x16.sgemm_64x32x64_nt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x32x16_nt [ OK ] Sgemm_64x32x16.sgemm_128x32x16_nt (4 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x64x16_nt [ OK ] Sgemm_64x32x16.sgemm_64x64x16_nt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x64x16_nt [ OK ] Sgemm_64x32x16.sgemm_128x64x16_nt (4 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x1_nn [ OK ] Sgemm_64x32x16.sgemm_64x32x1_nn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x16_nn [ OK ] Sgemm_64x32x16.sgemm_64x32x16_nn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x17_nn [ OK ] Sgemm_64x32x16.sgemm_64x32x17_nn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x64_nn [ OK ] Sgemm_64x32x16.sgemm_64x32x64_nn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x32x16_nn [ OK ] Sgemm_64x32x16.sgemm_128x32x16_nn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x64x16_nn [ OK ] Sgemm_64x32x16.sgemm_64x64x16_nn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x64x16_nn [ OK ] Sgemm_64x32x16.sgemm_128x64x16_nn (4 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x16_tn [ OK ] Sgemm_64x32x16.sgemm_64x32x16_tn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x17_tn [ OK ] Sgemm_64x32x16.sgemm_64x32x17_tn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x64_tn [ OK ] Sgemm_64x32x16.sgemm_64x32x64_tn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x32x16_tn [ OK ] Sgemm_64x32x16.sgemm_128x32x16_tn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x64x16_tn [ OK ] Sgemm_64x32x16.sgemm_64x64x16_tn (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x64x16_tn [ OK ] Sgemm_64x32x16.sgemm_128x64x16_tn (4 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x64x1_tt [ OK ] Sgemm_64x32x16.sgemm_64x64x1_tt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x32x16_tt [ OK ] Sgemm_64x32x16.sgemm_64x32x16_tt (2 ms) [ RUN ] 
Sgemm_64x32x16.sgemm_64x32x17_tt [ OK ] Sgemm_64x32x16.sgemm_64x32x17_tt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x32x16_tt [ OK ] Sgemm_64x32x16.sgemm_128x32x16_tt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_64x64x16_tt [ OK ] Sgemm_64x32x16.sgemm_64x64x16_tt (3 ms) [ RUN ] Sgemm_64x32x16.sgemm_128x64x16_tt [ OK ] Sgemm_64x32x16.sgemm_128x64x16_tt (5 ms) [----------] 26 tests from Sgemm_64x32x16 (83 ms total)

[----------] Global test environment tear-down [==========] 684 tests from 33 test cases ran. (33523 ms total) [ PASSED ] 404 tests. [ FAILED ] 280 tests, listed below: [ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_nt [ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_nt [ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_nn [ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_nn [ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_tn [ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_tn [ FAILED ] Dgemm_128x128x8.dgemm_128x128x8_tt [ FAILED ] Dgemm_128x128x8.dgemm_512x256x64_tt [ FAILED ] Hgemm_128x128x16.hgemm_2x2x2_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x8_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x17_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_nt [ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_nt [ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_nn [ FAILED ] Hgemm_128x128x16.hgemm_128x128x18_nn [ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_nn [ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_nn [ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_nn [ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_nn [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_tn [ FAILED ] Hgemm_128x128x16.hgemm_128x128x18_tn [ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_tn [ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_tn [ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_tn [ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_tn [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_tt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x18_tt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x64_tt [ FAILED ] Hgemm_128x128x16.hgemm_256x128x16_tt [ FAILED ] Hgemm_128x128x16.hgemm_128x256x16_tt [ FAILED ] Hgemm_128x128x16.hgemm_256x256x16_tt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_alpha2_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_beta1_nt [ FAILED ] Hgemm_128x128x16.hgemm_128x128x16_alpha2_beta1_nt [ FAILED ] 
Hgemm_128x128x16.hgemm_508x252x120_ragged_nt [ FAILED ] Hgemm_128x128x16.hgemm_124x126x32_ragged_nt [ FAILED ] Hgemm_128x128x16.hgemm_124x126x32_ragged_alpha2_beta1_nt [ FAILED ] Igemm_128x128x32.igemm_128x128x4_nt [ FAILED ] Igemm_128x128x32.igemm_128x128x32_nt [ FAILED ] Igemm_128x128x32.igemm_128x128x36_nt [ FAILED ] Igemm_128x128x32.igemm_128x128x64_nt [ FAILED ] Igemm_128x128x32.igemm_128x128x256_nt [ FAILED ] Igemm_128x128x32.igemm_256x128x64_nt [ FAILED ] Igemm_128x128x32.igemm_128x256x64_nt [ FAILED ] Igemm_128x128x32.igemm_256x256x64_nt [ FAILED ] Igemm_128x128x32.igemm_128x128x4_nn [ FAILED ] Igemm_128x128x32.igemm_128x128x32_nn [ FAILED ] Igemm_128x128x32.igemm_128x128x36_nn [ FAILED ] Igemm_128x128x32.igemm_128x128x64_nn [ FAILED ] Igemm_128x128x32.igemm_128x128x256_nn [ FAILED ] Igemm_128x128x32.igemm_256x128x64_nn [ FAILED ] Igemm_128x128x32.igemm_128x256x64_nn [ FAILED ] Igemm_128x128x32.igemm_256x256x64_nn [ FAILED ] Igemm_128x128x32.igemm_128x128x4_tn [ FAILED ] Igemm_128x128x32.igemm_128x128x32_tn [ FAILED ] Igemm_128x128x32.igemm_128x128x36_tn [ FAILED ] Igemm_128x128x32.igemm_128x128x64_tn [ FAILED ] Igemm_128x128x32.igemm_128x128x256_tn [ FAILED ] Igemm_128x128x32.igemm_256x128x64_tn [ FAILED ] Igemm_128x128x32.igemm_128x256x64_tn [ FAILED ] Igemm_128x128x32.igemm_256x256x64_tn [ FAILED ] Igemm_128x128x32.igemm_128x128x4_tt [ FAILED ] Igemm_128x128x32.igemm_128x128x32_tt [ FAILED ] Igemm_128x128x32.igemm_128x128x36_tt [ FAILED ] Igemm_128x128x32.igemm_128x128x64_tt [ FAILED ] Igemm_128x128x32.igemm_128x128x256_tt [ FAILED ] Igemm_128x128x32.igemm_256x128x64_tt [ FAILED ] Igemm_128x128x32.igemm_128x256x64_tt [ FAILED ] Igemm_128x128x32.igemm_256x256x64_tt [ FAILED ] Igemm_128x64x32.Igemm_128x64x4_nt [ FAILED ] Igemm_128x64x32.igemm_128x64x32_nt [ FAILED ] Igemm_128x64x32.igemm_128x64x36_nt [ FAILED ] Igemm_128x64x32.igemm_128x64x64_nt [ FAILED ] Igemm_128x64x32.igemm_128x64x256_nt [ FAILED ] Igemm_128x64x32.igemm_256x64x64_nt [ FAILED ] 
Igemm_128x64x32.igemm_128x128x64_nt [ FAILED ] Igemm_128x64x32.igemm_256x128x64_nt [ FAILED ] Igemm_128x64x32.igemm_128x64x4_nn [ FAILED ] Igemm_128x64x32.igemm_128x64x32_nn [ FAILED ] Igemm_128x64x32.igemm_128x64x36_nn [ FAILED ] Igemm_128x64x32.igemm_128x64x64_nn [ FAILED ] Igemm_128x64x32.igemm_128x64x256_nn [ FAILED ] Igemm_128x64x32.igemm_256x64x64_nn [ FAILED ] Igemm_128x64x32.igemm_128x128x64_nn [ FAILED ] Igemm_128x64x32.igemm_256x128x64_nn [ FAILED ] Igemm_128x64x32.igemm_128x64x4_tn [ FAILED ] Igemm_128x64x32.igemm_128x64x32_tn [ FAILED ] Igemm_128x64x32.igemm_128x64x36_tn [ FAILED ] Igemm_128x64x32.igemm_128x64x64_tn [ FAILED ] Igemm_128x64x32.igemm_128x64x256_tn [ FAILED ] Igemm_128x64x32.igemm_256x64x64_tn [ FAILED ] Igemm_128x64x32.igemm_128x128x64_tn [ FAILED ] Igemm_128x64x32.igemm_256x128x64_tn [ FAILED ] Igemm_128x64x32.igemm_128x64x4_tt [ FAILED ] Igemm_128x64x32.igemm_128x64x32_tt [ FAILED ] Igemm_128x64x32.igemm_128x64x36_tt [ FAILED ] Igemm_128x64x32.igemm_128x64x64_tt [ FAILED ] Igemm_128x64x32.igemm_128x64x256_tt [ FAILED ] Igemm_128x64x32.igemm_256x64x64_tt [ FAILED ] Igemm_128x64x32.igemm_128x128x64_tt [ FAILED ] Igemm_128x64x32.igemm_256x128x64_tt [ FAILED ] Igemm_128x32x32.igemm_128x32x32x4_nt [ FAILED ] Igemm_128x32x32.igemm_128x32x32_nt [ FAILED ] Igemm_128x32x32.igemm_128x32x36_nt [ FAILED ] Igemm_128x32x32.igemm_128x32x64_nt [ FAILED ] Igemm_128x32x32.igemm_128x32x256_nt [ FAILED ] Igemm_128x32x32.igemm_256x32x64_nt [ FAILED ] Igemm_128x32x32.igemm_128x128x32_nt [ FAILED ] Igemm_128x32x32.igemm_256x128x32_nt [ FAILED ] Igemm_128x32x32.igemm_128x32x4_nn [ FAILED ] Igemm_128x32x32.igemm_128x32x32_nn [ FAILED ] Igemm_128x32x32.igemm_128x32x36_nn [ FAILED ] Igemm_128x32x32.igemm_128x32x64_nn [ FAILED ] Igemm_128x32x32.igemm_128x32x256_nn [ FAILED ] Igemm_128x32x32.igemm_256x32x64_nn [ FAILED ] Igemm_128x32x32.igemm_128x128x32_nn [ FAILED ] Igemm_128x32x32.igemm_256x128x32_nn [ FAILED ] Igemm_128x32x32.igemm_128x32x4_tn [ FAILED ] 
Igemm_128x32x32.igemm_128x32x32_tn [ FAILED ] Igemm_128x32x32.igemm_128x32x36_tn [ FAILED ] Igemm_128x32x32.igemm_128x32x64_tn [ FAILED ] Igemm_128x32x32.igemm_128x32x256_tn [ FAILED ] Igemm_128x32x32.igemm_256x32x64_tn [ FAILED ] Igemm_128x32x32.igemm_128x128x32_tn [ FAILED ] Igemm_128x32x32.igemm_256x128x32_tn [ FAILED ] Igemm_128x32x32.igemm_128x32x4_tt [ FAILED ] Igemm_128x32x32.igemm_128x32x32_tt [ FAILED ] Igemm_128x32x32.igemm_128x32x36_tt [ FAILED ] Igemm_128x32x32.igemm_128x32x64_tt [ FAILED ] Igemm_128x32x32.igemm_128x32x256_tt [ FAILED ] Igemm_128x32x32.igemm_256x32x64_tt [ FAILED ] Igemm_128x32x32.igemm_128x128x32_tt [ FAILED ] Igemm_128x32x32.igemm_256x128x32_tt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_nt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_nt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_nt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_nt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_nt [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_nt [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_nt [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_nt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_nn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_nn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_nn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_nn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_nn [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_nn [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_nn [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_nn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x4_tn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_tn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_tn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_tn [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_tn [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_tn [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_tn [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_tn [ FAILED ] 
Igemm_128x128x32_float.igemm_128x128x4_tt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x32_tt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x36_tt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x64_tt [ FAILED ] Igemm_128x128x32_float.igemm_128x128x256_tt [ FAILED ] Igemm_128x128x32_float.igemm_256x128x64_tt [ FAILED ] Igemm_128x128x32_float.igemm_128x256x64_tt [ FAILED ] Igemm_128x128x32_float.igemm_256x256x64_tt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_nt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_nt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_nt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_nt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_nt [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_nt [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_nt [ FAILED ] Igemm_128x128x32_int8.igemm_256x256x64_nt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_nn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_nn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_nn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_nn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_nn [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_nn [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_nn [ FAILED ] Igemm_128x128x32_int8.igemm_256x256x64_nn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_tn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_tn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_tn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_tn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_tn [ FAILED ] Igemm_128x128x32_int8.igemm_256x128x64_tn [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_tn [ FAILED ] Igemm_128x128x32_int8.igemm_256x256x64_tn [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x4_tt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x32_tt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x36_tt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x64_tt [ FAILED ] Igemm_128x128x32_int8.igemm_128x128x256_tt [ FAILED ] 
Igemm_128x128x32_int8.igemm_256x128x64_tt [ FAILED ] Igemm_128x128x32_int8.igemm_128x256x64_tt [ FAILED ] Igemm_128x128x32_int8.igemm_256x256x64_tt [ FAILED ] Igemm_32x32x128.igemm_32x32x4_nt [ FAILED ] Igemm_32x32x128.igemm_32x32x8_nt [ FAILED ] Igemm_32x32x128.igemm_32x32x32_nt [ FAILED ] Igemm_32x32x128.igemm_32x32x128_nt [ FAILED ] Igemm_32x32x128.igemm_32x32x4_nn [ FAILED ] Igemm_32x32x128.igemm_32x32x8_nn [ FAILED ] Igemm_32x32x128.igemm_32x32x32_nn [ FAILED ] Igemm_32x32x128.igemm_32x32x128_nn [ FAILED ] Igemm_32x32x128.igemm_32x32x4_tn [ FAILED ] Igemm_32x32x128.igemm_32x32x8_tn [ FAILED ] Igemm_32x32x128.igemm_32x32x15_tn [ FAILED ] Igemm_32x32x128.igemm_32x32x32_tn [ FAILED ] Igemm_32x32x128.igemm_32x32x128_tn [ FAILED ] Igemm_32x32x128.igemm_32x32x8_tt [ FAILED ] Igemm_32x32x128.igemm_32x32x32_tt [ FAILED ] Igemm_32x32x128.igemm_32x32x128_tt [ FAILED ] Sgemm_128x128x8.sgemm_120x112x64_ldg4_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x81x1_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x112x17_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x73x16_nt [ FAILED ] Sgemm_128x128x16.sgemm_97x112x64_nt [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x240x16_nt [ FAILED ] Sgemm_128x128x16.sgemm_256x240x16_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_nn [ FAILED ] Sgemm_128x128x16.sgemm_128x112x1_nn [ FAILED ] Sgemm_128x128x16.sgemm_79x112x16_nn [ FAILED ] Sgemm_128x128x16.sgemm_128x81x17_nn [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_nn [ FAILED ] Sgemm_128x128x16.sgemm_128x73x64_nn [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_nn [ FAILED ] Sgemm_128x128x16.sgemm_128x256x16_nn [ FAILED ] Sgemm_128x128x16.sgemm_256x256x16_nn [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_tn [ FAILED ] Sgemm_128x128x16.sgemm_128x128x1_tn [ FAILED ] Sgemm_128x128x16.sgemm_127x112x16_tn [ FAILED ] Sgemm_128x128x16.sgemm_21x112x17_tn [ FAILED ] 
Sgemm_128x128x16.sgemm_128x73x16_tn [ FAILED ] Sgemm_128x128x16.sgemm_128x81x64_tn [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_tn [ FAILED ] Sgemm_128x128x16.sgemm_47x256x16_tn [ FAILED ] Sgemm_128x128x16.sgemm_211x256x16_tn [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_tt [ FAILED ] Sgemm_128x128x16.sgemm_128x128x1_tt [ FAILED ] Sgemm_128x128x16.sgemm_109x112x16_tt [ FAILED ] Sgemm_128x128x16.sgemm_128x112x17_tt [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_tt [ FAILED ] Sgemm_128x128x16.sgemm_123x112x64_tt [ FAILED ] Sgemm_128x128x16.sgemm_256x112x16_tt [ FAILED ] Sgemm_128x128x16.sgemm_128x256x16_tt [ FAILED ] Sgemm_128x128x16.sgemm_256x256x16_tt [ FAILED ] Sgemm_128x128x16.sgemm_120x112x64_ldg4_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x128x16_alpha2_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_beta1_nt [ FAILED ] Sgemm_128x128x16.sgemm_128x112x16_alpha2_beta1_nt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x1_nt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x16_nt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x17_nt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x64_nt [ FAILED ] Sgemm_128x64x16.sgemm_256x64x16_nt [ FAILED ] Sgemm_128x64x16.sgemm_128x128x16_nt [ FAILED ] Sgemm_128x64x16.sgemm_256x128x16_nt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x1_tn [ FAILED ] Sgemm_128x64x16.sgemm_128x64x16_tn [ FAILED ] Sgemm_128x64x16.sgemm_128x64x17_tn [ FAILED ] Sgemm_128x64x16.sgemm_128x64x64_tn [ FAILED ] Sgemm_128x64x16.sgemm_256x64x16_tn [ FAILED ] Sgemm_128x64x16.sgemm_128x128x16_tn [ FAILED ] Sgemm_128x64x16.sgemm_256x128x16_tn [ FAILED ] Sgemm_128x64x16.sgemm_128x64x1_tt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x16_tt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x17_tt [ FAILED ] Sgemm_128x64x16.sgemm_128x64x64_tt [ FAILED ] Sgemm_128x64x16.sgemm_128x128x16_tt [ FAILED ] Sgemm_128x64x16.sgemm_256x128x16_tt [ FAILED ] Sgemm_128x32x16.sgemm_128x32x1_tn [ FAILED ] Sgemm_128x32x16.sgemm_128x32x1_tt [ FAILED ] Sgemm_64x128x16.sgemm_64x128x64_4x8_accumulators_nt

280 FAILED TESTS

dongxiao92 commented 6 years ago

I encountered the same problem on TX2

wtiandong commented 6 years ago

I'm back to answer my own question. I was busy with other work for a few days and only looked at this issue again today. Add the following right after Gemm::launch(params); in the run_gemm function in "cutlass/tools/test/unit/gemm/gemm.h": cudaError_t result = cudaGetLastError(); ASSERT_EQ(result, cudaSuccess) << "\nCUDA kernel launch error: " << cudaGetErrorString(result) << "\n"; With that check in place, the real error shows up: "CUDA kernel launch error: too many resources requested for launch"

So the tests fail because the TX2 does not have enough resources to launch these kernels. I also tested on a Titan X and got the same error there, though on fewer tests, again due to lack of resources. I suggest adding this error check to the test code.
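To make the "too many resources requested for launch" diagnosis concrete, here is a minimal host-only sketch of the kind of per-block check that a launch has to pass. It is an illustration under stated assumptions, not CUTLASS code: the struct and function names (DeviceLimits, KernelUsage, launch_fits) are hypothetical, the check ignores register allocation granularity, and in a real program the limits would come from cudaGetDeviceProperties (regsPerBlock, sharedMemPerBlock, maxThreadsPerBlock) and the kernel usage from cudaFuncGetAttributes (numRegs, sharedSizeBytes).

```cpp
#include <cassert>
#include <cstddef>

// Per-block limits of a device (hypothetical container). In real code these
// values would be read from cudaDeviceProp.
struct DeviceLimits {
    int regs_per_block;               // 32-bit registers available to one block
    std::size_t shared_mem_per_block; // bytes of shared memory per block
    int max_threads_per_block;
};

// Resource usage of one compiled kernel (hypothetical container). In real
// code these values would be read from cudaFuncAttributes.
struct KernelUsage {
    int regs_per_thread;
    std::size_t static_shared_bytes;
};

// Returns true if a block of `threads` threads plausibly fits on the device.
// Simplification: register allocation granularity is ignored, so real
// hardware may still reject a launch that this check accepts.
inline bool launch_fits(const DeviceLimits& dev, const KernelUsage& k,
                        int threads, std::size_t dynamic_shared_bytes = 0) {
    assert(threads > 0);
    if (threads > dev.max_threads_per_block) return false;
    // Total registers needed by the block must not exceed the per-block limit.
    if (static_cast<long long>(threads) * k.regs_per_thread > dev.regs_per_block)
        return false;
    // Static plus dynamic shared memory must fit in the per-block limit.
    if (k.static_shared_bytes + dynamic_shared_bytes > dev.shared_mem_per_block)
        return false;
    return true;
}
```

With made-up numbers, a kernel using 160 registers per thread launched with 256 threads needs 40960 registers per block: it fits a device with a 65536-register block limit but not one with a 32768-register limit. That would be consistent with the same test binary failing on more tests on a smaller device.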

BTW, why are there so few issues and discussions in this section for a project with hundreds of stars?

dongxiao92 commented 6 years ago

@wtiandong would you consider submitting a PR?

dongxiao92 commented 6 years ago

I changed the code as you advised, but I get a different error: Expected equality of these values: result Which is: 48 cudaSuccess Which is: 0 CUDA Kernel launch error: no kernel image is available for execution on the device

dongxiao92 commented 6 years ago

Sorry to reply to this closed issue so many times. Setting CUTLASS_NVCC_ARCH=62 solved the above problem.
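For context on why 62 helps: "no kernel image is available for execution on the device" generally means the binary contains no code built for the GPU's compute capability, and the TX2 is a compute capability 6.2 device. The sketch below shows how a major.minor capability maps to the two-digit value and to an nvcc -gencode option. The helper names arch_value and gencode_flag are hypothetical, and how the CUTLASS build turns CUTLASS_NVCC_ARCH into compiler flags is an assumption here; in a real program major and minor would come from cudaGetDeviceProperties.

```cpp
#include <cassert>
#include <string>

// Converts a compute capability (major.minor) to its two-digit form,
// e.g. 6.2 -> 62. Hypothetical helper for illustration only.
inline int arch_value(int major, int minor) {
    assert(major > 0 && minor >= 0 && minor < 10);
    return major * 10 + minor;
}

// Builds the nvcc option that generates machine code for that capability.
inline std::string gencode_flag(int major, int minor) {
    const std::string a = std::to_string(arch_value(major, minor));
    return "-gencode=arch=compute_" + a + ",code=sm_" + a;
}
```

For the TX2, arch_value(6, 2) gives 62 and gencode_flag(6, 2) gives "-gencode=arch=compute_62,code=sm_62".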

wtiandong commented 6 years ago

Hi dongxiao, sounds great. Unfortunately, I find that cutlass is much slower than cublas: cublas takes 70% of the time that cutlass takes. I think cutlass needs more work to make it faster.