hughperkins / tf-coriander

OpenCL 1.2 implementation for Tensorflow
Apache License 2.0

crash dynamic_rnn.py in tensorflow-cl #53

Open dal2 opened 7 years ago

dal2 commented 7 years ago

https://github.com/hughperkins/TensorFlow-Examples/blob/as-unit-tests/examples/3_NeuralNetworks/dynamic_rnn.py

/Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6 /Users/user/PycharmProjects/tensorflow-test/cnn.py
/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/site-packages/tensorflow/python/ops/gradients.py:90: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
  "Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
OpenCL platform: Apple
OpenCL device: Intel(R) Iris(TM) Graphics 6100
I tensorflow/core/common_runtime/gpu/gpu_device.cc:989] Found device 0 with properties: 
name: Intel(R) Iris(TM) Graphics 6100
major: -1 minor: -1 memoryClockRate (GHz) 1050
pciBusID 0000.0000
Total memory: 1.50GiB
Free memory: 384.00MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:877] cannot enable peer access from device ordinal 0 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1011] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] 0:   N 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1083] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Intel(R) Iris(TM) Graphics 6100, pci bus id: 0000.0000)
cl_driver DeviceAllocate 192937984
__internal__ build log: 
<program source>:31:36: warning: unused variable 'pGlobalVars'
    const struct GlobalVars* const pGlobalVars = &globalVars;
                                   ^
Cannot select: 0x7fed088a1310: i32 = any_extend 0x7fed08909010 [ID=43]
  0x7fed08909010: i32 = IGILISD::IGILSETCC 0x7fed088a1b10, 0x7fed088a1510, 0x7fed0890d010 [ID=42]
    0x7fed088a1b10: i64 = bitcast 0x7fed08918b10 [ID=41]
      0x7fed08918b10: v2i32 = IGILISD::MOVSWZ 0x7fed0890cf10, 0x7fed088a1e10, 0x7fed08918510, 0x7fed08918510 [ID=38]
        0x7fed0890cf10: i32,ch = load 0x7fed09a09970, 0x7fed0890c410, 0x7fed08908810<LD4[%28]> [ORD=24] [ID=34]
          0x7fed0890c410: i64 = add 0x7fed0890c210, 0x7fed0890d710 [ORD=23] [ID=33]
            0x7fed0890c210: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890d810 [ORD=19] [ID=17]
              0x7fed0890d810: i64 = Register %vreg1 [ORD=19] [ID=1]
            0x7fed0890d710: i64 = shl 0x7fed08908a10, 0x7fed088a1710 [ORD=23] [ID=32]
              0x7fed08908a10: i64 = bitcast 0x7fed088a1f10 [ID=31]
                0x7fed088a1f10: v2i32 = IGILISD::MOVSWZ 0x7fed0890c110, 0x7fed0890c610, 0x7fed08918510, 0x7fed08918510 [ID=30]
                  0x7fed0890c110: i32,i32 = sdivrem 0x7fed08919010, 0x7fed0890ce10 [ID=27]

                  0x7fed0890c610: i32 = sra 0x7fed0890c110, 0x7fed08918910 [ID=29]

                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed088a1710: i64 = bitcast 0x7fed088a1810 [ID=26]
                0x7fed088a1810: v2i32 = IGILISD::MOVSWZ 0x7fed0890a410, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=23]
                  0x7fed0890a410: i32 = Constant<2> [ID=16]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
          0x7fed08908810: i64 = bitcast 0x7fed0890a210 [ID=25]
            0x7fed0890a210: v2i32 = IGILISD::MOVSWZ 0x7fed08918510, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=21]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
        0x7fed088a1e10: i32 = sra 0x7fed0890cf10, 0x7fed08918910 [ID=35]
          0x7fed0890cf10: i32,ch = load 0x7fed09a09970, 0x7fed0890c410, 0x7fed08908810<LD4[%28]> [ORD=24] [ID=34]
            0x7fed0890c410: i64 = add 0x7fed0890c210, 0x7fed0890d710 [ORD=23] [ID=33]
              0x7fed0890c210: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890d810 [ORD=19] [ID=17]
                0x7fed0890d810: i64 = Register %vreg1 [ORD=19] [ID=1]
              0x7fed0890d710: i64 = shl 0x7fed08908a10, 0x7fed088a1710 [ORD=23] [ID=32]
                0x7fed08908a10: i64 = bitcast 0x7fed088a1f10 [ID=31]
                  0x7fed088a1f10: v2i32 = IGILISD::MOVSWZ 0x7fed0890c110, 0x7fed0890c610, 0x7fed08918510, 0x7fed08918510 [ID=30]

                0x7fed088a1710: i64 = bitcast 0x7fed088a1810 [ID=26]
                  0x7fed088a1810: v2i32 = IGILISD::MOVSWZ 0x7fed0890a410, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=23]

            0x7fed08908810: i64 = bitcast 0x7fed0890a210 [ID=25]
              0x7fed0890a210: v2i32 = IGILISD::MOVSWZ 0x7fed08918510, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=21]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
          0x7fed08918910: i32 = Constant<31> [ID=15]
        0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
        0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
    0x7fed088a1510: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890cb10 [ORD=28] [ID=20]
      0x7fed0890cb10: i64 = Register %vreg60 [ORD=28] [ID=7]
In function: _ZN10tensorflow14Gat
kernel build error:
Something went wrong with clCreateKernel, OpenCL error code -45
__internal__ build log: 
<program source>:31:36: warning: unused variable 'pGlobalVars'
    const struct GlobalVars* const pGlobalVars = &globalVars;
                                   ^
Cannot select: 0x7fed088a1310: i32 = any_extend 0x7fed08909010 [ID=43]
  0x7fed08909010: i32 = IGILISD::IGILSETCC 0x7fed088a1b10, 0x7fed088a1510, 0x7fed0890d010 [ID=42]
    0x7fed088a1b10: i64 = bitcast 0x7fed08918b10 [ID=41]
      0x7fed08918b10: v2i32 = IGILISD::MOVSWZ 0x7fed0890cf10, 0x7fed088a1e10, 0x7fed08918510, 0x7fed08918510 [ID=38]
        0x7fed0890cf10: i32,ch = load 0x7fed09a09970, 0x7fed0890c410, 0x7fed08908810<LD4[%28]> [ORD=24] [ID=34]
          0x7fed0890c410: i64 = add 0x7fed0890c210, 0x7fed0890d710 [ORD=23] [ID=33]
            0x7fed0890c210: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890d810 [ORD=19] [ID=17]
              0x7fed0890d810: i64 = Register %vreg1 [ORD=19] [ID=1]
            0x7fed0890d710: i64 = shl 0x7fed08908a10, 0x7fed088a1710 [ORD=23] [ID=32]
              0x7fed08908a10: i64 = bitcast 0x7fed088a1f10 [ID=31]
                0x7fed088a1f10: v2i32 = IGILISD::MOVSWZ 0x7fed0890c110, 0x7fed0890c610, 0x7fed08918510, 0x7fed08918510 [ID=30]
                  0x7fed0890c110: i32,i32 = sdivrem 0x7fed08919010, 0x7fed0890ce10 [ID=27]

                  0x7fed0890c610: i32 = sra 0x7fed0890c110, 0x7fed08918910 [ID=29]

                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed088a1710: i64 = bitcast 0x7fed088a1810 [ID=26]
                0x7fed088a1810: v2i32 = IGILISD::MOVSWZ 0x7fed0890a410, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=23]
                  0x7fed0890a410: i32 = Constant<2> [ID=16]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
          0x7fed08908810: i64 = bitcast 0x7fed0890a210 [ID=25]
            0x7fed0890a210: v2i32 = IGILISD::MOVSWZ 0x7fed08918510, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=21]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
        0x7fed088a1e10: i32 = sra 0x7fed0890cf10, 0x7fed08918910 [ID=35]
          0x7fed0890cf10: i32,ch = load 0x7fed09a09970, 0x7fed0890c410, 0x7fed08908810<LD4[%28]> [ORD=24] [ID=34]
            0x7fed0890c410: i64 = add 0x7fed0890c210, 0x7fed0890d710 [ORD=23] [ID=33]
              0x7fed0890c210: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890d810 [ORD=19] [ID=17]
                0x7fed0890d810: i64 = Register %vreg1 [ORD=19] [ID=1]
              0x7fed0890d710: i64 = shl 0x7fed08908a10, 0x7fed088a1710 [ORD=23] [ID=32]
                0x7fed08908a10: i64 = bitcast 0x7fed088a1f10 [ID=31]
                  0x7fed088a1f10: v2i32 = IGILISD::MOVSWZ 0x7fed0890c110, 0x7fed0890c610, 0x7fed08918510, 0x7fed08918510 [ID=30]

                0x7fed088a1710: i64 = bitcast 0x7fed088a1810 [ID=26]
                  0x7fed088a1810: v2i32 = IGILISD::MOVSWZ 0x7fed0890a410, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=23]

            0x7fed08908810: i64 = bitcast 0x7fed0890a210 [ID=25]
              0x7fed0890a210: v2i32 = IGILISD::MOVSWZ 0x7fed08918510, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=21]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
          0x7fed08918910: i32 = Constant<31> [ID=15]
        0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
        0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
    0x7fed088a1510: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890cb10 [ORD=28] [ID=20]
      0x7fed0890cb10: i64 = Register %vreg60 [ORD=28] [ID=7]
In function: _ZN10tensorflow14Gat
storing failed kernel into: easycl-failedkernel.cl
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Something went wrong with clCreateKernel, OpenCL error code -45
__internal__ build log: 
<program source>:31:36: warning: unused variable 'pGlobalVars'
    const struct GlobalVars* const pGlobalVars = &globalVars;
                                   ^
Cannot select: 0x7fed088a1310: i32 = any_extend 0x7fed08909010 [ID=43]
  0x7fed08909010: i32 = IGILISD::IGILSETCC 0x7fed088a1b10, 0x7fed088a1510, 0x7fed0890d010 [ID=42]
    0x7fed088a1b10: i64 = bitcast 0x7fed08918b10 [ID=41]
      0x7fed08918b10: v2i32 = IGILISD::MOVSWZ 0x7fed0890cf10, 0x7fed088a1e10, 0x7fed08918510, 0x7fed08918510 [ID=38]
        0x7fed0890cf10: i32,ch = load 0x7fed09a09970, 0x7fed0890c410, 0x7fed08908810<LD4[%28]> [ORD=24] [ID=34]
          0x7fed0890c410: i64 = add 0x7fed0890c210, 0x7fed0890d710 [ORD=23] [ID=33]
            0x7fed0890c210: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890d810 [ORD=19] [ID=17]
              0x7fed0890d810: i64 = Register %vreg1 [ORD=19] [ID=1]
            0x7fed0890d710: i64 = shl 0x7fed08908a10, 0x7fed088a1710 [ORD=23] [ID=32]
              0x7fed08908a10: i64 = bitcast 0x7fed088a1f10 [ID=31]
                0x7fed088a1f10: v2i32 = IGILISD::MOVSWZ 0x7fed0890c110, 0x7fed0890c610, 0x7fed08918510, 0x7fed08918510 [ID=30]
                  0x7fed0890c110: i32,i32 = sdivrem 0x7fed08919010, 0x7fed0890ce10 [ID=27]

                  0x7fed0890c610: i32 = sra 0x7fed0890c110, 0x7fed08918910 [ID=29]

                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed088a1710: i64 = bitcast 0x7fed088a1810 [ID=26]
                0x7fed088a1810: v2i32 = IGILISD::MOVSWZ 0x7fed0890a410, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=23]
                  0x7fed0890a410: i32 = Constant<2> [ID=16]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                  0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
          0x7fed08908810: i64 = bitcast 0x7fed0890a210 [ID=25]
            0x7fed0890a210: v2i32 = IGILISD::MOVSWZ 0x7fed08918510, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=21]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
              0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
        0x7fed088a1e10: i32 = sra 0x7fed0890cf10, 0x7fed08918910 [ID=35]
          0x7fed0890cf10: i32,ch = load 0x7fed09a09970, 0x7fed0890c410, 0x7fed08908810<LD4[%28]> [ORD=24] [ID=34]
            0x7fed0890c410: i64 = add 0x7fed0890c210, 0x7fed0890d710 [ORD=23] [ID=33]
              0x7fed0890c210: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890d810 [ORD=19] [ID=17]
                0x7fed0890d810: i64 = Register %vreg1 [ORD=19] [ID=1]
              0x7fed0890d710: i64 = shl 0x7fed08908a10, 0x7fed088a1710 [ORD=23] [ID=32]
                0x7fed08908a10: i64 = bitcast 0x7fed088a1f10 [ID=31]
                  0x7fed088a1f10: v2i32 = IGILISD::MOVSWZ 0x7fed0890c110, 0x7fed0890c610, 0x7fed08918510, 0x7fed08918510 [ID=30]

                0x7fed088a1710: i64 = bitcast 0x7fed088a1810 [ID=26]
                  0x7fed088a1810: v2i32 = IGILISD::MOVSWZ 0x7fed0890a410, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=23]

            0x7fed08908810: i64 = bitcast 0x7fed0890a210 [ID=25]
              0x7fed0890a210: v2i32 = IGILISD::MOVSWZ 0x7fed08918510, 0x7fed08918510, 0x7fed08918510, 0x7fed08918510 [ID=21]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
                0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
          0x7fed08918910: i32 = Constant<31> [ID=15]
        0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
        0x7fed08918510: i32 = Constant<0> [ORD=31] [ID=9]
    0x7fed088a1510: i64,ch = CopyFromReg 0x7fed09a09970, 0x7fed0890cb10 [ORD=28] [ID=20]
      0x7fed0890cb10: i64 = Register %vreg60 [ORD=28] [ID=7]
In function: _ZN10tensorflow14Gatstoring failed kernel into: easycl-failedkernel.cl

Process finished with exit code 134 (interrupted by signal 6: SIGABRT)

hughperkins commented 7 years ago

wow, ok. Well... there are various possible ways forward, but I reckon the easiest might be to proceed by commenting out much of the kernel, then uncommenting bit by bit, until we find the line that crashes it.

The general principle is as follows: first comment out much of the kernel, as described above.

Then, gradually uncomment stuff, rebuilding each time, and try to find out which line causes the crash above.

It might take a few hours to work through this process. There's an example of my following this process for a bug in clBLAS, here: https://github.com/clMathLibraries/clBLAS/issues/108
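
The comment-out-then-uncomment search described above can be sketched roughly as follows. This is only an illustration of the search idea, not code from tf-coriander or EasyCL: `find_crashing_line`, `comment_out`, `fake_builds_ok` and the sample kernel lines are all hypothetical names made up for this sketch. In practice `builds_ok` would have to hand the candidate source to the real OpenCL driver (probably in a separate process, since the compiler may abort), and you would need to keep the kernel syntactically valid while uncommenting (matching braces, declarations before use), which this sketch ignores.

```python
from typing import Callable, List, Optional


def comment_out(line: str) -> str:
    """Disable one line of OpenCL C by turning it into a // comment."""
    return "// " + line


def find_crashing_line(
    body_lines: List[str],
    builds_ok: Callable[[str], bool],
) -> Optional[int]:
    """Uncomment the kernel body one line at a time, from the top.

    Returns the index of the first line whose uncommenting makes the
    build fail, or None if every prefix of the body builds.
    `builds_ok` is a caller-supplied (hypothetical) build check.
    """
    for enabled in range(1, len(body_lines) + 1):
        candidate = "\n".join(
            body_lines[:enabled]
            + [comment_out(line) for line in body_lines[enabled:]]
        )
        if not builds_ok(candidate):
            return enabled - 1
    return None


def fake_builds_ok(source: str) -> bool:
    """Stand-in for a real driver build: 'fails' on any active division."""
    active = [
        line for line in source.splitlines()
        if not line.lstrip().startswith("//")
    ]
    return not any(" / " in line for line in active)


# Made-up kernel body, only for demonstrating the search.
body = [
    "int a = in[0];",
    "int s = a >> 31;",
    "int q = a / n;",
    "out[0] = q + s;",
]

if __name__ == "__main__":
    print("first failing line index:", find_crashing_line(body, fake_builds_ok))
```

With a real build check this takes one build per line; for a long kernel, uncommenting larger chunks first and then narrowing down inside the failing chunk should cut the number of builds.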