hughperkins / tf-coriander

OpenCL 1.2 implementation for Tensorflow
Apache License 2.0

tf-coriander/tensorflow/examples/tutorials/word2vec fail #70

Closed by elife33 6 years ago

elife33 commented 6 years ago

I installed the pre-built wheels on my MacBook Pro (macOS 10.12.6). When I ran python word2vec_basic.py in tf-coriander/tensorflow/examples/tutorials/word2vec, I got the error below:

(tensorflow.cl) QiangdeMacBook-Pro:word2vec elife$ python word2vec_basic.py
Found and verified text8.zip
Data size 17005207
Most common words (+UNK) [['UNK', 418391], ('the', 1061396), ('of', 593677), ('and', 416629), ('one', 411764)]
Sample data [5234, 3081, 12, 6, 195, 2, 3134, 46, 59, 156] ['anarchism', 'originated', 'as', 'a', 'term', 'of', 'abuse', 'first', 'used', 'against']
3081 originated -> 5234 anarchism
3081 originated -> 12 as
12 as -> 6 a
12 as -> 3081 originated
6 a -> 195 term
6 a -> 12 as
195 term -> 6 a
195 term -> 2 of
OpenCL platform: Apple
OpenCL device: Iris Pro
I tensorflow/core/common_runtime/gpu/gpu_device.cc:989] Found device 0 with properties:
name: Iris Pro
major: -1 minor: -1 memoryClockRate (GHz) 1200
pciBusID 0000.0000
Total memory: 1.50GiB
Free memory: 384.00MiB
W tensorflow/stream_executor/cl/cl_driver.cc:587] creating context when one is currently active; existing: `?̚?
OpenCL platform: Apple
OpenCL device: AMD Radeon R9 M370X Compute Engine
I tensorflow/core/common_runtime/gpu/gpu_device.cc:989] Found device 1 with properties:
name: AMD Radeon R9 M370X Compute Engine
major: -1 minor: -1 memoryClockRate (GHz) 800
pciBusID 0000.0000
Total memory: 1.50GiB
Free memory: 384.00MiB
I tensorflow/core/common_runtime/gpu/gpu_device.cc:877] cannot enable peer access from device ordinal 0 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:877] cannot enable peer access from device ordinal 0 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:877] cannot enable peer access from device ordinal 1 to device ordinal 0
I tensorflow/core/common_runtime/gpu/gpu_device.cc:877] cannot enable peer access from device ordinal 1 to device ordinal 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1011] DMA: 0 1
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] 0: N N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1021] 1: N N
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1083] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Iris Pro, pci bus id: 0000.0000)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:1083] Creating TensorFlow device (/gpu:1) -> (device: 1, name: AMD Radeon R9 M370X Compute Engine, pci bus id: 0000.0000)
cl_driver DeviceAllocate 192937984
cl_driver DeviceAllocate 192937984
Initialized
Average loss at step 0 : 249.100921631
internal build log:

:31:36: warning: unused variable 'pGlobalVars'
const struct GlobalVars* const pGlobalVars = &globalVars;
^
Cannot select: 0x7f8f3502fd10: i32 = any_extend 0x7f8f35031e10 [ID=45]
0x7f8f35031e10: i32 = IGILISD::IGILSETCC 0x7f8f35031c10, 0x7f8f3502ff10, 0x7f8f35035e10 [ID=44]
0x7f8f35031c10: i64 = bitcast 0x7f8f35031610 [ID=43]
0x7f8f35031610: v2i32 = IGILISD::MOVSWZ 0x7f8f35035d10, 0x7f8f35030810, 0x7f8f35044b10, 0x7f8f35044b10 [ID=40]
0x7f8f35035d10: i32,ch = load 0x7f8f34f31570, 0x7f8f35035210, 0x7f8f35036010 [ORD=24] [ID=36]
0x7f8f35035210: i64 = add 0x7f8f35035010, 0x7f8f35045110 [ORD=23] [ID=35]
0x7f8f35035010: i64,ch = CopyFromReg 0x7f8f34f31570, 0x7f8f35036610 [ORD=19] [ID=18]
0x7f8f35036610: i64 = Register %vreg1 [ORD=19] [ID=1]
0x7f8f35045110: i64 = bitcast 0x7f8f35030910 [ID=34]
0x7f8f35030910: v2i32 = IGILISD::MOVSWZ 0x7f8f35030210, 0x7f8f35030610, 0x7f8f35044b10, 0x7f8f35044b10 [ID=33]
0x7f8f35030210: i32 = shl 0x7f8f35034f10, 0x7f8f35033210 [ID=29]
0x7f8f35034f10: i32,i32 = sdivrem 0x7f8f35045610, 0x7f8f35035c10 [ID=26]
0x7f8f35033210: i32 = Constant<2> [ID=16]
0x7f8f35030610: i32 = or 0x7f8f35031810, 0x7f8f35030510 [ID=32]
0x7f8f35031810: i32 = shl 0x7f8f35035410, 0x7f8f35033210 [ID=31]
0x7f8f35030510: i32 = srl 0x7f8f35034f10, 0x7f8f35030110 [ID=30]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35036010: i64 = bitcast 0x7f8f35036510 [ID=25]
0x7f8f35036510: v2i32 = IGILISD::MOVSWZ 0x7f8f35044b10, 0x7f8f35044b10, 0x7f8f35044b10, 0x7f8f35044b10 [ID=22]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35030810: i32 = sra 0x7f8f35035d10, 0x7f8f35044f10 [ID=37]
0x7f8f35035d10: i32,ch = load 0x7f8f34f31570, 0x7f8f35035210, 0x7f8f35036010 [ORD=24] [ID=36]
0x7f8f35035210: i64 = add 0x7f8f35035010, 0x7f8f35045110 [ORD=23] [ID=35]
0x7f8f35035010: i64,ch = CopyFromReg 0x7f8f34f31570, 0x7f8f35036610 [ORD=19] [ID=18]
0x7f8f35036610: i64 = Register %vreg1 [ORD=19] [ID=1]
0x7f8f35045110: i64 = bitcast 0x7f8f35030910 [ID=34]
0x7f8f35030910: v2i32 = IGILISD::MOVSWZ 0x7f8f35030210, 0x7f8f35030610, 0x7f8f35044b10, 0x7f8f35044b10 [ID=33]
0x7f8f35030210: i32 = shl 0x7f8f35034f10, 0x7f8f35033210 [ID=29]
0x7f8f35030610: i32 = or 0x7f8f35031810, 0x7f8f35030510 [ID=32]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35036010: i64 = bitcast 0x7f8f35036510 [ID=25]
0x7f8f35036510: v2i32 = IGILISD::MOVSWZ 0x7f8f35044b10, 0x7f8f35044b10, 0x7f8f35044b10, 0x7f8f35044b10 [ID=22]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044f10: i32 = Constant<31> [ID=15]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f35044b10: i32 = Constant<0> [ORD=31] [ID=9]
0x7f8f3502ff10: i64,ch = CopyFromReg 0x7f8f34f31570, 0x7f8f35035910 [ORD=28] [ID=21]
0x7f8f35035910: i64 = Register %vreg60 [ORD=28] [ID=7]
In function: _ZN10tensorflow14Gat
kernel build error: Something went wrong with clCreateKernel, OpenCL error code -45
__internal__ build log:
[build log repeated verbatim, identical to the one above]
storing failed kernel into: easycl-failedkernel.cl
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: Something went wrong with clCreateKernel, OpenCL error code -45
__internal__ build log:
[build log repeated verbatim, identical to the one above]
storing failed kernel into: easycl-failedkernel.cl
Abort trap: 6
(tensorflow.cl) QiangdeMacBook-Pro:word2vec elife$

elife33 commented 6 years ago

After I set export CL_GPUOFFSET=1, it works :-)
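
For anyone who prefers to apply the same workaround from a launcher script instead of the shell, here is a minimal sketch. It assumes CL_GPUOFFSET is read from the process environment when the OpenCL runtime starts, and that a value of 1 shifts device selection past the first GPU listed in the log (the Iris Pro); the thread only confirms that setting it to 1 made the script work. The helper names env_with_gpu_offset and run_with_gpu_offset are hypothetical, made up for this illustration.

```python
import os
import subprocess
import sys


def env_with_gpu_offset(offset, base_env=None):
    """Return a copy of the environment with CL_GPUOFFSET set.

    Hypothetical helper for illustration; only the variable name
    CL_GPUOFFSET comes from the thread.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CL_GPUOFFSET"] = str(offset)  # environment values must be strings
    return env


def run_with_gpu_offset(script, offset):
    """Run a Python script in a child process with CL_GPUOFFSET set.

    Roughly what `export CL_GPUOFFSET=<offset>` followed by
    `python <script>` does in the shell. Returns the child's exit code.
    """
    return subprocess.call([sys.executable, script],
                           env=env_with_gpu_offset(offset))


# Example (not executed here):
#     run_with_gpu_offset("word2vec_basic.py", 1)
```

Running the script in a child process keeps the variable out of the parent shell, so other programs started from that shell are unaffected.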

hughperkins commented 6 years ago

cool! :)