lukeiwanski / tensorflow

OpenCL support for TensorFlow via SYCL
Apache License 2.0

FAIL: //tensorflow/core:common_runtime_direct_session_with_tracking_alloc_test #75

Closed by lukeiwanski 7 years ago

lukeiwanski commented 7 years ago

System Info

  Name:                      Hawaii
  Vendor:                    Advanced Micro Devices, Inc.
  Device OpenCL C version:           OpenCL C 2.0 
  Driver version:                1912.5 (VM)
  Profile:                   FULL_PROFILE
  Version:                   OpenCL 2.0 AMD-APP (1912.5)
  Extensions:                    cl_khr_fp64 cl_amd_fp64 cl_khr_global_int32_base_atomics cl_khr_global_int32_extended_atomics cl_khr_local_int32_base_atomics cl_khr_local_int32_extended_atomics cl_khr_int64_base_atomics cl_khr_int64_extended_atomics cl_khr_3d_image_writes cl_khr_byte_addressable_store cl_khr_gl_sharing cl_khr_gl_depth_images cl_ext_atomic_counters_32 cl_amd_device_attribute_query cl_amd_vec3 cl_amd_printf cl_amd_media_ops cl_amd_media_ops2 cl_amd_popcnt cl_khr_image2d_from_buffer cl_khr_spir cl_khr_subgroups cl_khr_gl_event cl_khr_depth_images cl_khr_mipmap_image cl_khr_mipmap_image_writes

ComputeCpp 0.2.0

To reproduce

bazel test --config=sycl --local_test_jobs=4 -k --test_lang_filters=cc,py --action_env=LD_PRELOAD=/usr/lib/libOpenCL.so.1 --test_timeout 300,750,1200,3600 //tensorflow/core:common_runtime_direct_session_with_tracking_alloc_test

Error

[ RUN      ] DirectSessionWithTrackingAllocTest.CostModelWarmup
2017-06-12 15:57:58.363361: F ./tensorflow/core/common_runtime/direct_session_with_tracking_alloc_test.cc:141] Check failed: cost_models.size() == 1 (2 vs. 1)
external/bazel_tools/tools/test/test-setup.sh: line 159: 23310 Aborted                 (core dumped) "${TEST_PATH}" "$@"

lukeiwanski commented 7 years ago

That seems to be failing for CUDA as well:

2017-06-15 15:17:13.127249: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-15 15:17:13.162545: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-15 15:17:13.162550: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-15 15:17:13.300625: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:893] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2017-06-15 15:17:13.300933: I tensorflow/core/common_runtime/gpu/gpu_device.cc:938] Found device 0 with properties: 
name: GeForce GTX 980
major: 5 minor: 2 memoryClockRate (GHz) 1.2785
pciBusID 0000:01:00.0
Total memory: 3.94GiB
Free memory: 3.87GiB
2017-06-15 15:17:13.300946: I tensorflow/core/common_runtime/gpu/gpu_device.cc:959] DMA: 0 
2017-06-15 15:17:13.300950: I tensorflow/core/common_runtime/gpu/gpu_device.cc:969] 0:   Y 
2017-06-15 15:17:13.300960: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1028] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980, pci bus id: 0000:01:00.0)
[       OK ] DirectSessionWithTrackingAllocTest.CostModelTest (204 ms)
[ RUN      ] DirectSessionWithTrackingAllocTest.CostModelWarmup
2017-06-15 15:17:13.320060: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1028] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 980, pci bus id: 0000:01:00.0)
2017-06-15 15:17:13.324450: F tensorflow/core/common_runtime/direct_session_with_tracking_alloc_test.cc:141] Check failed: cost_models.size() == 1 (2 vs. 1)
external/bazel_tools/tools/test/test-setup.sh: line 159:   333 Aborted                 (core dumped) "${TEST_PATH}" "$@"

Reducing priority.

jwlawson commented 7 years ago

Fixed by #114 and #115

jwlawson commented 7 years ago

This test is tagged no_gpu, so we shouldn't be running it on SYCL anyway. The test behaviour that was changed in #115 may actually be intended, which would mean the test is expected to fail when run with any device other than the CPU.
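
For reference, a minimal sketch of how tests tagged no_gpu could be skipped in a SYCL run. It relies on Bazel's standard `--test_tag_filters` option, where a leading `-` excludes a tag. The wrapper name `sycl_test` is hypothetical, and the sketch assumes `no_gpu` appears in the target's `tags` attribute and that the `sycl` config from the report above is available.

```shell
# Hypothetical wrapper (not part of the repository): runs the given test
# targets under the SYCL config while excluding anything tagged no_gpu.
sycl_test() {
  # --test_tag_filters=-no_gpu drops test targets whose tags include no_gpu.
  bazel test --config=sycl --test_tag_filters=-no_gpu "$@"
}

# Example call (commented out so that sourcing this file runs nothing):
# sycl_test //tensorflow/core/...
```

With this filter in place, the target from this report should not be selected, so the `cost_models.size() == 1` check would not be exercised on a non-CPU device.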

lukeiwanski commented 7 years ago

Closing