elixir-nx / xla

Pre-compiled XLA extension
Apache License 2.0

Precompiled cuda111 doesn't work with latest cuda #23

Closed · masahiro-999 closed 1 year ago

masahiro-999 commented 1 year ago

I got this error message with XLA_TARGET=cuda111:

[error] cuLinkAddData fails. This is usually caused by stale driver version.

Do you have any plans to provide a precompiled XLA package for the latest CUDA, such as CUDA 11.8?

My test environment is:

OS: Ubuntu 20.04 on WSL2 on Windows11

I tried several CUDA versions to see which ones work (NG = fails, OK = works):

11.8.0-1 NG (latest version)
11.7.0-1 NG
11.6.0-1 NG
11.3.0-1 OK
11.2.0-1 OK
11.1.1-1 OK
11.1.0-1 OK

I also tried building XLA myself with your Docker image, but a compilation error occurred and the build failed. I'll open a separate issue for that.
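For anyone reproducing the version matrix above: a downgrade like this can be done by pinning the cuda meta-package with apt. This is only a sketch (the command is printed, not executed, since the real run needs sudo and the NVIDIA repository configured); the version string 11.1.1-1 comes from the apt listing below.

```shell
# Sketch: pin the cuda meta-package to a version the precompiled
# cuda111 archive is known to work with (11.1.1-1, per the tests above).
CUDA_VERSION="11.1.1-1"

# Printed rather than executed here; run the printed command on a machine
# with the NVIDIA apt repository set up.
echo "sudo apt-get install -y --allow-downgrades cuda=${CUDA_VERSION}"
```

Note that `--allow-downgrades` is required when moving from a newer installed version (e.g. 11.8.0-1) back to an older one.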

$ sudo apt list -a cuda
Listing... Done
cuda/unknown,now 11.8.0-1 amd64 [installed]
cuda/unknown 11.7.1-1 amd64
cuda/unknown 11.7.0-1 amd64
cuda/unknown 11.6.2-1 amd64
cuda/unknown 11.6.1-1 amd64
cuda/unknown 11.6.0-1 amd64
cuda/unknown 11.5.2-1 amd64
cuda/unknown 11.5.1-1 amd64
cuda/unknown 11.5.0-1 amd64
cuda/unknown 11.4.4-1 amd64
cuda/unknown 11.4.3-1 amd64
cuda/unknown 11.4.2-1 amd64
cuda/unknown 11.4.1-1 amd64
cuda/unknown 11.4.0-1 amd64
cuda/unknown 11.3.1-1 amd64
cuda/unknown 11.3.0-1 amd64
cuda/unknown 11.2.2-1 amd64
cuda/unknown 11.2.1-1 amd64
cuda/unknown 11.2.0-1 amd64
cuda/unknown 11.1.1-1 amd64
cuda/unknown 11.1.0-1 amd64
$ export XLA_TARGET=cuda111
$ export XLA_BUILD=false
$ iex
Erlang/OTP 25 [erts-13.1.1] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Interactive Elixir (1.14.0) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> Mix.install([
...(1)>   {:nx, "~> 0.3.0"},
...(1)>   {:exla, "~> 0.3.0"},
...(1)> ],
...(1)> config: [
...(1)>           nx: [
...(1)>             default_backend: EXLA.Backend,
...(1)>             default_defn_options: [compiler: EXLA],
...(1)>           ]
...(1)>         ]
...(1)> )
Resolving Hex dependencies...
Dependency resolution completed:
New:
  complex 0.4.2
  elixir_make 0.6.3
  exla 0.3.0
  nx 0.3.0
  xla 0.3.0
* Getting nx (Hex package)
* Getting exla (Hex package)
* Getting elixir_make (Hex package)
* Getting xla (Hex package)
* Getting complex (Hex package)
==> complex
Compiling 2 files (.ex)
Generated complex app
==> nx
Compiling 24 files (.ex)
Generated nx app
==> elixir_make
Compiling 1 file (.ex)
Generated elixir_make app
==> xla
Compiling 2 files (.ex)
Generated xla app

21:04:26.560 [info] Found a matching archive (xla_extension-x86_64-linux-cuda111.tar.gz), going to download it
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  204M  100  204M    0     0  7981k      0  0:00:26  0:00:26 --:--:-- 6268k

21:04:52.742 [info] Successfully downloaded the XLA archive
==> exla
Unpacking /home/masa/.cache/xla/0.3.0/cache/download/xla_extension-x86_64-linux-cuda111.tar.gz into /home/masa/.cache/mix/installs/elixir-1.14.0-erts-13.1.1/c709e1c9414e09e688e98d555e300276/deps/exla/cache
g++ -fPIC -I/home/masa/.asdf/installs/erlang/25.1.1/erts-13.1.1/include -Icache/xla_extension/include -O3 -Wall -Wno-sign-compare -Wno-unused-parameter -Wno-missing-field-initializers -Wno-comment -shared -std=c++14 c_src/exla/exla.cc c_src/exla/exla_nif_util.cc c_src/exla/exla_client.cc -o cache/libexla.so -Lcache/xla_extension/lib -lxla_extension -Wl,-rpath,'$ORIGIN/lib'
Compiling 21 files (.ex)
Generated exla app
:ok
iex(2)> Nx.add(Nx.tensor([1]), Nx.tensor([1]))

21:05:33.943 [info] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

21:05:33.943 [info] XLA service 0x7efc2c5ffda0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

21:05:33.943 [info]   StreamExecutor device (0): NVIDIA GeForce RTX 3060, Compute Capability 8.6

21:05:33.943 [info] Using BFC allocator.

21:05:33.943 [info] XLA backend allocating 10641368678 bytes on device 0 for BFCAllocator.

21:05:34.312 [info] Start cannot spawn child process: No such file or directory

21:05:34.312 [error] cuLinkAddData fails. This is usually caused by stale driver version.

21:05:34.312 [error] The CUDA linking API did not work. Please use XLA_FLAGS=--xla_gpu_force_compilation_parallelism=1 to bypass it, but expect to get longer compilation time due to the lack of multi-threading.
** (RuntimeError) no kernel image is available for execution on the device
in tensorflow/stream_executor/cuda/cuda_asm_compiler.cc(65): 'status'
    (exla 0.3.0) lib/exla/computation.ex:92: EXLA.Computation.unwrap!/1
    (exla 0.3.0) lib/exla/computation.ex:61: EXLA.Computation.compile/4
    (exla 0.3.0) lib/exla/defn.ex:396: anonymous fn/9 in EXLA.Defn.compile/7
    (exla 0.3.0) lib/exla/defn/locked_cache.ex:36: EXLA.Defn.LockedCache.run/2
    (stdlib 4.1.1) timer.erl:235: :timer.tc/1
    (exla 0.3.0) lib/exla/defn.ex:383: EXLA.Defn.compile/7
    (exla 0.3.0) lib/exla/defn.ex:251: EXLA.Defn.__compile__/4
    iex:2: (file)
iex(2)>
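As an aside, the error output above suggests a workaround flag. Assuming a bash-like shell, it can be exported before launching iex (expect longer compilation, since the flag disables multi-threaded GPU compilation):

```shell
# Workaround from the error message: bypass the CUDA linking API.
# Compilation gets slower because parallelism is forced to 1.
export XLA_FLAGS=--xla_gpu_force_compilation_parallelism=1
echo "$XLA_FLAGS"
```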
jonatanklosko commented 1 year ago

Hey @masahiro-999! CUDA minor versions should be compatible starting from 11.1, which is why we precompile using the lowest one. Perhaps this is an XLA issue; I will build a binary using a newer version of XLA and we can test whether that helps.

seanmor5 commented 1 year ago

@masahiro-999 Do you get similar errors when running TensorFlow? This seems like it may be a specific issue with CUDA on WSL2 and TensorFlow: https://forums.developer.nvidia.com/t/windows-11-wsl2-cuda-windows-11-home-22000-708-nvidia-studio-driver-512-96/217721/3

masahiro-999 commented 1 year ago

@seanmor5 I checked whether TensorFlow runs. It does, and I didn't get a similar error.

masa@DESKTOP-HP:~$ python3 -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
2022-10-23 08:08:04.798512: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2022-10-23 08:08:04.911764: E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2022-10-23 08:08:05.338207: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer.so.7'; dlerror: libnvinfer.so.7: cannot open shared object file: No such file or directory
2022-10-23 08:08:05.338276: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libnvinfer_plugin.so.7'; dlerror: libnvinfer_plugin.so.7: cannot open shared object file: No such file or directory
2022-10-23 08:08:05.338295: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2022-10-23 08:08:05.883052: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:966] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-10-23 08:08:05.909657: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:966] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2022-10-23 08:08:05.910199: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:966] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]

Command history

    4  sudo apt-key del 7fa2af80
    5  wget https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/cuda-wsl-ubuntu.pin
    6  sudo mv cuda-wsl-ubuntu.pin /etc/apt/preferences.d/cuda-repository-pin-600
    7  sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/3bf863cc.pub
    8  sudo add-apt-repository 'deb https://developer.download.nvidia.com/compute/cuda/repos/wsl-ubuntu/x86_64/ /'
    9  sudo apt-get update
   10  sudo apt-get -y install cuda
   11  nvidia-smi
   12  apt list cuda
   13  sudo apt install python3-pip
   14  pip3 install --upgrade pip
   15  pip install tensorflow
   16  sudo apt-get install zlib1g
   17  sudo dpkg -i ./cudnn-local-repo-ubuntu2004-8.4.1.50_1.0-1_amd64.deb
   18  pwd
   19  ls
   20  wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/libcudnn8-dev_8.6.0.163-1+cuda11.8_amd64.deb
   21  wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/libcudnn8_8.6.0.163-1+cuda11.8_amd64.deb
   22  ls -la
   23  sudo apt install ./libcudnn8-dev_8.6.0.163-1+cuda11.8_amd64.deb ./libcudnn8_8.6.0.163-1+cuda11.8_amd64.deb
   24  python3 -c “import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))”
   25  python3
   26  python3 -c 'import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))'
   27  python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices(‘GPU’))"
   28  python3 -c 'import tensorflow as tf; print(tf.config.list_physical_devices("GPU"))'
jonatanklosko commented 1 year ago

@masahiro-999 please try this one:

System.put_env(
  "XLA_ARCHIVE_URL",
  "https://static.jonatanklosko.com/builds/xla_extension-x86_64-linux-cuda111-tf2.10.tar.gz"
)

Mix.install(
  [
    {:nx,
     github: "elixir-nx/nx",
     sparse: "nx",
     override: true,
     ref: "0aa593dd599f30d83abd40a22d2413d033003b87"},
    {:exla,
     github: "elixir-nx/nx",
     sparse: "exla",
     override: true,
     ref: "0aa593dd599f30d83abd40a22d2413d033003b87"}
  ],
  config: [
    nx: [
      default_backend: EXLA.Backend,
      default_defn_options: [compiler: EXLA]
    ]
  ]
)
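Equivalently (assuming a bash-like shell), the archive URL can be exported before launching iex instead of calling System.put_env/2 inside the session, since both just set the same environment variable:

```shell
# Same effect as System.put_env("XLA_ARCHIVE_URL", ...) in the snippet above,
# but set from the shell before starting iex.
export XLA_ARCHIVE_URL="https://static.jonatanklosko.com/builds/xla_extension-x86_64-linux-cuda111-tf2.10.tar.gz"
echo "$XLA_ARCHIVE_URL"
```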
masahiro-999 commented 1 year ago

@jonatanklosko This is the result.

$ iex
Erlang/OTP 25 [erts-13.1.1] [source] [64-bit] [smp:16:16] [ds:16:16:10] [async-threads:1] [jit:ns]

Interactive Elixir (1.14.0) - press Ctrl+C to exit (type h() ENTER for help)
iex(1)> System.put_env(
...(1)>   "XLA_ARCHIVE_URL",
...(1)>   "https://static.jonatanklosko.com/builds/xla_extension-x86_64-linux-cuda111-tf2.10.tar.gz"
...(1)> )
:ok
iex(2)> Mix.install(
...(2)>   [
...(2)>     {:nx,
...(2)>      github: "elixir-nx/nx",
...(2)>      sparse: "nx",
...(2)>      override: true,
...(2)>      ref: "0aa593dd599f30d83abd40a22d2413d033003b87"},
...(2)>     {:exla,
...(2)>      github: "elixir-nx/nx",
...(2)>      sparse: "exla",
...(2)>      override: true,
...(2)>      ref: "0aa593dd599f30d83abd40a22d2413d033003b87"}
...(2)>   ],
...(2)>   config: [
...(2)>     nx: [
...(2)>       default_backend: EXLA.Backend,
...(2)>       default_defn_options: [compiler: EXLA]
...(2)>     ]
...(2)>   ]
...(2)> )
* Getting nx (https://github.com/elixir-nx/nx.git - 0aa593dd599f30d83abd40a22d2413d033003b87)
remote: Enumerating objects: 15371, done.
remote: Counting objects: 100% (274/274), done.
remote: Compressing objects: 100% (162/162), done.
remote: Total 15371 (delta 138), reused 204 (delta 110), pack-reused 15097
Receiving objects: 100% (15371/15371), 4.46 MiB | 6.28 MiB/s, done.
Resolving deltas: 100% (10231/10231), done.
==> nx
Could not find Hex, which is needed to build dependency :complex
Shall I install Hex? (if running non-interactively, use "mix local.hex --force") [Yn] y
* creating /home/masa/.asdf/installs/elixir/1.14.0-otp-25/.mix/archives/hex-1.0.1
==> mix_install
* Getting exla (https://github.com/elixir-nx/nx.git - 0aa593dd599f30d83abd40a22d2413d033003b87)
remote: Enumerating objects: 15371, done.
remote: Counting objects: 100% (274/274), done.
remote: Compressing objects: 100% (162/162), done.
remote: Total 15371 (delta 138), reused 204 (delta 110), pack-reused 15097
Receiving objects: 100% (15371/15371), 4.48 MiB | 5.48 MiB/s, done.
Resolving deltas: 100% (10242/10242), done.
Resolving Hex dependencies...
Dependency resolution completed:
New:
  complex 0.4.2
  elixir_make 0.6.3
  xla 0.3.0
* Getting xla (Hex package)
* Getting elixir_make (Hex package)
* Getting complex (Hex package)
==> complex
Compiling 2 files (.ex)
Generated complex app
==> nx
Compiling 27 files (.ex)
Generated nx app
==> elixir_make
Compiling 1 file (.ex)
Generated elixir_make app
==> xla
Compiling 2 files (.ex)
Generated xla app

09:52:25.180 [info] Downloading XLA archive from https://static.jonatanklosko.com/builds/xla_extension-x86_64-linux-cuda111-tf2.10.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  266M  100  266M    0     0  6593k      0  0:00:41  0:00:41 --:--:-- 13.3M

09:53:06.518 [info] Successfully downloaded the XLA archive
==> exla
Unpacking /home/masa/.cache/xla/0.3.0/cache/external/xla_extension-nj36knfmas3bodi6yz3n2gmzjy.tar.gz into /home/masa/.cache/mix/installs/elixir-1.14.0-erts-13.1.1/23eb0e12b34f6d8d441e34a5ee08c316/deps/exla/exla/cache
g++ -fPIC -I/home/masa/.asdf/installs/erlang/25.1.1/erts-13.1.1/include -Icache/xla_extension/include -O3 -Wall -Wno-sign-compare -Wno-unused-parameter -Wno-missing-field-initializers -Wno-comment -shared -std=c++17 c_src/exla/exla.cc c_src/exla/exla_nif_util.cc c_src/exla/exla_client.cc -o cache/libexla.so -Lcache/xla_extension/lib -lxla_extension -Wl,-rpath,'$ORIGIN/lib'
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:98:71: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
   98 |       "PrimitiveType)")]] explicit Comparison(Direction dir, Type type);
      |                                                                       ^
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:105:7: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  105 |       const {
      |       ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:216:43: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  216 |   DefaultComparisonType(PrimitiveType type);
      |                                           ^
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:227:29: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  227 |   [[deprecated]] const Type type_;
      |                             ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h: In member function ‘xla::Comparison::Type xla::Comparison::GetType() const’:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:106:12: warning: ‘xla::Comparison::type_’ is deprecated [-Wdeprecated-declarations]
  106 |     return type_;
      |            ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:227:29: note: declared here
  227 |   [[deprecated]] const Type type_;
      |                             ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:106:12: warning: ‘xla::Comparison::type_’ is deprecated [-Wdeprecated-declarations]
  106 |     return type_;
      |            ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:227:29: note: declared here
  227 |   [[deprecated]] const Type type_;
      |                             ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h: At global scope:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:238:57: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  238 | std::string ComparisonTypeToString(Comparison::Type type);
      |                                                         ^
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:244:22: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  244 | StatusOr<Comparison::Type> StringToComparisonType(absl::string_view comparison);
      |                      ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_computation.h:37,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_instruction.h:655:33: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  655 |       std::optional<Comparison::Type> type = std::nullopt);
      |                                 ^~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h: In member function ‘tensorflow::Status xla::PjRtBuffer::ToLiteralSync(xla::MutableLiteralBase*)’:
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:825:6: warning: ‘void xla::PjRtBuffer::ToLiteral(xla::MutableLiteralBase*, std::function<void(tensorflow::Status)>)’ is deprecated: Use ToLiteral(...).OnReady() instead [-Wdeprecated-declarations]
  825 |     });
      |      ^
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:813:8: note: declared here
  813 |   void ToLiteral(MutableLiteralBase* literal,
      |        ^~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_alias_analysis.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:30,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h: In member function ‘xla::BufferValue::Color xla::HloBuffer::color() const’:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h:100:52: warning: ‘xla::BufferValue::Color xla::BufferValue::color() const’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  100 |     BufferValue::Color result = values()[0]->color();
      |                                                    ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:117:9: note: declared here
  117 |   Color color() const {
      |         ^~~~~
In file included from cache/xla_extension/include/tensorflow/core/platform/logging.h:27,
                 from cache/xla_extension/include/tensorflow/core/platform/status.h:29,
                 from cache/xla_extension/include/tensorflow/core/lib/core/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/util.h:39,
                 from cache/xla_extension/include/tensorflow/compiler/xla/layout.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/shape.h:26,
                 from c_src/exla/exla_nif_util.h:14,
                 from c_src/exla/exla.cc:3:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h:102:38: warning: ‘xla::BufferValue::Color xla::BufferValue::color() const’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  102 |       DCHECK_EQ(result, value->color());
      |                                      ^
cache/xla_extension/include/tensorflow/core/platform/default/logging.h:448:57: note: in definition of macro ‘CHECK_OP_LOG’
  448 |           ::tensorflow::internal::GetReferenceableValue(val2), \
      |                                                         ^~~~
cache/xla_extension/include/tensorflow/core/platform/default/logging.h:455:30: note: in expansion of macro ‘CHECK_OP’
  455 | #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
      |                              ^~~~~~~~
cache/xla_extension/include/tensorflow/core/platform/default/logging.h:468:31: note: in expansion of macro ‘CHECK_EQ’
  468 | #define DCHECK_EQ(val1, val2) CHECK_EQ(val1, val2)
      |                               ^~~~~~~~
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h:102:7: note: in expansion of macro ‘DCHECK_EQ’
  102 |       DCHECK_EQ(result, value->color());
      |       ^~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:117:9: note: declared here
  117 |   Color color() const {
      |         ^~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h: In lambda function:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:602:65: warning: ‘void xla::BufferValue::set_color(xla::BufferValue::Color)’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  602 |               defining_position.shape().layout().memory_space()));
      |                                                                 ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:124:8: note: declared here
  124 |   void set_color(Color color) {
      |        ^~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:604:49: warning: ‘void xla::BufferValue::set_color(xla::BufferValue::Color)’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  604 |           value->set_color(BufferValue::Color(0));
      |                                                 ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:124:8: note: declared here
  124 |   void set_color(Color color) {
      |        ^~~~~~~~~
In file included from c_src/exla/exla.cc:4:
c_src/exla/exla_client.h: In destructor ‘exla::ExlaBuffer::~ExlaBuffer()’:
c_src/exla/exla_client.h:40:60: warning: ‘tensorflow::Status xla::PjRtBuffer::BlockHostUntilReady()’ is deprecated: Use GetReadyFuture().Await() instead [-Wdeprecated-declarations]
   40 |     if(erlang_managed_) (void)buffer_->BlockHostUntilReady();
      |                                                            ^
In file included from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:952:10: note: declared here
  952 |   Status BlockHostUntilReady() {
      |          ^~~~~~~~~~~~~~~~~~~
In file included from c_src/exla/exla.cc:8:
cache/xla_extension/include/tensorflow/compiler/xla/client/xla_builder.h: At global scope:
cache/xla_extension/include/tensorflow/compiler/xla/client/xla_builder.h:958:44: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  958 |                  std::optional<Comparison::Type> type = std::nullopt);
      |                                            ^~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from c_src/exla/exla.cc:8:
cache/xla_extension/include/tensorflow/compiler/xla/client/xla_builder.h:966:56: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  966 |                                   Comparison::Type type);
      |                                                        ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from c_src/exla/exla.cc:8:
cache/xla_extension/include/tensorflow/compiler/xla/client/xla_builder.h:1153:53: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
 1153 |                        Comparison::Type compare_type);
      |                                                     ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from c_src/exla/exla.cc:8:
cache/xla_extension/include/tensorflow/compiler/xla/client/xla_builder.h:1937:75: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
 1937 |               ComparisonDirection direction, Comparison::Type compare_type);
      |                                                                           ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from /usr/include/unistd.h:226,
                 from cache/xla_extension/include/absl/base/internal/thread_identity.h:27,
                 from cache/xla_extension/include/absl/synchronization/mutex.h:68,
                 from cache/xla_extension/include/absl/strings/internal/cordz_handle.h:24,
                 from cache/xla_extension/include/absl/strings/internal/cordz_info.h:28,
                 from cache/xla_extension/include/absl/strings/cord.h:91,
                 from cache/xla_extension/include/tensorflow/core/platform/default/cord.h:22,
                 from cache/xla_extension/include/tensorflow/core/platform/cord.h:25,
                 from cache/xla_extension/include/tensorflow/core/platform/tstring.h:24,
                 from cache/xla_extension/include/tensorflow/core/platform/types.h:23,
                 from cache/xla_extension/include/tensorflow/core/platform/logging.h:20,
                 from cache/xla_extension/include/tensorflow/core/platform/status.h:29,
                 from cache/xla_extension/include/tensorflow/core/lib/core/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/util.h:39,
                 from cache/xla_extension/include/tensorflow/compiler/xla/layout.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/shape.h:26,
                 from c_src/exla/exla_nif_util.h:14,
                 from c_src/exla/exla.cc:3:
cache/xla_extension/include/tfrt/host_context/async_value.h: In instantiation of ‘static void tfrt::internal::ConcreteAsyncValue<T>::VerifyOffsets() [with T = tfrt::DummyValueForErrorAsyncValue]’:
cache/xla_extension/include/tfrt/host_context/async_value.h:503:18:   required from ‘tfrt::internal::ConcreteAsyncValue<T>::ConcreteAsyncValue(tfrt::DecodedDiagnostic) [with T = tfrt::DummyValueForErrorAsyncValue]’
cache/xla_extension/include/tfrt/host_context/async_value.h:701:34:   required from here
cache/xla_extension/include/tfrt/host_context/async_value.h:676:28: warning: offsetof within non-standard-layout type ‘tfrt::internal::ConcreteAsyncValue<tfrt::DummyValueForErrorAsyncValue>’ is conditionally-supported [-Winvalid-offsetof]
  676 |     static_assert(offsetof(ConcreteAsyncValue<T>, data_store_.data_) ==
      |                            ^
cache/xla_extension/include/tfrt/host_context/async_value.h:680:28: warning: offsetof within non-standard-layout type ‘tfrt::internal::ConcreteAsyncValue<tfrt::DummyValueForErrorAsyncValue>’ is conditionally-supported [-Winvalid-offsetof]
  680 |     static_assert(offsetof(ConcreteAsyncValue<T>, data_store_.error_) ==
      |                            ^
In file included from cache/xla_extension/include/tfrt/host_context/async_value_ref.h:35,
                 from cache/xla_extension/include/tfrt/host_context/host_context.h:27,
                 from cache/xla_extension/include/tfrt/host_context/async_dispatch.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_future.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:38,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tfrt/host_context/async_value.h: In member function ‘const char* tfrt::AsyncValue::State::DebugString() const’:
cache/xla_extension/include/tfrt/host_context/async_value.h:255:5: warning: control reaches end of non-void function [-Wreturn-type]
  255 |     }
      |     ^
cache/xla_extension/include/tfrt/host_context/async_value.h: In member function ‘const tfrt::DecodedDiagnostic* tfrt::AsyncValue::GetErrorIfPresent() const’:
cache/xla_extension/include/tfrt/host_context/async_value.h:900:1: warning: control reaches end of non-void function [-Wreturn-type]
  900 | }
      | ^
In file included from cache/xla_extension/include/tfrt/host_context/async_value.h:40,
                 from cache/xla_extension/include/tfrt/host_context/async_value_ref.h:35,
                 from cache/xla_extension/include/tfrt/host_context/host_context.h:27,
                 from cache/xla_extension/include/tfrt/host_context/async_dispatch.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_future.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:38,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla.cc:4:
cache/xla_extension/include/tfrt/support/logging.h: In destructor ‘tfrt::internal::LogStreamFatal::~LogStreamFatal()’:
cache/xla_extension/include/tfrt/support/logging.h:60:16: warning: ‘noreturn’ function does return
   60 |   [[noreturn]] ~LogStreamFatal() = default;
      |                ^
c_src/exla/exla_nif_util.cc: In function ‘int exla::nif::get_primitive_type(ErlNifEnv*, ERL_NIF_TERM, xla::PrimitiveType*)’:
c_src/exla/exla_nif_util.cc:454:43: warning: ‘T tensorflow::StatusOr<T>::ConsumeValueOrDie() [with T = xla::PrimitiveType]’ is deprecated: Use `value()` instead. [-Wdeprecated-declarations]
  454 |     *type = type_status.ConsumeValueOrDie();
      |                                           ^
In file included from cache/xla_extension/include/tensorflow/stream_executor/lib/statusor.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/statusor.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/status_macros.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/util.h:40,
                 from cache/xla_extension/include/tensorflow/compiler/xla/layout.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/shape.h:26,
                 from c_src/exla/exla_nif_util.h:14,
                 from c_src/exla/exla_nif_util.cc:1:
cache/xla_extension/include/tensorflow/core/platform/statusor.h:236:47: note: declared here
  236 |   T ABSL_DEPRECATED("Use `value()` instead.") ConsumeValueOrDie() {
      |                                               ^~~~~~~~~~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:98:71: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
   98 |       "PrimitiveType)")]] explicit Comparison(Direction dir, Type type);
      |                                                                       ^
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:105:7: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  105 |       const {
      |       ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:216:43: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  216 |   DefaultComparisonType(PrimitiveType type);
      |                                           ^
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:227:29: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  227 |   [[deprecated]] const Type type_;
      |                             ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h: In member function ‘xla::Comparison::Type xla::Comparison::GetType() const’:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:106:12: warning: ‘xla::Comparison::type_’ is deprecated [-Wdeprecated-declarations]
  106 |     return type_;
      |            ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:227:29: note: declared here
  227 |   [[deprecated]] const Type type_;
      |                             ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:106:12: warning: ‘xla::Comparison::type_’ is deprecated [-Wdeprecated-declarations]
  106 |     return type_;
      |            ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:227:29: note: declared here
  227 |   [[deprecated]] const Type type_;
      |                             ^~~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h: At global scope:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:238:57: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  238 | std::string ComparisonTypeToString(Comparison::Type type);
      |                                                         ^
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:244:22: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  244 | StatusOr<Comparison::Type> StringToComparisonType(absl::string_view comparison);
      |                      ^~~~
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_computation.h:37,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_instruction.h:655:33: warning: ‘Type’ is deprecated: Use PrimitiveType and Order [-Wdeprecated-declarations]
  655 |       std::optional<Comparison::Type> type = std::nullopt);
      |                                 ^~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_opcode.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/dfs_hlo_visitor.h:26,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_cost_analysis.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:39,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/comparison_util.h:76:60: note: declared here
   76 |   enum class [[deprecated("Use PrimitiveType and Order")]] Type : uint8_t{
      |                                                            ^~~~
In file included from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h: In member function ‘tensorflow::Status xla::PjRtBuffer::ToLiteralSync(xla::MutableLiteralBase*)’:
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:825:6: warning: ‘void xla::PjRtBuffer::ToLiteral(xla::MutableLiteralBase*, std::function<void(tensorflow::Status)>)’ is deprecated: Use ToLiteral(...).OnReady() instead [-Wdeprecated-declarations]
  825 |     });
      |      ^
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:813:8: note: declared here
  813 |   void ToLiteral(MutableLiteralBase* literal,
      |        ^~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_alias_analysis.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:30,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h: In member function ‘xla::BufferValue::Color xla::HloBuffer::color() const’:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h:100:52: warning: ‘xla::BufferValue::Color xla::BufferValue::color() const’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  100 |     BufferValue::Color result = values()[0]->color();
      |                                                    ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:117:9: note: declared here
  117 |   Color color() const {
      |         ^~~~~
In file included from cache/xla_extension/include/tensorflow/core/platform/logging.h:27,
                 from cache/xla_extension/include/tensorflow/core/platform/status.h:29,
                 from cache/xla_extension/include/tensorflow/core/lib/core/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/util.h:39,
                 from cache/xla_extension/include/tensorflow/compiler/xla/layout.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/shape.h:26,
                 from c_src/exla/exla_nif_util.h:14,
                 from c_src/exla/exla_client.h:8,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h:102:38: warning: ‘xla::BufferValue::Color xla::BufferValue::color() const’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  102 |       DCHECK_EQ(result, value->color());
      |                                      ^
cache/xla_extension/include/tensorflow/core/platform/default/logging.h:448:57: note: in definition of macro ‘CHECK_OP_LOG’
  448 |           ::tensorflow::internal::GetReferenceableValue(val2), \
      |                                                         ^~~~
cache/xla_extension/include/tensorflow/core/platform/default/logging.h:455:30: note: in expansion of macro ‘CHECK_OP’
  455 | #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
      |                              ^~~~~~~~
cache/xla_extension/include/tensorflow/core/platform/default/logging.h:468:31: note: in expansion of macro ‘CHECK_EQ’
  468 | #define DCHECK_EQ(val1, val2) CHECK_EQ(val1, val2)
      |                               ^~~~~~~~
cache/xla_extension/include/tensorflow/compiler/xla/service/hlo_buffer.h:102:7: note: in expansion of macro ‘DCHECK_EQ’
  102 |       DCHECK_EQ(result, value->color());
      |       ^~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:117:9: note: declared here
  117 |   Color color() const {
      |         ^~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h: In lambda function:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:602:65: warning: ‘void xla::BufferValue::set_color(xla::BufferValue::Color)’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  602 |               defining_position.shape().layout().memory_space()));
      |                                                                 ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:124:8: note: declared here
  124 |   void set_color(Color color) {
      |        ^~~~~~~~~
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:604:49: warning: ‘void xla::BufferValue::set_color(xla::BufferValue::Color)’ is deprecated: Use Layout::memory_space instead. [-Wdeprecated-declarations]
  604 |           value->set_color(BufferValue::Color(0));
      |                                                 ^
In file included from cache/xla_extension/include/tensorflow/compiler/xla/service/heap_simulator.h:27,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_assignment.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/service/compiler.h:31,
                 from cache/xla_extension/include/tensorflow/compiler/xla/client/local_client.h:28,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.h:34,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/gpu_device.h:26,
                 from c_src/exla/exla_client.h:12,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/service/buffer_value.h:124:8: note: declared here
  124 |   void set_color(Color color) {
      |        ^~~~~~~~~
In file included from c_src/exla/exla_client.cc:1:
c_src/exla/exla_client.h: In destructor ‘exla::ExlaBuffer::~ExlaBuffer()’:
c_src/exla/exla_client.h:40:60: warning: ‘tensorflow::Status xla::PjRtBuffer::BlockHostUntilReady()’ is deprecated: Use GetReadyFuture().Await() instead [-Wdeprecated-declarations]
   40 |     if(erlang_managed_) (void)buffer_->BlockHostUntilReady();
      |                                                            ^
In file included from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:952:10: note: declared here
  952 |   Status BlockHostUntilReady() {
      |          ^~~~~~~~~~~~~~~~~~~
In file included from /usr/include/unistd.h:226,
                 from cache/xla_extension/include/absl/base/internal/thread_identity.h:27,
                 from cache/xla_extension/include/absl/synchronization/mutex.h:68,
                 from cache/xla_extension/include/absl/strings/internal/cordz_handle.h:24,
                 from cache/xla_extension/include/absl/strings/internal/cordz_info.h:28,
                 from cache/xla_extension/include/absl/strings/cord.h:91,
                 from cache/xla_extension/include/tensorflow/core/platform/default/cord.h:22,
                 from cache/xla_extension/include/tensorflow/core/platform/cord.h:25,
                 from cache/xla_extension/include/tensorflow/core/platform/tstring.h:24,
                 from cache/xla_extension/include/tensorflow/core/platform/types.h:23,
                 from cache/xla_extension/include/tensorflow/core/platform/logging.h:20,
                 from cache/xla_extension/include/tensorflow/core/platform/status.h:29,
                 from cache/xla_extension/include/tensorflow/core/lib/core/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/status.h:19,
                 from cache/xla_extension/include/tensorflow/compiler/xla/util.h:39,
                 from cache/xla_extension/include/tensorflow/compiler/xla/layout.h:25,
                 from cache/xla_extension/include/tensorflow/compiler/xla/shape.h:26,
                 from c_src/exla/exla_nif_util.h:14,
                 from c_src/exla/exla_client.h:8,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tfrt/host_context/async_value.h: In instantiation of ‘static void tfrt::internal::ConcreteAsyncValue<T>::VerifyOffsets() [with T = tfrt::DummyValueForErrorAsyncValue]’:
cache/xla_extension/include/tfrt/host_context/async_value.h:503:18:   required from ‘tfrt::internal::ConcreteAsyncValue<T>::ConcreteAsyncValue(tfrt::DecodedDiagnostic) [with T = tfrt::DummyValueForErrorAsyncValue]’
cache/xla_extension/include/tfrt/host_context/async_value.h:701:34:   required from here
cache/xla_extension/include/tfrt/host_context/async_value.h:676:28: warning: offsetof within non-standard-layout type ‘tfrt::internal::ConcreteAsyncValue<tfrt::DummyValueForErrorAsyncValue>’ is conditionally-supported [-Winvalid-offsetof]
  676 |     static_assert(offsetof(ConcreteAsyncValue<T>, data_store_.data_) ==
      |                            ^
cache/xla_extension/include/tfrt/host_context/async_value.h:680:28: warning: offsetof within non-standard-layout type ‘tfrt::internal::ConcreteAsyncValue<tfrt::DummyValueForErrorAsyncValue>’ is conditionally-supported [-Winvalid-offsetof]
  680 |     static_assert(offsetof(ConcreteAsyncValue<T>, data_store_.error_) ==
      |                            ^
In file included from cache/xla_extension/include/tfrt/host_context/async_value_ref.h:35,
                 from cache/xla_extension/include/tfrt/host_context/host_context.h:27,
                 from cache/xla_extension/include/tfrt/host_context/async_dispatch.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_future.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:38,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tfrt/host_context/async_value.h: In member function ‘const char* tfrt::AsyncValue::State::DebugString() const’:
cache/xla_extension/include/tfrt/host_context/async_value.h:255:5: warning: control reaches end of non-void function [-Wreturn-type]
  255 |     }
      |     ^
cache/xla_extension/include/tfrt/host_context/async_value.h: In member function ‘const tfrt::DecodedDiagnostic* tfrt::AsyncValue::GetErrorIfPresent() const’:
cache/xla_extension/include/tfrt/host_context/async_value.h:900:1: warning: control reaches end of non-void function [-Wreturn-type]
  900 | }
      | ^
In file included from cache/xla_extension/include/tfrt/host_context/async_value.h:40,
                 from cache/xla_extension/include/tfrt/host_context/async_value_ref.h:35,
                 from cache/xla_extension/include/tfrt/host_context/host_context.h:27,
                 from cache/xla_extension/include/tfrt/host_context/async_dispatch.h:23,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_future.h:24,
                 from cache/xla_extension/include/tensorflow/compiler/xla/pjrt/pjrt_client.h:38,
                 from c_src/exla/exla_client.h:11,
                 from c_src/exla/exla_client.cc:1:
cache/xla_extension/include/tfrt/support/logging.h: In destructor ‘tfrt::internal::LogStreamFatal::~LogStreamFatal()’:
cache/xla_extension/include/tfrt/support/logging.h:60:16: warning: ‘noreturn’ function does return
   60 |   [[noreturn]] ~LogStreamFatal() = default;
      |                ^
Compiling 21 files (.ex)
Generated exla app
:ok
iex(3)> Nx.add(Nx.tensor([1]), Nx.tensor([1]))

09:53:53.863 [info] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.

09:53:53.863 [info] XLA service 0x7fe954033dd0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:

09:53:53.863 [info]   StreamExecutor device (0): NVIDIA GeForce RTX 3060, Compute Capability 8.6

09:53:53.863 [info] Using BFC allocator.

09:53:53.863 [info] XLA backend allocating 10641368678 bytes on device 0 for BFCAllocator.
Segmentation fault
jonatanklosko commented 1 year ago

FTR I don't have CUDA installed, but I could still reproduce the error by using the cuda111 binary and the :host client. I also built a binary using a CUDA 11.8 Docker image and still got the segfault. In both cases I built against XLA from tensorflow 2.10. The non-cuda binary works fine, so there may be a bug with the conditional compilation in XLA (or maybe we need some adjustments on our side).
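As a practical stopgap while no newer precompiled binary exists, the version table from the report can be captured in a small shell helper. This is only a sketch based on the versions tested in this issue (11.1.x–11.3.x worked with the `cuda111` archive, 11.6+ segfaulted); the function name and the `none` branch are illustrative and not part of this repository.

```shell
# Sketch: pick an XLA_TARGET from the installed CUDA toolkit version,
# using only the results reported in this issue. Illustrative helper,
# not something this repo ships.
xla_target_for_cuda() {
  case "$1" in
    11.1.*|11.2.*|11.3.*) echo cuda111 ;;  # reported working above
    11.*)                 echo none ;;     # reported segfaulting; build from source
    *)                    echo cpu ;;      # fall back to the CPU-only binary
  esac
}

xla_target_for_cuda 11.2.0
```

For CUDA 11.6 and newer, the reporter had to attempt a source build instead (which hit a separate compilation error, tracked in another issue).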

seanmor5 commented 1 year ago

I think this issue is resolved in the newest release. Can you confirm, @masahiro-999?

josevalim commented 1 year ago

Closing this for now. We are shipping more precompiled CUDA versions now.