tensorflow / tensorflow

An Open Source Machine Learning Framework for Everyone
https://tensorflow.org
Apache License 2.0

Crash in `tf.raw_ops.SparseCountSparseOutput` #69455

Open x0w3n opened 3 months ago

x0w3n commented 3 months ago

Issue type

Bug

Have you reproduced the bug with TensorFlow Nightly?

Yes

Source

source

TensorFlow version

TensorFlow Nightly

Custom code

Yes

OS platform and distribution

Linux Ubuntu 22.04.3 LTS (x86_64)

Mobile device

No response

Python version

3.9.13

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

On the specific input below, calling `tf.raw_ops.SparseCountSparseOutput` crashes the session with the message "The session crashed because it took up all available RAM."

Standalone code to reproduce the issue

import tensorflow as tf

binary_output = True
indices = tf.constant(0, shape=[3456,2], dtype=tf.int64)
values = tf.constant(536870912, shape=[3456], dtype=tf.int32)
dense_shape = tf.constant([125099989676412,125099989676412], shape=[2], dtype=tf.int64)
weights = tf.constant(51, shape=[3456], dtype=tf.int32)

tf.raw_ops.SparseCountSparseOutput(
    indices=indices,
    values=values,
    dense_shape=dense_shape,
    weights=weights,
    binary_output=binary_output,
    minlength=0,
    maxlength=0,
    name=None
)
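Because the reproduction above takes down the whole session, it can help to run it in a separate, memory-capped process so the notebook kernel survives. The following is only a sketch under stated assumptions: `run_with_memory_cap` is a hypothetical helper (not part of TensorFlow), it relies on the POSIX-only standard-library `resource` module, and the cap must be set comfortably above whatever virtual memory importing TensorFlow itself needs on your machine. How the child fails once the cap is hit (Python exception, abort, or signal) is not something this sketch predicts.

```python
import resource
import subprocess
import sys


def run_with_memory_cap(source, cap_bytes, timeout_s=120.0):
    """Run `source` in a fresh Python interpreter with a capped address space.

    Hypothetical helper for isolating memory-hungry reproductions.
    Returns the subprocess.CompletedProcess of the child.
    """

    def _apply_cap():
        # Executed in the child just before the interpreter starts.
        # Only the soft limit is lowered; the hard limit is left untouched,
        # so this works even when the hard limit is already restricted.
        _, hard = resource.getrlimit(resource.RLIMIT_AS)
        soft = cap_bytes
        if hard != resource.RLIM_INFINITY and hard < cap_bytes:
            soft = hard
        resource.setrlimit(resource.RLIMIT_AS, (soft, hard))

    return subprocess.run(
        [sys.executable, "-c", source],
        preexec_fn=_apply_cap,
        capture_output=True,
        text=True,
        timeout=timeout_s,
    )
```

To use it, pass the standalone reproduction script as the `source` string together with a cap such as `16 * 1024**3` (an arbitrary example value), then inspect `returncode` and `stderr` on the result instead of losing the session.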

Relevant log output

Jun 10, 2024, 11:31:57 AM   WARNING WARNING:root:kernel 7c5d2f95-9ab7-4b7c-99c0-d8949477c9f1 restarted
Jun 10, 2024, 11:31:57 AM   INFO    KernelRestarter: restarting kernel (1/5), keep random ports
Jun 10, 2024, 11:31:29 AM   WARNING 2024-06-10 03:31:29.145870: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:38] TF-TRT Warning: Could not find TensorRT
Jun 10, 2024, 11:31:26 AM   WARNING To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Jun 10, 2024, 11:31:26 AM   WARNING 2024-06-10 03:31:26.481147: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
Jun 10, 2024, 11:31:26 AM   WARNING 2024-06-10 03:31:26.469832: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
Jun 10, 2024, 11:31:26 AM   WARNING 2024-06-10 03:31:26.467591: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
Jun 10, 2024, 11:31:26 AM   WARNING 2024-06-10 03:31:26.467524: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
Jun 10, 2024, 11:31:18 AM   INFO    Kernel started: 7c5d2f95-9ab7-4b7c-99c0-d8949477c9f1, name: python3
Jun 10, 2024, 11:30:19 AM   INFO    Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).

tilakrayal commented 3 months ago

I was able to reproduce the issue on TensorFlow v2.15, v2.16, and tf-nightly. Please see the gist and the screenshot for reference.

image