tf.nn.ctc_loss behaviour changes depending on whether dense or sparse labels are provided

f90 commented 2 years ago

tf.nn.ctc_loss changes its behaviour unexpectedly depending on whether the labels provided are sparse or dense.

In particular, it accepts empty target label sequences in the dense version, and maximises (as one would expect) the log probability of the blank token at each time frame in the logits prediction matrix. It does not output any warning or error.
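
To make the expected behaviour for an empty target concrete: the only valid CTC alignment is a blank at every frame, so the loss reduces to the negative sum of the per-frame log-probabilities of the blank class. The sketch below is a plain-Python illustration of that identity, not TensorFlow's implementation; the helper names are made up for this example.

```python
import math


def log_softmax(row):
    """Numerically stable log-softmax over one frame of logits."""
    m = max(row)
    log_z = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_z for x in row]


def empty_target_ctc_loss(logits, blank_index=0):
    """CTC loss for an empty target: -sum over frames of log p(blank)."""
    return -sum(log_softmax(frame)[blank_index] for frame in logits)


# Three frames, four classes, uniform logits: each frame contributes log(4).
uniform_logits = [[0.0] * 4 for _ in range(3)]
uniform_loss = empty_target_ctc_loss(uniform_logits)

# Raising the blank logit makes the all-blank path more likely, so the loss drops.
blank_heavy_logits = [[2.0, 0.0, 0.0, 0.0] for _ in range(3)]
blank_heavy_loss = empty_target_ctc_loss(blank_heavy_logits)
```

Minimising this quantity is the same as maximising the blank log-probability at every frame, which is the behaviour described above for the dense version.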

However, if you convert the labels to sparse using tf.sparse.from_dense, the same inputs (with an empty target label sequence) produce the error

tensorflow.python.framework.errors_impl.InvalidArgumentError: Labels length is zero in batch 0 [Op:CTCLoss]

This inconsistency in behaviour should be fixed.

Option 1: Ideally, we would implement support for empty labels in the case of sparse tensors.

Option 2: Add a warning to the CTC loss doc that empty labels are not supported in the case of sparse tensors.
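
Until one of these options is implemented, a possible user-side workaround is to convert the sparse labels back to dense and pass explicit label lengths, since the dense path accepts empty targets. This is only a sketch: the helper name is hypothetical, it assumes each row's labels are stored contiguously from column 0 (so the number of stored entries equals the label length), and it has not been checked for numerical agreement with the sparse kernel.

```python
import tensorflow as tf


def ctc_loss_sparse_via_dense(sparse_labels, logits, logit_length, **kwargs):
    """Hypothetical helper: route sparse labels through the dense code path."""
    # Count the stored entries in each batch row; assumes labels are
    # left-aligned so that this count equals the label length.
    row_ids = sparse_labels.indices[:, 0]
    label_length = tf.math.unsorted_segment_sum(
        tf.ones_like(row_ids),
        row_ids,
        num_segments=sparse_labels.dense_shape[0],
    )
    dense_labels = tf.sparse.to_dense(sparse_labels)
    return tf.nn.ctc_loss(
        labels=dense_labels,
        logits=logits,
        label_length=label_length,
        logit_length=logit_length,
        **kwargs,
    )


# Empty sparse labels, as in the failing case from this report.
empty_sparse = tf.sparse.from_dense(tf.ones((4, 0), dtype=tf.int32))
loss = ctc_loss_sparse_via_dense(
    empty_sparse,
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    logit_length=tf.constant(200, dtype=tf.int64, shape=[4]),
    logits_time_major=False,
    blank_index=0,
)
```

This sidesteps the sparse kernel entirely, so it does not address the underlying inconsistency.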

A second inconsistency: logit_length may be int64 in the dense version but must be int32 in the sparse version, which raises an error if int64 is passed. Ideally the conversion would happen internally; if that really cannot be done, the docs should state the requirement, which they currently do not.
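
As a stopgap for the dtype mismatch, callers can normalise logit_length themselves before calling the op. A minimal sketch, assuming all sequence lengths fit in int32; the wrapper name is hypothetical and not part of TensorFlow:

```python
import tensorflow as tf


def ctc_loss_int32_logit_length(labels, logits, label_length, logit_length, **kwargs):
    """Hypothetical wrapper: cast logit_length to int32, then call tf.nn.ctc_loss."""
    logit_length = tf.cast(logit_length, tf.int32)
    return tf.nn.ctc_loss(
        labels=labels,
        logits=logits,
        label_length=label_length,
        logit_length=logit_length,
        **kwargs,
    )


dense_labels = tf.ones((4, 50), dtype=tf.int32)
common = dict(
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(50, dtype=tf.int64, shape=[4]),
    logit_length=tf.constant(200, dtype=tf.int64, shape=[4]),
    logits_time_major=False,
    blank_index=0,
)

# The same int64 lengths are passed for dense and for sparse labels.
dense_loss = ctc_loss_int32_logit_length(labels=dense_labels, **common)
sparse_loss = ctc_loss_int32_logit_length(
    labels=tf.sparse.from_dense(dense_labels), **common
)
```

The cast mirrors the manual fix shown in the usage example; it only hides the dtype requirement and does not change which kernel runs.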

Function affected:

https://www.tensorflow.org/api_docs/python/tf/nn/ctc_loss

Usage example

This runs fine:

tf.nn.ctc_loss(
    labels=tf.ones((4, 50), dtype=tf.int32),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(50, dtype=tf.int64, shape=4),
    logit_length=tf.constant(200, dtype=tf.int64, shape=4),
    logits_time_major=False,
    blank_index=0,
)

Converting to sparse labels:

tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 50), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(50, dtype=tf.int64, shape=4),
    logit_length=tf.constant(200, dtype=tf.int64, shape=4),
    logits_time_major=False,
    blank_index=0,
)

suddenly raises

tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot compute CTCLoss as input #3(zero-based) was expected to be a int32 tensor but is a int64 tensor [Op:CTCLoss]

which is fixed by running

tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 50), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(50, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

Now introducing empty label sequences:

tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 0), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(0, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

raises

tensorflow.python.framework.errors_impl.InvalidArgumentError: Labels length is zero in batch 0 [Op:CTCLoss]

whereas the dense version is fine:

tf.nn.ctc_loss(
    labels=tf.ones((4, 0), dtype=tf.int32),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(0, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

sushreebarsa commented 2 years ago

@f90 Could you please have a look at the gist here, in which the code fails when converting to sparse labels but works fine with empty labels? Please confirm. Thanks!

f90 commented 2 years ago

Yes, can confirm, I get

InvalidArgumentError: cannot compute CTCLossV2 as input #3(zero-based) was expected to be a int32 tensor but is a int64 tensor [Op:CTCLossV2]

for box 4 (with sparse, non-empty labels)

Everything else runs fine. However, you didn't include the run with sparse and empty labels that I talked about above

tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 0), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(0, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

sushreebarsa commented 2 years ago

@f90 Thank you for the quick update! I have included the run with sparse and empty labels; please find the gist here. Thanks!

f90 commented 2 years ago

No problem. I actually still see only the dense empty labels in the notebook. Do you want to link to the cell that you mean?

gadagashwini commented 2 years ago

@f90, I was able to reproduce the issue. Could you please confirm?

Sparse Tensors

import tensorflow as tf
tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 50), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(50, dtype=tf.int64, shape=4),
    logit_length=tf.constant(200, dtype=tf.int64, shape=4),
    logits_time_major=False,
    blank_index=0,
)

Output

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-2-70fe3e7f7321> in <module>()
      6     logit_length=tf.constant(200, dtype=tf.int64, shape=4),
      7     logits_time_major=False,
----> 8     blank_index=0,
      9 )

1 frames
/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py in raise_from_not_ok_status(e, name)
   7184 def raise_from_not_ok_status(e, name):
   7185   e.message += (" name: " + name if name is not None else "")
-> 7186   raise core._status_to_exception(e) from None  # pylint: disable=protected-access
   7187 
   7188 

InvalidArgumentError: cannot compute CTCLossV2 as input #3(zero-based) was expected to be a int32 tensor but is a int64 tensor [Op:CTCLossV2]

Sparse tensor with cast int32

import tensorflow as tf
tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 50), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(50, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

Output <tf.Tensor: shape=(4,), dtype=float32, numpy=array([324.07513, 324.07513, 324.07513, 324.07513], dtype=float32)>

Sparse tensor with empty labels

import tensorflow as tf
tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 0), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(0, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

Output <tf.Tensor: shape=(4,), dtype=float32, numpy=array([458.2139, 458.2139, 458.2139, 458.2139], dtype=float32)>

Dense tensor with Empty labels

import tensorflow as tf
tf.nn.ctc_loss(
    labels=tf.ones((4, 0), dtype=tf.int32),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(0, dtype=tf.int64, shape=4),
    logit_length=tf.constant(200, dtype=tf.int64, shape=4),
    logits_time_major=False,
    blank_index=0,
)

Output <tf.Tensor: shape=(4,), dtype=float32, numpy=array([459.82333, 459.82333, 459.82333, 459.82333], dtype=float32)>

The issue is with logit_length: it expects an int32 tensor.

f90 commented 2 years ago

@gadagashwini I can confirm these results with one exception: Running empty sparse labels does not work on my end:

import tensorflow as tf
tf.nn.ctc_loss(
    labels=tf.sparse.from_dense(tf.ones((4, 0), dtype=tf.int32)),
    logits=tf.zeros((4, 200, 10), dtype=tf.float32),
    label_length=tf.constant(0, dtype=tf.int64, shape=4),
    logit_length=tf.cast(tf.constant(200, dtype=tf.int64, shape=4), tf.int32),
    logits_time_major=False,
    blank_index=0,
)

gives me

2022-04-05 11:59:04.820963: W ./tensorflow/core/util/ctc/ctc_loss_calculator.h:499] No valid path found.
2022-04-05 11:59:04.822817: W tensorflow/core/framework/op_kernel.cc:1745] OP_REQUIRES failed at ctc_loss_op.cc:213 : INVALID_ARGUMENT: Labels length is zero in batch 0
Traceback (most recent call last):
  File "/Users/dstoller/PycharmProjects/lyric-align-baseline/lyric_align_baseline/ctc_example.py", line 32, in <module>
    blank_index=0,
  File "/Users/dstoller/.pyenv/versions/lyric-align-baseline/lib/python3.7/site-packages/tensorflow/python/util/traceback_utils.py", line 153, in error_handler
    raise e.with_traceback(filtered_tb) from None
  File "/Users/dstoller/.pyenv/versions/lyric-align-baseline/lib/python3.7/site-packages/tensorflow/python/framework/ops.py", line 7186, in raise_from_not_ok_status
    raise core._status_to_exception(e) from None  # pylint: disable=protected-access
tensorflow.python.framework.errors_impl.InvalidArgumentError: Labels length is zero in batch 0 [Op:CTCLoss]

tensorflow / tensorflow

tf.nn.ctc_loss behaviour changes depending on whether dense or sparse labels are provided #55328
