keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

`inputs` argument cannot be empty #19816

Closed walkadda closed 4 months ago

walkadda commented 5 months ago

Environment:

    Python 3.12.3 on macOS Sonoma 14.5
    VSCode (latest version)
    TensorFlow 2.16.1
    Matplotlib 3.9.0
    Keras 3.3.3

I've been following this guide from the official Keras site to build a handwritten text recognition (HTR) model in Python: https://keras.io/examples/vision/handwriting_recognition/. I first ran into some issues opening and reading the dataset from this GitHub download link (https://github.com/sayakpaul/Handwriting-Recognizer-in-Keras/releases/download/v1.0.0/IAM_Words.zip).

I've updated the code in a few places as I spotted problems, but it now fails when the model training starts:

2024-06-02 19:26:11.552695: W tensorflow/core/framework/local_rendezvous.cc:404] Local rendezvous is aborting with status: OUT_OF_RANGE: End of sequence
Traceback (most recent call last):
  File "/Volumes/Ugreen SSD/Py/keras-HTR-ENG/model.py", line 355, in <module>
    prediction_model = keras.models.Model(
                       ^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Ugreen SSD/Py/keras-HTR-ENG/.venv/lib/python3.12/site-packages/keras/src/models/model.py", line 143, in __new__
    return functional.Functional(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Ugreen SSD/Py/keras-HTR-ENG/.venv/lib/python3.12/site-packages/keras/src/utils/tracking.py", line 26, in wrapper
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/Volumes/Ugreen SSD/Py/keras-HTR-ENG/.venv/lib/python3.12/site-packages/keras/src/models/functional.py", line 162, in __init__
    Function.__init__(self, inputs, outputs, name=name, **kwargs)
  File "/Volumes/Ugreen SSD/Py/keras-HTR-ENG/.venv/lib/python3.12/site-packages/keras/src/ops/function.py", line 62, in __init__
    raise ValueError(
ValueError: `inputs` argument cannot be empty. Received:
inputs=[]
outputs=<KerasTensor shape=(None, 32, 79), dtype=float32, sparse=False, name=keras_tensor_20>

This is the problematic code:

class EditDistanceCallback(keras.callbacks.Callback):
    def __init__(self, pred_model):
        super().__init__()
        self.prediction_model = pred_model

    def on_epoch_end(self, epoch, logs=None):
        edit_distances = []

        for i in range(len(validation_images)):
            labels = validation_labels[i]
            predictions = self.prediction_model.predict(validation_images[i])
            edit_distances.append(calculate_edit_distance(labels, predictions).numpy())

        print(
            f"Mean edit distance for epoch {epoch + 1}: {np.mean(edit_distances):.4f}"
        )
epochs = 10  # To get good results this should be at least 50.

model = build_model()
prediction_model = keras.models.Model(
    model.get_layer(name="image").input, model.get_layer(name="dense2").output
)
edit_distance_callback = EditDistanceCallback(prediction_model)

Train the model.

history = model.fit(
    train_ds,
    validation_data=validation_ds,
    epochs=epochs,
    callbacks=[edit_distance_callback],
)
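For context on what the callback reports: `calculate_edit_distance` is not shown in this issue, so the following is only a framework-free sketch of the metric being averaged, not the tutorial's implementation. The function name `levenshtein` and the sample strings are hypothetical.

```python
def levenshtein(a, b):
    """Count the insertions, deletions, and substitutions needed to turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete item_a
                    current[j - 1] + 1,      # insert item_b
                    previous[j - 1] + cost,  # substitute, or keep a match
                )
            )
        previous = current
    return previous[-1]


# Hypothetical ground-truth labels and decoded predictions.
labels = ["hello", "world"]
predictions = ["hallo", "word"]

distances = [levenshtein(l, p) for l, p in zip(labels, predictions)]
mean_distance = sum(distances) / len(distances)
print(f"Mean edit distance: {mean_distance:.4f}")
```

A lower mean is better; a value of 0 would mean every prediction matched its label exactly.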

All of the pictures are .png and haven't been edited from the original dataset. I've also downloaded the dataset from a few different places just to test if they've become corrupted.

Here's the code that builds the model:

def build_model():
    # Inputs to the model
    input_img = keras.Input(shape=(image_width, image_height, 1), name="image")
    labels = keras.layers.Input(name="label", shape=(None,))

    # First conv block.
    x = keras.layers.Conv2D(
        32,
        (3, 3),
        activation="relu",
        kernel_initializer="he_normal",
        padding="same",
        name="Conv1",
    )(input_img)
    x = keras.layers.MaxPooling2D((2, 2), name="pool1")(x)

    # Second conv block.
    x = keras.layers.Conv2D(
        64,
        (3, 3),
        activation="relu",
        kernel_initializer="he_normal",
        padding="same",
        name="Conv2",
    )(x)
    x = keras.layers.MaxPooling2D((2, 2), name="pool2")(x)

    # We have used two max pool with pool size and strides 2.
    # Hence, downsampled feature maps are 4x smaller. The number of
    # filters in the last layer is 64. Reshape accordingly before
    # passing the output to the RNN part of the model.
    new_shape = ((image_width // 4), (image_height // 4) * 64)
    x = keras.layers.Reshape(target_shape=new_shape, name="reshape")(x)
    x = keras.layers.Dense(64, activation="relu", name="dense1")(x)
    x = keras.layers.Dropout(0.2)(x)

    # RNNs.
    x = keras.layers.Bidirectional(
        keras.layers.LSTM(128, return_sequences=True, dropout=0.25)
    )(x)
    x = keras.layers.Bidirectional(
        keras.layers.LSTM(64, return_sequences=True, dropout=0.25)
    )(x)

    # +2 is to account for the two special tokens introduced by the CTC loss.
    # The recommendation comes here: https://git.io/J0eXP.
    x = keras.layers.Dense(
        len(char_to_num.get_vocabulary()) + 2, activation="softmax", name="dense2"
    )(x)

    # Add CTC layer for calculating CTC loss at each step.
    output = CTCLayer(name="ctc_loss")(labels, x)

    # Define the model.
    model = keras.models.Model(
        inputs=[input_img, labels], outputs=output, name="handwriting_recognizer"
    )
    # Optimizer.
    opt = keras.optimizers.Adam()
    # Compile the model and return.
    model.compile(optimizer=opt)
    return model

# Get the model.
model = build_model()
model.summary()

There's probably important info that I've missed, but I've added all I can think of at the moment. Thank you for all the help.

mehtamansi29 commented 5 months ago

Hi @walkadda -

Thanks for reporting the issue. I have tested the code snippet and it reproduces the reported behaviour. A gist is attached for reference. We will look into the issue and update you.

mehtamansi29 commented 5 months ago

Hi @walkadda -

A few lines in the handwriting recognition code need to change:

predictions_decoded = keras.ops.nn.ctc_decode(predictions, sequence_lengths=input_len)[0][0][:, :max_len]

prediction_model = keras.models.Model(model.get_layer(name="image").output, model.get_layer(name="dense2").output)

results = keras.ops.nn.ctc_decode(pred, sequence_lengths=input_len)[0][0][:, :max_len]

A gist with these changes is attached for reference.
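As a rough illustration of the `prediction_model` change, the sketch below builds a toy model and takes the symbolic tensor from the input layer's `.output`, which is what the line above does. In the reporter's traceback, the `.input` of the "image" layer reached `keras.models.Model` as `inputs=[]`. The layer sizes and the extra `head` layer are made up for the example; only the layer names "image" and "dense2" come from the issue.

```python
import numpy as np
import keras

# Toy stand-in for the recognizer: same layer names, hypothetical sizes.
image = keras.Input(shape=(8, 8, 1), name="image")
x = keras.layers.Flatten()(image)
x = keras.layers.Dense(4, activation="softmax", name="dense2")(x)
# Hypothetical extra layer standing in for the training-only part of the model.
out = keras.layers.Dense(1, name="head")(x)
model = keras.Model(image, out)

# Build the inference model from the input layer's *output* tensor.
prediction_model = keras.Model(
    model.get_layer(name="image").output,
    model.get_layer(name="dense2").output,
)

preds = prediction_model.predict(np.zeros((2, 8, 8, 1), dtype="float32"), verbose=0)
print(preds.shape)
```

The resulting model stops at "dense2", so it returns the per-class probabilities rather than the output of the final training layer.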

github-actions[bot] commented 4 months ago

This issue is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] commented 4 months ago

This issue was closed because it has been inactive for 28 days. Please reopen if you'd like to work on this further.
