ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Text decoder broken for machine translation example #1097

Closed tgaddair closed 3 years ago

tgaddair commented 3 years ago

Config:

input_features:
    -
        name: english
        type: text
        level: word
        encoder: rnn
        cell_type: lstm
        reduce_output: null
        preprocessing:
          word_tokenizer: english_tokenize

output_features:
    -
        name: italian
        type: text
        level: word
        decoder: generator
        cell_type: gru
        attention: bahdanau
        loss:
            type: sampled_softmax_cross_entropy
        preprocessing:
          word_tokenizer: italian_tokenize

training:
    batch_size: 96

Error (eager execution):

Traceback (most recent call last):
  File "/Users/tgaddair/.venv/ludwig/py38/bin/ludwig", line 11, in <module>
    load_entry_point('ludwig', 'console_scripts', 'ludwig')()
  File "/Users/tgaddair/repos/ludwig/ludwig/cli.py", line 146, in main
    CLI()
  File "/Users/tgaddair/repos/ludwig/ludwig/cli.py", line 72, in __init__
    getattr(self, args.command)()
  File "/Users/tgaddair/repos/ludwig/ludwig/cli.py", line 92, in experiment
    experiment.cli(sys.argv[2:])
  File "/Users/tgaddair/repos/ludwig/ludwig/experiment.py", line 571, in cli
    experiment_cli(**vars(args))
  File "/Users/tgaddair/repos/ludwig/ludwig/experiment.py", line 216, in experiment_cli
    ) = model.experiment(
  File "/Users/tgaddair/repos/ludwig/ludwig/api.py", line 1029, in experiment
    ) = self.train(
  File "/Users/tgaddair/repos/ludwig/ludwig/api.py", line 475, in train
    train_stats = trainer.train(
  File "/Users/tgaddair/repos/ludwig/ludwig/models/trainer.py", line 601, in train
    loss, all_losses = model.train_step(
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/eager/def_function.py", line 814, in __call__
    return self._python_function(*args, **kwds)
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/eager/function.py", line 3936, in bound_method_wrapper
    return wrapped_fn(weak_instance(), *args, **kwargs)
  File "/Users/tgaddair/repos/ludwig/ludwig/models/ecd.py", line 183, in train_step
    model_outputs = self((inputs, targets), training=True)
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1032, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/Users/tgaddair/repos/ludwig/ludwig/models/ecd.py", line 125, in call
    decoder_outputs = decoder(
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1032, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/Users/tgaddair/repos/ludwig/ludwig/features/base_feature.py", line 272, in call
    logits = self.logits(logits_input, target=target, training=training)
  File "/Users/tgaddair/repos/ludwig/ludwig/features/sequence_feature.py", line 241, in logits
    return self.decoder_obj._logits_training(
  File "/Users/tgaddair/repos/ludwig/ludwig/decoders/sequence_decoders.py", line 164, in _logits_training
    logits = self.decoder_teacher_forcing(
  File "/Users/tgaddair/repos/ludwig/ludwig/decoders/sequence_decoders.py", line 321, in decoder_teacher_forcing
    encoder_sequence_length = sequence_length_3D(encoder_output)
  File "/Users/tgaddair/repos/ludwig/ludwig/utils/tf_utils.py", line 28, in sequence_length_3D
    print(tf.reduce_max(tf.abs(sequence), 2))
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 206, in wrapper
    return target(*args, **kwargs)
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 2799, in reduce_max
    return reduce_max_with_dims(input_tensor, axis, keepdims, name,
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/ops/math_ops.py", line 2811, in reduce_max_with_dims
    gen_math_ops._max(input_tensor, dims, keepdims, name=name))
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/ops/gen_math_ops.py", line 5786, in _max
    _ops.raise_from_not_ok_status(e, name)
  File "/Users/tgaddair/.venv/ludwig/py38/lib/python3.8/site-packages/tensorflow/python/framework/ops.py", line 6873, in raise_from_not_ok_status
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid reduction dimension (2 for input with 2 dimension(s) [Op:Max]

Workaround: add reduce_input: null to the output feature config.
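A minimal sketch of the workaround, with the config above written as a Python dict (None corresponds to YAML null). The helper name add_reduce_input_null is hypothetical and is not part of Ludwig; it only shows where the key goes. The traceback fails on a reduction along axis 2 of a 2-dimensional tensor, which suggests the decoder received an input whose sequence dimension had already been reduced; that reading is an inference from the error message, not something confirmed in this thread.

```python
import copy

# The config from this issue as a plain dict; None maps to YAML null.
config = {
    "input_features": [
        {
            "name": "english",
            "type": "text",
            "level": "word",
            "encoder": "rnn",
            "cell_type": "lstm",
            "reduce_output": None,
            "preprocessing": {"word_tokenizer": "english_tokenize"},
        }
    ],
    "output_features": [
        {
            "name": "italian",
            "type": "text",
            "level": "word",
            "decoder": "generator",
            "cell_type": "gru",
            "attention": "bahdanau",
            "loss": {"type": "sampled_softmax_cross_entropy"},
            "preprocessing": {"word_tokenizer": "italian_tokenize"},
        }
    ],
    "training": {"batch_size": 96},
}


def add_reduce_input_null(cfg):
    """Hypothetical helper: return a copy of cfg in which every output
    feature using the generator decoder has reduce_input set to None."""
    patched = copy.deepcopy(cfg)  # leave the caller's config untouched
    for feature in patched["output_features"]:
        if feature.get("decoder") == "generator":
            feature["reduce_input"] = None
    return patched


patched_config = add_reduce_input_null(config)
```

In the YAML file the equivalent change is a single extra line, reduce_input: null, under the italian output feature.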

jimthompson5802 commented 3 years ago

@tgaddair I talked with @w4nderlust. It looks like the machine translation example in ludwig-docs is missing the reduce_input parameter. I opened a PR to update the example documentation.

w4nderlust commented 3 years ago

This has been fixed in https://github.com/ludwig-ai/ludwig/pull/1103