Open Esaada opened 6 years ago
I think transformer_moe uses a quite different code base from the transformer model. If you use a hyper-parameter set from the transformer code base, it will lack some mandatory hyper-parameters that transformer_moe needs in order to run.
From reading the source code, you need to add at least the following hyper-parameters, some of which are unused:
from tensor2tensor.models import transformer

hparams = transformer.transformer_base_single_gpu()
# Params below are required in order to have transformer_moe perform the same way as transformer
hparams.layer_types = "a/a/a/a/a#a/a/a/a/a"
hparams.default_att = "a"
hparams.default_ff = "fc"
# Params below may not be used, but they must exist
hparams.attention_loc_block_length = 256
hparams.attention_loc_block_width = 128
hparams.attention_red_factor = 3
hparams.attention_red_type = "conv"
hparams.attention_red_nonlinearity = "none"
That said, if you intend to use transformer_moe, you should probably use one of its own hyper-parameter sets, such as transformer_moe_2k.
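A quick way to see which attributes are missing before training starts is to check the hparams object against the names listed above. The sketch below is only illustrative: `missing_hparams` and `REQUIRED_MOE_HPARAMS` are hypothetical names, the required list is just the eight attributes from the snippet above (the model may read others), and a `SimpleNamespace` with made-up fields stands in for the real `HParams` object so that the example runs without tensor2tensor installed.

```python
from types import SimpleNamespace

# Attribute names copied from the snippet above; transformer_moe may read more.
REQUIRED_MOE_HPARAMS = (
    "layer_types",
    "default_att",
    "default_ff",
    "attention_loc_block_length",
    "attention_loc_block_width",
    "attention_red_factor",
    "attention_red_type",
    "attention_red_nonlinearity",
)


def missing_hparams(hparams, required=REQUIRED_MOE_HPARAMS):
    """Return the required attribute names that `hparams` does not define."""
    return [name for name in required if not hasattr(hparams, name)]


# Stand-in for a hyper-parameter set that has no MoE-specific fields.
# The two fields here are placeholders, not the real contents of any set.
base = SimpleNamespace(hidden_size=512, num_hidden_layers=6)
print("missing before:", missing_hparams(base))

# Add the fields by hand, as in the snippet above, then check again.
base.layer_types = "a/a/a/a/a#a/a/a/a/a"
base.default_att = "a"
base.default_ff = "fc"
base.attention_loc_block_length = 256
base.attention_loc_block_width = 128
base.attention_red_factor = 3
base.attention_red_type = "conv"
base.attention_red_nonlinearity = "none"
print("missing after:", missing_hparams(base))
```

Running a check like this on the object returned by the chosen hparams set should surface the same problem as the AttributeError in the original report, but before the model graph is built.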
When I use a hyper-parameter set from transformer_moe, such as transformer_moe_2k, it seems that the feed-forward layer of type moe has not been implemented.
That set has the following architecture:
* No encoder.
* Layer 0: a - sep (self-attention - unmasked separable convolutions)
* Layer 1: a - sep
* Layer 2: a - sep
* Layer 3: a - sep
* Layer 4: a - sep
* Decoder architecture:
* Layer 0: a - a - sepm (self-attention - enco/deco-attention - masked sep)
* Layer 1: a - a - sepm
* Layer 2: a - a - moe (mixture of expert layers in the middle)
* Layer 3: a - a - sepm
* Layer 4: a - a - sepm
I get:
KeyError: in converted code:
relative to E:\workspace\nmt-train\tensor2tensor:

utils\t2t_model.py:326 call
sharded_logits, losses = self.model_fn_sharded(sharded_features)
utils\t2t_model.py:374 model_fn_sharded
self._to_single_features_dict(transformed_features))
models\research\transformer_moe.py:172 body_sharded
x = prepostprocess(layers[ff_type])(

KeyError: 'moe'
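The last frame of that traceback is a plain dictionary lookup, `layers[ff_type]`, so the KeyError means the `layers` dictionary built for this run has no entry named `moe`. Below is a minimal Python 3 sketch of that failure mode, with a guarded lookup that reports the available keys. `lookup_layer` and the placeholder dictionary are hypothetical and are not tensor2tensor code; the example does not establish which layer types the real model registers.

```python
def lookup_layer(layers, layer_type):
    """Fetch a layer entry by name, listing the known names on failure."""
    try:
        return layers[layer_type]
    except KeyError:
        known = ", ".join(sorted(layers))
        raise KeyError(
            "unknown layer type %r; registered types: %s" % (layer_type, known)
        ) from None


# Placeholder registry: the keys mirror names seen in this thread, and the
# values are dummies standing in for the real layer functions.
layers = {"a": object(), "fc": object(), "sep": object(), "sepm": object()}

try:
    lookup_layer(layers, "moe")
except KeyError as err:
    # The message names the missing type and the types that are available.
    print(err)
```

Printing the keys of the real `layers` dictionary just before line 172 of transformer_moe.py would show in the same way whether `moe` is absent in this configuration.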
Description
I followed the instructions and got this error: AttributeError: 'HParams' object has no attribute 'layer_types'
Environment information
Steps to reproduce:
I used the following parameters and commands:
PROBLEM=librispeech
MODEL=transformer_moe
HPARAMS=transformer_base_single_gpu
DATA_DIR=./t2t_data
TMP_DIR=/tmp/t2t_datagen
TRAIN_DIR=./t2t_train/$PROBLEM/$MODEL-$HPARAMS
mkdir -p $DATA_DIR $TMP_DIR $TRAIN_DIR
t2t-datagen \
  --data_dir=$DATA_DIR \
  --tmp_dir=$TMP_DIR \
  --problem=$PROBLEM
Finally, I ran the training command:
t2t-trainer \
  --data_dir=$DATA_DIR \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --output_dir=$TRAIN_DIR
Error logs:
WARNING:tensorflow:Shapes are not fully defined. Assuming batch_size means tokens.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Unsetting shared_embedding_and_softmax_weights.
INFO:tensorflow:Setting T2TModel mode to 'train'
INFO:tensorflow:Using variable initializer: uniform_unit_scaling
INFO:tensorflow:Transforming feature 'inputs' with speech_recognition_modality.bottom
INFO:tensorflow:Transforming 'targets' with symbol_modality_256_512.targets_bottom
WARNING:tensorflow:From /usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/function.py:986: calling create_op (from tensorflow.python.framework.ops) with compute_shapes is deprecated and will be removed in a future version.
Instructions for updating:
Shapes are always computed; don't use the compute_shapes as it has no effect.
Traceback (most recent call last):
File "/usr/local/bin/t2t-trainer", line 32, in <module>
tf.app.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/platform/app.py", line 125, in run
_sys.exit(main(argv))
File "/usr/local/bin/t2t-trainer", line 28, in main
t2t_trainer.main(argv)
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/bin/t2t_trainer.py", line 385, in main
execute_schedule(exp)
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/bin/t2t_trainer.py", line 326, in execute_schedule
getattr(exp, FLAGS.schedule)()
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/trainer_lib.py", line 331, in continuous_train_and_eval
self._eval_spec)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 451, in train_and_evaluate
return executor.run()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 590, in run
return self.run_local()
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/training.py", line 691, in run_local
saving_listeners=saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 376, in train
loss = self._train_model(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1145, in _train_model
return self._train_model_default(input_fn, hooks, saving_listeners)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1170, in _train_model_default
features, labels, model_fn_lib.ModeKeys.TRAIN, self.config)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/estimator/estimator.py", line 1133, in _call_model_fn
model_fn_results = self._model_fn(features=features, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/t2t_model.py", line 1184, in wrapping_model_fn
decode_hparams=decode_hparams)
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/t2t_model.py", line 1236, in estimator_model_fn
logits, losses_dict = model(features) # pylint: disable=not-callable
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/base.py", line 362, in __call__
outputs = super(Layer, self).__call__(inputs, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/keras/engine/base_layer.py", line 736, in __call__
outputs = self.call(inputs, *args, **kwargs)
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/t2t_model.py", line 190, in call
sharded_logits, losses = self.model_fn_sharded(sharded_features)
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/utils/t2t_model.py", line 216, in model_fn_sharded
self._to_single_features_dict(transformed_features))
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/research/transformer_moe.py", line 103, in body_sharded
encoder_layers, decoder_layers = self._extract_layer_types()
File "/usr/local/lib/python2.7/dist-packages/tensor2tensor/models/research/transformer_moe.py", line 222, in _extract_layer_types
layer_types = hparams.layer_types
AttributeError: 'HParams' object has no attribute 'layer_types'
Thanks!