got ValueError when training transformer_ae model using translate_ende_wmt32k problem

baoalita commented 6 years ago

I'd like to train a transformer_ae model on the translate_ende_wmt32k problem. Part of my commands is copied below.

PROBLEM=translate_ende_wmt32k
MODEL=transformer_ae
HPARAMS=transformer_ae_base_small

t2t-datagen --problem=$PROBLEM
t2t-trainer --problem=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS

And then I got ValueError:

ValueError: Dimension 3 in both shapes must be equal, but are 4096 and 384. Shapes are [?,?,1,4096] and [?,?,1,384]. for 'transformer_ae/parallel_0_5/transformer_ae/transformer_ae/body/body/Select' (op: 'Select') with input shapes: [?], [?,?,1,4096], [?,?,1,384].

My local environment is:

tensor2tensor            1.7.0
tensorflow               1.10.0
python                   1.10.0

Is there anything wrong with my command?

shezxc6540 commented 5 years ago

Have you solved this problem? I hit the same error. My first guess was that the transformer_ae model is designed for image generation problems, because the documentation states: "We suggest to use the Image Transformer, i.e., --model=imagetransformer, or the Image Transformer Plus, i.e., --model=imagetransformerpp that uses discretized mixture of logistics, or variational auto-encoder, i.e., --model=transformer_ae. "

shezxc6540 commented 5 years ago

The problem is that compress_filter_size should be the same as hidden_size. After setting them to the same value, it works.
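
A minimal sketch of that workaround, reusing the variables from the original post. It assumes the 384 in the error message is the hidden_size of transformer_ae_base_small (so 4096 would be compress_filter_size), and it pins both values through t2t-trainer's --hparams override flag so they cannot disagree. HIDDEN_SIZE, HPARAMS_OVERRIDE and TRAIN_CMD are names made up for this example, and other flags such as the data and output directories are left out, as in the original commands.

```shell
PROBLEM=translate_ende_wmt32k
MODEL=transformer_ae
HPARAMS=transformer_ae_base_small

# Assumption: 384 is the hidden_size of this hparams set, inferred from the
# [?,?,1,384] shape in the error message. Adjust if your setup differs.
HIDDEN_SIZE=384

# Set both hyperparameters explicitly to the same value.
HPARAMS_OVERRIDE="hidden_size=${HIDDEN_SIZE},compress_filter_size=${HIDDEN_SIZE}"

# Build and print the command as a dry run; run it directly to start training.
TRAIN_CMD="t2t-trainer --problem=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS --hparams=$HPARAMS_OVERRIDE"
echo "$TRAIN_CMD"
```

Overriding only compress_filter_size should be enough if hidden_size is left at the default of the hparams set; both are set here just to make the equality explicit.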

shezxc6540 commented 5 years ago

But what is the detailed reason?

tensorflow / tensor2tensor

got ValueError when training transformer_ae model using translate_ende_wmt32k problem #1007