tensorflow / tensor2tensor

Library of deep learning models and datasets designed to make deep learning more accessible and accelerate ML research.
Apache License 2.0

got ValueError when training transformer_ae model using translate_ende_wmt32k problem #1007

Open baoalita opened 6 years ago

baoalita commented 6 years ago

I'd like to train a transformer_ae model on the translate_ende_wmt32k problem. The relevant parts of my commands are copied below.

PROBLEM=translate_ende_wmt32k
MODEL=transformer_ae
HPARAMS=transformer_ae_base_small

t2t-datagen --problem=$PROBLEM
t2t-trainer --problem=$PROBLEM --model=$MODEL --hparams_set=$HPARAMS

Then I got this ValueError:

ValueError: Dimension 3 in both shapes must be equal, but are 4096 and 384. Shapes are [?,?,1,4096] and [?,?,1,384]. for 'transformer_ae/parallel_0_5/transformer_ae/transformer_ae/body/body/Select' (op: 'Select') with input shapes: [?], [?,?,1,4096], [?,?,1,384].

My local environment is:

tensor2tensor            1.7.0
tensorflow               1.10.0
python                   1.10.0

Is there anything wrong with my command?

shezxc6540 commented 5 years ago

Have you solved this problem? I ran into the same error. At first I thought the cause was that the transformer_ae model is designed for image generation problems, because the docs say: "We suggest to use the Image Transformer, i.e., --model=imagetransformer, or the Image Transformer Plus, i.e., --model=imagetransformerpp that uses discretized mixture of logistics, or variational auto-encoder, i.e., --model=transformer_ae."

shezxc6540 commented 5 years ago

The problem is that compress_filter_size should be the same as hidden_size. After setting them to the same value, training works.
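A minimal sketch of that workaround, reusing the commands from the original post. It assumes that the 384 in the error message is the hidden_size of this hparams set and the 4096 is its compress_filter_size; check the values for your own hparams set before copying the number. The override is passed through the --hparams flag of t2t-trainer.

```shell
PROBLEM=translate_ende_wmt32k
MODEL=transformer_ae
HPARAMS=transformer_ae_base_small

# Assumption: hidden_size is 384 for this hparams set (taken from the
# shapes in the error message), so compress_filter_size is set to match.
t2t-trainer \
  --problem=$PROBLEM \
  --model=$MODEL \
  --hparams_set=$HPARAMS \
  --hparams="compress_filter_size=384"
```

Flags such as --data_dir and --output_dir are left out here, as in the original post; add them as your setup requires.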

shezxc6540 commented 5 years ago

But what is the detailed reason for this?