Open zdemillard opened 5 years ago
It does support GRU but not the "dense" bridge. You should use the "last" bridge from this branch instead:
Thank you for the response. Does the "last" bridge support a different number of encoder/decoder layers (e.g. a 2-layer encoder and a 1-layer decoder)? I know "copy" doesn't work when the numbers of layers differ.
> Does the "last" bridge support a different number of encoder/decoder layers (e.g. 2-layer encoder and 1-layer decoder)?
Yes. If the decoder has N layers, it will only copy the last N layers of the encoder (assuming the encoder has at least N layers).
Great, we will use that then. Thank you!
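The layer-selection behavior described above can be sketched as follows. This is a hypothetical illustration of the "last" bridge's semantics, not the actual OpenNMT implementation; the function name and state representation are made up for the example.

```python
def last_bridge(encoder_states, num_decoder_layers):
    """Initialize the decoder from the last N encoder layer states.

    Illustrative sketch only. `encoder_states` is a list of per-layer
    final hidden states, ordered from the bottom layer to the top layer.
    """
    if len(encoder_states) < num_decoder_layers:
        # A "copy"-style bridge would also fail here: there are not
        # enough encoder layers to initialize every decoder layer.
        raise ValueError("encoder must have at least as many layers as the decoder")
    # Keep only the top N encoder layers for the N decoder layers.
    return encoder_states[-num_decoder_layers:]

# 2-layer encoder, 1-layer decoder: the decoder is initialized
# from the top (second) encoder layer only.
states = ["enc_layer_1_state", "enc_layer_2_state"]
print(last_bridge(states, 1))  # ['enc_layer_2_state']
```

So with a 2-layer encoder and a 1-layer decoder, the single decoder layer starts from the top encoder layer's state, which is why the layer counts are allowed to differ.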
Hello, we have trained a bidirectional RNN encoder-decoder (default OpenNMT-lua settings) and successfully released the model and tested it using this repository. However, when we work through the paper (http://aclweb.org/anthology/W18-2715) and try to replicate the `distill-tiny` model, with a GRU encoder using 2 layers on the encoder but only 1 layer on the decoder, we run into the issue that the released model doesn't translate anything on the GPU (`--cuda`). When I run on the CPU, I get the following error:

The model translates accurately with the Lua code, so we know the issue isn't with the model itself; something must be incompatible when we release it to CTranslate. Here is the full configuration used to train:
Does CTranslate support `GRU` as a `rnn_type`, and does it support `dense` as an option for `-bridge`?