Is this a model that has been trained with Nematus? You would need to provide the network structure on the command line. Although I am confused, as I see there is a "special:model.yml.npy" entry in the model. Did you try to convert the model with a script?
So, these would be the options that Marian supports:
--dec-cell-base-depth 4 --dec-cell-high-depth 2 --dec-depth 4 --enc-depth 4 --enc-cell-depth 2
Not sure about these two:
"dec_deep_context": true,
"enc_depth_bidirectional": 4,
I do not think we have that at the moment. Any idea what they do?
Also any particular reason for this architecture? It does not correspond to any of the recent Edinburgh WMT papers.
The model was trained with Nematus. The architecture is as follows:
--dim_word 512 \
--dim 1024 \
--tie_decoder_embeddings \
--layer_normalisation \
--enc_depth 4 \
--dec_depth 4 \
--dec_deep_context \
--enc_recurrence_transition_depth 2 \
--dec_base_recurrence_transition_depth 4 \
--dec_high_recurrence_transition_depth 2
With a quick look at the Marian code I didn't notice labels for the encoder/decoder layers that in Nematus are named encoderN... I'll be digging further.
I am not sure about --dec_deep_context; does that put attention mechanisms into each decoder layer? If yes, we do not have that.
Apart from the options above, you would need to add
--layer-normalization --tied-embeddings
So the complete set of supported options would be:
--ignore-model-config
--layer-normalization
--tied-embeddings
--dec-cell-base-depth 4
--dec-cell-high-depth 2
--dec-depth 4
--enc-depth 4
--enc-cell-depth 2
The option --ignore-model-config is a bit risky, as it will fill missing parameters with random weights, but it will at least ignore a potentially incorrect config that was inserted into the npz file.
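For reference, a minimal sketch of how these flags might be combined into a single marian-decoder call; the model, vocabulary and input/output paths here are placeholders, not taken from this issue:

./marian-decoder -m model.npz -v vocab.src.yml vocab.trg.yml \
  --ignore-model-config \
  --layer-normalization --tied-embeddings \
  --enc-depth 4 --enc-cell-depth 2 \
  --dec-depth 4 --dec-cell-base-depth 4 --dec-cell-high-depth 2 \
  < input.src > output.trg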
--dec_deep_context makes the context be concatenated with the input of the successive layers in the decoder (roughly as in the sketch below). Still, as I'm reading nematus.h, where the mapping is generated (e.g. for the encoder, lines 163-172), I can only see the looping over enc-cell-depth. What about enc-depth? It looks like there will be no mapping for encoder_2_b.npy, or am I missing something?
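To illustrate the idea, here is a rough sketch of the concatenation described above; this is not Nematus code, and the helper names and toy layers are made up:

import numpy as np

# toy "layers": each is just an affine map followed by tanh
def make_layer(in_dim, out_dim, rng):
    W = rng.standard_normal((out_dim, in_dim)) * 0.01
    return lambda v: np.tanh(W @ v)

def deep_decoder_step(h_base, context, higher_layers):
    # With --dec_deep_context (as described above), the attended context from the
    # base decoder layer is concatenated with the input of every higher layer.
    h = h_base
    for layer in higher_layers:
        h = layer(np.concatenate([h, context]))
    return h

rng = np.random.default_rng(0)
dim, ctx_dim = 1024, 2048
layers = [make_layer(dim + ctx_dim, dim, rng) for _ in range(3)]  # dec_depth 4 => 3 higher layers
out = deep_decoder_step(np.zeros(dim), np.zeros(ctx_dim), layers)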
OK, we do not have that. This is a repetition of the aligned context, I suppose, from the first attention mechanism. So this model is currently not compatible with Marian.
If you can make that model available together with some testing data we can take a look. No guarantee about the time frame though.
Closing this for now.
We tried running the Marian decoder for bi-deep models trained with Nematus with the following command:
For one-layer models Marian worked fine, but for multi-layer models we got an error:
Here are some features of the model from the model.npz.json file:
Actual file names from model.npz: