I looked into the YAML in the term_constraint_model folder. Strangely, it used the prepared data from the baseline model, which is not supposed to use source factoring:
allow_missing_params: false
amp: false
apex_amp: false
batch_sentences_multiple_of: 8
batch_size: 560
batch_type: word
bucket_scaling: false
bucket_width: 8
cache_last_best_params: 1
cache_metric: perplexity
cache_strategy: best
checkpoint_improvement_threshold: 0.0
checkpoint_interval: 4000
config: null
decode_and_evaluate: 500
decode_and_evaluate_device_id: null
decoder: transformer
device_id: 0
device_ids:
- -1
disable_device_locking: false
dist: false
dry_run: false
dtype: float32
embed_dropout:
- 0.0
- 0.0
encoder: transformer
env: null
fixed_param_names: []
fixed_param_strategy: null
gradient_clipping_threshold: 1.0
gradient_clipping_type: none
horovod: false
ignore_extra_params: false
initial_learning_rate: 0.0002
keep_initializations: false
keep_last_params: 1
kvstore: device
label_smoothing: 0.1
label_smoothing_impl: mxnet
learning_rate_reduce_factor: 0.9
learning_rate_reduce_num_not_improved: 8
learning_rate_scheduler_type: plateau-reduce
learning_rate_t_scale: 1.0
learning_rate_warmup: 0
length_task: null
length_task_layers: 1
length_task_weight: 1.0
lhuc: null
lock_dir: /tmp
loglevel: INFO
loglevel_secondary_workers: INFO
loss: cross-entropy
max_checkpoints: null
max_num_checkpoint_not_improved: null
max_num_epochs: 100
max_samples: null
max_seconds: null
max_seq_len:
- 101
- 101
max_updates: null
min_num_epochs: 50
min_samples: null
min_updates: null
momentum: 0.0
no_bucketing: false
no_hybridization: false
no_logfile: false
num_embed:
- 512
- 512
num_layers:
- 2
- 2
num_words:
- 32302
- 32302
optimized_metric: perplexity
optimizer: adam
optimizer_betas:
- 0.9
- 0.999
optimizer_eps: 1.0e-08
optimizer_params: null
output: term_constraint_model
overwrite_output: true
pad_vocab_to_multiple_of: 8
params: null
prepared_data: baseline_sockeye
quiet: false
quiet_secondary_workers: false
seed: 1
shared_vocab: true
source: null
source_factor_vocabs: []
source_factors: []
source_factors_combine: []
source_factors_num_embed:
- 1
source_factors_share_embedding: []
source_factors_use_source_vocab: []
source_vocab: null
stop_training_on_decoder_failure: false
target: null
target_factor_vocabs: []
target_factors: []
target_factors_combine: []
target_factors_num_embed:
- 1
target_factors_share_embedding: []
target_factors_use_target_vocab: []
target_factors_weight:
- 1.0
target_vocab: null
transformer_activation_type:
- relu
- relu
transformer_attention_heads:
- 8
- 8
transformer_dropout_act:
- 0.1
- 0.1
transformer_dropout_attention:
- 0.1
- 0.1
transformer_dropout_prepost:
- 0.1
- 0.1
transformer_feed_forward_num_hidden:
- 2048
- 2048
transformer_feed_forward_use_glu: false
transformer_model_size:
- 512
- 512
transformer_positional_embedding_type: fixed
transformer_postprocess:
- dr
- dr
transformer_preprocess:
- n
- n
update_interval: 1
use_cpu: false
validation_source: examples/translation/wmt17_en_de/valid.en
validation_source_factors: []
validation_target: examples/translation/wmt17_en_de/valid.de
validation_target_factors: []
weight_decay: 0.0
weight_init: xavier
weight_init_scale: 3.0
weight_init_xavier_factor_type: avg
weight_init_xavier_rand_type: uniform
weight_tying_type: src_trg_softmax
word_min_count:
- 1
- 1
I've trained the baseline and the source-factoring model at the same time.
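For comparison, here is a rough sketch of how preparation and training would typically be set up so that the source factors actually end up in the prepared data and in this args.yaml. The file and directory names below are placeholders, not taken from this issue; check `sockeye.prepare_data --help` and `sockeye.train --help` for the exact options in your Sockeye version.

```
# Hypothetical file names; substitute your own corpora and factor files.
# The factor file must be token-parallel to the source file.
python -m sockeye.prepare_data \
    --source train.en \
    --source-factors train.sf \
    --target train.de \
    --shared-vocab \
    --output sf_prepared_data

# Training then has to point at the factored prepared data,
# and the validation source needs a matching factor file.
python -m sockeye.train \
    --prepared-data sf_prepared_data \
    --validation-source valid.en \
    --validation-source-factors valid.sf \
    --validation-target valid.de \
    --output term_constraint_model
```

With a setup like this, the resulting args.yaml should list the factor files under source_factors and validation_source_factors instead of empty lists.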
I've trained a Sockeye model with source factoring. When I then tried to train my model with the sf-file, I received this output:
What did I do wrong, and why can't I apply --input-factors?
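For reference, a minimal translation call with factors might look like the sketch below (paths are hypothetical). `--input-factors` is only accepted if the model was actually trained on factored prepared data, which is presumably why it is rejected for a model that points at the baseline prepared data.

```
# Hypothetical paths; the factor file must be token-parallel to the input.
python -m sockeye.translate \
    --models term_constraint_model \
    --input test.en \
    --input-factors test.sf \
    --output test.de.out
```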
I trained the model with this command:
This is what the Sockeye-prepared data looked like:
The args.yaml of the prepared data says: