marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

Transformer from 1.6.0 asking for 3 inputs instead of 2. #205

Closed · alvations closed this issue 6 years ago

alvations commented 6 years ago

Using 1.6.0 (bda9b18b6cede63b0476e9c144da3f62f03515b1), with --type transformer, it's looking for 3 inputs instead of 2.

With https://gist.github.com/emjotde/9c5260870c25304b9c8b111ddcf81b74

In the log:

+ /home/ltan/marian//build/marian --model /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//model.npz --type transformer --train-sets data/en-de/train.en.tokenized.truecased.bped data/en-de/train.de.tokenized.truecased.bped ' '
[2018-08-21 11:20:27] [config] after-batches: 0
[2018-08-21 11:20:27] [config] after-epochs: 0
[2018-08-21 11:20:27] [config] allow-unk: false
[2018-08-21 11:20:27] [config] batch-flexible-lr: false
[2018-08-21 11:20:27] [config] batch-normal-words: 1920
[2018-08-21 11:20:27] [config] beam-size: 12
[2018-08-21 11:20:27] [config] best-deep: false
[2018-08-21 11:20:27] [config] clip-gemm: 0
[2018-08-21 11:20:27] [config] clip-norm: 1
[2018-08-21 11:20:27] [config] cost-type: ce-mean
[2018-08-21 11:20:27] [config] cpu-threads: 0
[2018-08-21 11:20:27] [config] data-weighting-type: sentence
[2018-08-21 11:20:27] [config] dec-cell: gru
[2018-08-21 11:20:27] [config] dec-cell-base-depth: 2
[2018-08-21 11:20:27] [config] dec-cell-high-depth: 1
[2018-08-21 11:20:27] [config] dec-depth: 1
[2018-08-21 11:20:27] [config] devices:
[2018-08-21 11:20:27] [config]   - 0
[2018-08-21 11:20:27] [config] dim-emb: 512
[2018-08-21 11:20:27] [config] dim-rnn: 1024
[2018-08-21 11:20:27] [config] dim-vocabs:
[2018-08-21 11:20:27] [config]   - 0
[2018-08-21 11:20:27] [config]   - 0
[2018-08-21 11:20:27] [config] disp-freq: 1000
[2018-08-21 11:20:27] [config] disp-label-counts: false
[2018-08-21 11:20:27] [config] dropout-rnn: 0
[2018-08-21 11:20:27] [config] dropout-src: 0
[2018-08-21 11:20:27] [config] dropout-trg: 0
[2018-08-21 11:20:27] [config] early-stopping: 10
[2018-08-21 11:20:27] [config] embedding-fix-src: false
[2018-08-21 11:20:27] [config] embedding-fix-trg: false
[2018-08-21 11:20:27] [config] embedding-normalization: false
[2018-08-21 11:20:27] [config] enc-cell: gru
[2018-08-21 11:20:27] [config] enc-cell-depth: 1
[2018-08-21 11:20:27] [config] enc-depth: 1
[2018-08-21 11:20:27] [config] enc-type: bidirectional
[2018-08-21 11:20:27] [config] exponential-smoothing: 0
[2018-08-21 11:20:27] [config] grad-dropping-momentum: 0
[2018-08-21 11:20:27] [config] grad-dropping-rate: 0
[2018-08-21 11:20:27] [config] grad-dropping-warmup: 100
[2018-08-21 11:20:27] [config] guided-alignment-cost: ce
[2018-08-21 11:20:27] [config] guided-alignment-weight: 1
[2018-08-21 11:20:27] [config] ignore-model-config: false
[2018-08-21 11:20:27] [config] interpolate-env-vars: false
[2018-08-21 11:20:27] [config] keep-best: false
[2018-08-21 11:20:27] [config] label-smoothing: 0
[2018-08-21 11:20:27] [config] layer-normalization: false
[2018-08-21 11:20:27] [config] learn-rate: 0.0001
[2018-08-21 11:20:27] [config] log-level: info
[2018-08-21 11:20:27] [config] lr-decay: 0
[2018-08-21 11:20:27] [config] lr-decay-freq: 50000
[2018-08-21 11:20:27] [config] lr-decay-inv-sqrt: 0
[2018-08-21 11:20:27] [config] lr-decay-repeat-warmup: false
[2018-08-21 11:20:27] [config] lr-decay-reset-optimizer: false
[2018-08-21 11:20:27] [config] lr-decay-start:
[2018-08-21 11:20:27] [config]   - 10
[2018-08-21 11:20:27] [config]   - 1
[2018-08-21 11:20:27] [config] lr-decay-strategy: epoch+stalled
[2018-08-21 11:20:27] [config] lr-report: false
[2018-08-21 11:20:27] [config] lr-warmup: 0
[2018-08-21 11:20:27] [config] lr-warmup-at-reload: false
[2018-08-21 11:20:27] [config] lr-warmup-cycle: false
[2018-08-21 11:20:27] [config] lr-warmup-start-rate: 0
[2018-08-21 11:20:27] [config] max-length: 50
[2018-08-21 11:20:27] [config] max-length-crop: false
[2018-08-21 11:20:27] [config] max-length-factor: 3
[2018-08-21 11:20:27] [config] maxi-batch: 100
[2018-08-21 11:20:27] [config] maxi-batch-sort: trg
[2018-08-21 11:20:27] [config] mini-batch: 64
[2018-08-21 11:20:27] [config] mini-batch-fit: false
[2018-08-21 11:20:27] [config] mini-batch-fit-step: 10
[2018-08-21 11:20:27] [config] mini-batch-words: 0
[2018-08-21 11:20:27] [config] model: /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//model.npz
[2018-08-21 11:20:27] [config] multi-node: false
[2018-08-21 11:20:27] [config] multi-node-overlap: true
[2018-08-21 11:20:27] [config] n-best: false
[2018-08-21 11:20:27] [config] no-reload: false
[2018-08-21 11:20:27] [config] no-restore-corpus: false
[2018-08-21 11:20:27] [config] no-shuffle: false
[2018-08-21 11:20:27] [config] normalize: 0
[2018-08-21 11:20:27] [config] optimizer: adam
[2018-08-21 11:20:27] [config] optimizer-delay: 1
[2018-08-21 11:20:27] [config] overwrite: false
[2018-08-21 11:20:27] [config] quiet: false
[2018-08-21 11:20:27] [config] quiet-translation: false
[2018-08-21 11:20:27] [config] relative-paths: false
[2018-08-21 11:20:27] [config] right-left: false
[2018-08-21 11:20:27] [config] save-freq: 10000
[2018-08-21 11:20:27] [config] seed: 0
[2018-08-21 11:20:27] [config] skip: false
[2018-08-21 11:20:27] [config] sqlite: ""
[2018-08-21 11:20:27] [config] sqlite-drop: false
[2018-08-21 11:20:27] [config] sync-sgd: false
[2018-08-21 11:20:27] [config] tempdir: /tmp
[2018-08-21 11:20:27] [config] tied-embeddings: false
[2018-08-21 11:20:27] [config] tied-embeddings-all: false
[2018-08-21 11:20:27] [config] tied-embeddings-src: false
[2018-08-21 11:20:27] [config] train-sets:
[2018-08-21 11:20:27] [config]   - data/en-de/train.en.tokenized.truecased.bped
[2018-08-21 11:20:27] [config]   - data/en-de/train.de.tokenized.truecased.bped
[2018-08-21 11:20:27] [config]   - " "
[2018-08-21 11:20:27] [config] transformer-aan-activation: swish
[2018-08-21 11:20:27] [config] transformer-aan-depth: 2
[2018-08-21 11:20:27] [config] transformer-aan-nogate: false
[2018-08-21 11:20:27] [config] transformer-decoder-autoreg: self-attention
[2018-08-21 11:20:27] [config] transformer-dim-aan: 2048
[2018-08-21 11:20:27] [config] transformer-dim-ffn: 2048
[2018-08-21 11:20:27] [config] transformer-dropout: 0
[2018-08-21 11:20:27] [config] transformer-dropout-attention: 0
[2018-08-21 11:20:27] [config] transformer-dropout-ffn: 0
[2018-08-21 11:20:27] [config] transformer-ffn-activation: swish
[2018-08-21 11:20:27] [config] transformer-ffn-depth: 2
[2018-08-21 11:20:27] [config] transformer-heads: 8
[2018-08-21 11:20:27] [config] transformer-no-projection: false
[2018-08-21 11:20:27] [config] transformer-postprocess: dan
[2018-08-21 11:20:27] [config] transformer-postprocess-emb: d
[2018-08-21 11:20:27] [config] transformer-preprocess: ""
[2018-08-21 11:20:27] [config] transformer-tied-layers:
[2018-08-21 11:20:27] [config]   []
[2018-08-21 11:20:27] [config] type: transformer
[2018-08-21 11:20:27] [config] valid-freq: 10000
[2018-08-21 11:20:27] [config] valid-max-length: 1000
[2018-08-21 11:20:27] [config] valid-metrics:
[2018-08-21 11:20:27] [config]   - cross-entropy
[2018-08-21 11:20:27] [config] valid-mini-batch: 32
[2018-08-21 11:20:27] [config] word-penalty: 0
[2018-08-21 11:20:27] [config] workspace: 2048
[2018-08-21 11:20:27] [data] Loading vocabulary from JSON/Yaml file data/en-de/train.en.tokenized.truecased.bped.yml
[2018-08-21 11:20:27] [data] Setting vocabulary size for input 0 to 80808
[2018-08-21 11:20:27] [data] Loading vocabulary from JSON/Yaml file data/en-de/train.de.tokenized.truecased.bped.yml
[2018-08-21 11:20:28] [data] Setting vocabulary size for input 1 to 85868
[2018-08-21 11:20:28] [data] Creating vocabulary  .yml from  
[2018-08-21 11:20:28] File ' ' does not exist
Aborted from InputFileStream::InputFileStream(const string&) in /home/ltan/marian/src/marian/src/common/file_stream.h: 94
train-ibot-transformer.sh: line 63:   371 Aborted                 (core dumped) $MARIAN --model $MODELDIR/model.npz --type transformer --train-sets $TRAIN_SRC $TRAIN_TRG \ 
+ --vocab /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//vocab.src.yml /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//vocab.trg.yml --mini-batch-fit -w 10 --mini-batch 1000 --maxi-batch 1000 --valid-freq 5000 --save-freq 5000 --disp-freq 500 --valid-metrics ce-mean-words perplexity translation --valid-sets data/en-de/valid.en.tokenized.truecased.bped data/en-de/valid.de.tokenized.truecased.bped --valid-script-path validate.sh --valid-translation-output /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//valid.bpe.en.output --quiet-translation --beam-size 6 --normalize=0.6 --valid-mini-batch 16 --overwrite --keep-best --early-stopping 5 --cost-type=ce-mean-words --log /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//train.log --valid-log /home/ltan/ibot-train/model-transformer.en-de.tokenized.truecased.bped.100000.5000.0//valid.log --enc-depth 6 --dec-depth 6 --transformer-preprocess n --transformer-postprocess da --tied-embeddings-all --dim-emb 1024 --transformer-dim-ffn 4096 --transformer-dropout 0.1 --transformer-dropout-attention 0.1 --transformer-dropout-ffn 0.1 --label-smoothing 0.1 --learn-rate 0.0001 --lr-warmup 8000 --lr-decay-inv-sqrt 8000 --lr-report --optimizer-params 0.9 0.98 1e-09 --clip-norm 5 --devices 0 1 2 3 --optimizer-delay 2 --sync-sgd --seed --exponential-smoothing
train-ibot-transformer.sh: line 64: --vocab: command not found
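The `--vocab: command not found` error is the telltale sign that a line continuation broke partway through the command: the shell stopped parsing the `marian` invocation early and ran the next line as its own command. One way to catch this class of typo is to scan the script for backslashes followed by trailing whitespace (the script name below matches the log; the pattern is a generic check, not a Marian feature):

```shell
# Report any line where a backslash is followed only by blanks before the
# newline -- there the backslash escapes the space, not the newline, and
# the continuation silently breaks.
grep -nE '\\[[:blank:]]+$' train-ibot-transformer.sh
```

`grep` prints the offending line numbers and exits non-zero when the script is clean, so the check can also gate a CI step.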
alvations commented 6 years ago

Found the issue; it was a typo on my part.

This works:

$MARIAN/marian \
    --model $MODEL_DIR/model.npz --type transformer \
    --train-sets $DATA_DIR/data/all.paracrawl.8M.bpe.en $DATA_DIR/data/all.paracrawl.8M.bpe.de \
    --max-length 100 \

But when there is an extra space after $DATA_DIR/data/all.paracrawl.8M.bpe.de \, the backslash escapes the space instead of the newline. The shell then passes a literal single-space argument as a third entry to --train-sets (the ' ' visible in the log) and runs the following line as a separate command.

$MARIAN/marian \
    --model $MODEL_DIR/model.npz --type transformer \
    --train-sets $DATA_DIR/data/all.paracrawl.8M.bpe.en $DATA_DIR/data/all.paracrawl.8M.bpe.de \ 
    --max-length 100 \