Closed HURIMOZ closed 1 month ago
This is not an error. The YAML config you are using has share_decoder_embeddings: true,
so the model does not store the weights twice. We'll make this clearer and suppress the log message when this flag is ON.
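To illustrate why the "missing key" message appears, here is a minimal sketch (illustrative only, not eole's actual code): when the generator reuses the decoder embedding matrix, only one copy is serialized, so a loader that looks up a separate generator entry reports it as missing even though nothing is wrong.

```python
# Sketch of weight tying: the key names below are illustrative assumptions.
embedding_weight = [[0.1, 0.2], [0.3, 0.4]]  # decoder embedding matrix

# With sharing ON, the generator points at the same object...
generator_weight = embedding_weight

# ...so the saved state dict holds a single entry serving both roles.
state_dict = {"decoder.embeddings.weight": embedding_weight}

# A loader expecting a separate generator entry will not find one:
missing = [k for k in ("decoder.embeddings.weight", "generator.weight")
           if k not in state_dict]
print(missing)  # ['generator.weight']
```

The checkpoint is complete; the "missing" key is simply an alias of a tensor that was already saved.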
Oh I see. Yes, makes sense now. Switched to false. Thanks Vince!
Don't switch it to false. It is perfectly fine to share embeddings between 1) src and tgt and 2) decoder and generator; not sharing does not bring any improvement.
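For reference, the setting in question is a single YAML flag (a sketch; any surrounding keys in the actual wmt17 recipe are not shown here):

```yaml
# Keep embedding sharing enabled: the decoder embedding matrix and the
# generator (output projection) weights are tied and stored once.
share_decoder_embeddings: true
```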
Hi, I'm using the wmt17 recipe to build a bilingual translation model. I'm now able to train the models, but on inference I get this error:
The rest of the inference seems to run fine:
In my models repository, three files are generated for every step saved:
This is my bash command for inference:
eole predict --src processed_data/test.src.bpe --model_path models/step_7000 --beam_size 5 --batch_size 2048 --batch_type tokens --output translations/test.trg.bpe --gpu 0
What am I doing wrong to get the error "Missing key in safetensors checkpoint: generator.weight"?