lena-voita / good-translation-wrong-in-context

This is a repository with the data and code for the ACL 2019 paper "When a Good Translation is Wrong in Context: ..." and the EMNLP 2019 paper "Context-Aware Monolingual Repair for Neural Machine Translation"

Translation results from a script based on the notebooks differ from those decoded during training #12

Open annisamansa opened 3 years ago

annisamansa commented 3 years ago

Dear authors, I wrote a script based on Load_model_and_translate_baseline.ipynb to translate a test file, but the translations it produces are very different from those decoded during training. Moreover, the translations bear little relation to the source text, even though I am sure the source text is the same as the dev set. How can I make the two outputs match? My script is below.

import sys
import pickle
import numpy as np
import tensorflow as tf

REPO_PATH = 'xxxxxxxxxxxxx/good-translation-wrong-in-context/'
# Make the repository's `lib` package importable
sys.path.insert(0, REPO_PATH)
print(sys.path)
import lib
import lib.task.seq2seq.models.transformer as tr
VOC_PATH = REPO_PATH + '/scripts/build/'

def cli_main():
    # Load the source and target vocabularies
    inp_voc = pickle.load(open(VOC_PATH + 'src.voc', 'rb'))
    out_voc = pickle.load(open(VOC_PATH + 'dst.voc', 'rb'))
    testid = inp_voc.ids(['BOS', 'EOS', 'UNK'])
    print(testid)

    tf.reset_default_graph()
    gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.99, allow_growth=True)
    sess = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=gpu_options))

    hp = {
        # the same as in the training script
        }
    model = tr.Model('mod', inp_voc, out_voc, inference_mode='fast', **hp)

    # Restore the trained weights from the checkpoint
    path_to_ckpt = REPO_PATH + '/scripts/build/checkpoint/model-2048.npz'
    var_list = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES)
    lib.train.saveload.load(path_to_ckpt, var_list)

    # Read the test sentences and translate them
    path_to_testset = REPO_PATH + '/scripts/e2c/test_src'
    test_src = open(path_to_testset).readlines()
    print(test_src)

    model.translate_lines(test_src)

if __name__ == '__main__':
    cli_main()