Issue with ensemble scoring method

lvapeab / nmt-keras

Neural Machine Translation with Keras

http://nmt-keras.readthedocs.io

MIT License

Issue with ensemble scoring method #134

Status: closed (NamTran838P closed this issue 4 years ago)

NamTran838P commented 4 years ago

I am using the function score_corpus in apply_model.py and encountering the following issue:

[Screenshot "Capture" showing the error; image not preserved]

My code is as follows:

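import argparse

# Import paths below assume the nmt-keras repository layout.
from config import load_parameters
from nmt_keras.apply_model import score_corpus


def main():
    params = load_parameters()
    # Build the arguments by hand instead of going through the command-line parser.
    args = argparse.Namespace()
    args.dataset = "dataset/Dataset_tutorial_dataset.pkl"
    args.source = "validation_error.txt"
    args.target = "validation_correct.txt"
    args.dest = "ensemble_score.txt"
    args.weights = [0.33, 0.33, 0.33]
    args.verbose = True
    args.n_best = 3
    args.splits = ['test']
    args.models = ["ensemble_models/lstm_128", "ensemble_models/lstm_256", "ensemble_models/lstm_1024"]
    score_corpus(args, params)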

It appears that the problem is in the scoreNet function in multimodal_keras_wrapper. Any help is really appreciated! Thanks

lvapeab commented 4 years ago

I've been unable to reproduce your error, but I have pushed a small fix for an edge case. Please update the repo and try again.

If it doesn't work, can you set args.splits = ['val'] and try again?

If it still doesn't work, can you provide more info about the dataset? For instance:

from keras_wrapper.saving import loadDataset
ds = loadDataset('dataset/Dataset_tutorial_dataset.pkl')
print("ids_inputs:", ds.ids_inputs)
print("ids_outputs:", ds.ids_outputs)
print("types_inputs:", ds.types_inputs)
print("types_outputs:", ds.types_outputs)
print("len_train:", ds.len_train)
print("len_val:", ds.len_val)
print("len_test:", ds.len_test)
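
For readers without access to the pickle, here is a minimal sketch of the same inspection wrapped in two helpers and run against a stand-in object. The helper names (summarize_dataset, empty_splits) and every value in fake_ds are hypothetical; only the attribute names come from the snippet above. It may help to spot a split with no samples before choosing args.splits.

```python
from types import SimpleNamespace

# Attribute names taken from the inspection snippet above.
FIELDS = ("ids_inputs", "ids_outputs", "types_inputs", "types_outputs",
          "len_train", "len_val", "len_test")


def summarize_dataset(ds):
    """Collect the fields of interest into a dict (None if a field is missing)."""
    return {name: getattr(ds, name, None) for name in FIELDS}


def empty_splits(ds):
    """Return the names of the splits whose recorded length is zero or missing."""
    lengths = {"train": "len_train", "val": "len_val", "test": "len_test"}
    return [split for split, attr in lengths.items()
            if not getattr(ds, attr, 0)]


# Stand-in object with made-up values; a real Dataset would come from loadDataset.
fake_ds = SimpleNamespace(
    ids_inputs=["source_text", "state_below"],
    ids_outputs=["target_text"],
    types_inputs=["text", "text"],
    types_outputs=["text"],
    len_train=1000, len_val=100, len_test=0,
)

summary = summarize_dataset(fake_ds)
missing = empty_splits(fake_ds)
```

With a real dataset, passing the object returned by loadDataset in place of fake_ds should work as long as it exposes the same attributes.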

NamTran838P commented 4 years ago

I just updated everything and tried setting args.splits = ['val']. The following error showed up:

Here is more information about the dataset:

Thanks a lot for your help.

lvapeab commented 4 years ago

I think I was able to reproduce your problem: it looks like you are updating a dataset of type text with data of type text_features. This has been fixed in multimodal_keras_wrapper (https://github.com/MarcBS/multimodal_keras_wrapper/commit/1349edaaa0e13092a72280bb24316b460ed841de).

I've also slightly modified update_dataset_from_file, so please update both multimodal_keras_wrapper and nmt_keras and try again.
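
The kind of mismatch described in this comment (a dataset declared as type text being updated with data of type text_features) can be illustrated with a small check. This is only a sketch under stated assumptions: find_type_mismatches is a hypothetical helper, not part of multimodal_keras_wrapper, and fake_ds is a stand-in that only mimics the ids_inputs and types_inputs attributes.

```python
from types import SimpleNamespace


def find_type_mismatches(ds, incoming):
    """Compare the types declared in the dataset with the types of new data.

    `incoming` maps an input id to the type of the data about to be added.
    Returns a list of (id, declared_type, incoming_type) tuples.
    """
    declared = dict(zip(ds.ids_inputs, ds.types_inputs))
    return [(id_, declared[id_], new_type)
            for id_, new_type in incoming.items()
            if id_ in declared and declared[id_] != new_type]


# Stand-in dataset with a made-up input id declared as plain text.
fake_ds = SimpleNamespace(ids_inputs=["source_text"], types_inputs=["text"])

# Hypothetical update that supplies text_features for the same id.
mismatches = find_type_mismatches(fake_ds, {"source_text": "text_features"})
```

An empty result means the declared and incoming types agree for every id that is present in both.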