ybisk / charNMT-noise

Scripts and noise data for Belinkov & Bisk 2018

Help needed running the code #1

Open bensussman opened 6 years ago

bensussman commented 6 years ago

I am trying to reproduce your results from "Synthetic and Natural Noise Both Break Neural Machine Translation" and run the model on my own systems.

Is there any documentation on how to run the model and what setup is needed? I am running it on Ubuntu 16.04 with CUDA and Torch already installed (using the 9.0-devel-ubuntu16.04 Docker image found here: https://hub.docker.com/r/nvidia/cuda/tags/).

Here is what I've found so far:

The only main method is in extract_states.lua, but running th extract_states.lua fails on the line require 'models.lua' with:

/root/distro/install/bin/luajit: /root/distro/install/share/lua/5.1/trepl/init.lua:389: module 'models.lua' not found:No LuaRocks module found for models.lua
    no field package.preload['models.lua']
    no file '/root/.luarocks/share/lua/5.1/models/lua.lua'

Thanks in advance for your help!

boknilev commented 6 years ago

The charCNN models were trained with seq2seq-attn, please see the instructions there. Once a model has been trained, you can use this script to inspect the weights. The meanChar models were trained with our modified version of that code, so let me know if you require that as well.

bensussman commented 6 years ago

Thanks, @boknilev. I wasn't sure what the entry point was.

Following your advice, I downloaded the pre-trained models from https://drive.google.com/file/d/0BzhmYioWLRn_aEVnd0ZNcWd0Y2c/view and ran:

th dump_charcnn_weights.lua -model trained_models/en-to-de-model.t7

I ran this on my GPU machine with CUDA installed. It ran for just under 30 minutes, logged nothing to stdout or stderr, wrote nothing to disk, and exited with exit code 1. Presumably that is not the expected behavior. Any advice on how to debug this?
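
One way to narrow down a silent failure like this is to capture stdout, stderr, and the exit code separately, so it is clear whether the script really printed nothing. The sketch below is only an illustration: run_logged and the log file names are hypothetical helpers, not part of this repository.

```shell
# Run a command, keep stdout and stderr in separate files, and report the exit code.
# run_logged and the log file names are hypothetical, not part of the repo.
run_logged() {
  log_prefix=$1
  shift
  status=0
  "$@" >"${log_prefix}.out" 2>"${log_prefix}.err" || status=$?
  echo "exit code: ${status}"
  echo "stdout bytes: $(wc -c <"${log_prefix}.out" | tr -d ' ')"
  echo "stderr bytes: $(wc -c <"${log_prefix}.err" | tr -d ' ')"
  return "${status}"
}

# Example with a stand-in command that writes to stderr and fails:
run_logged "${TMPDIR:-/tmp}/demo" sh -c 'echo oops >&2; exit 1' || true
```

For the command above, the call would look like run_logged dump th dump_charcnn_weights.lua -model trained_models/en-to-de-model.t7, leaving dump.out and dump.err behind for inspection.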

boknilev commented 6 years ago

I think these models don't have the charCNN part. You'll need to train a model with the character options from seq2seq-attn: -use_chars_enc 1 and -cudnn 1 (cudnn is not a must, but dump_charcnn_weights.lua expects it).
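
As a rough sketch of what such a training call might look like, the helper below only prints a command line instead of running it. The function name charcnn_train_cmd, the data prefix, and the model name are hypothetical placeholders. The -data_file, -val_data_file, -savefile, and -gpuid options are taken from the seq2seq-attn instructions as I understand them, so please check them against that repository. The data would presumably also need to be preprocessed with character-level information first.

```shell
# Print (rather than run) a seq2seq-attn training command with the character options.
# charcnn_train_cmd, the data prefix, and the model name are hypothetical placeholders.
charcnn_train_cmd() {
  data_prefix=$1
  save_name=$2
  printf 'th train.lua -data_file %s-train.hdf5 -val_data_file %s-val.hdf5 -savefile %s -use_chars_enc 1 -cudnn 1 -gpuid 1\n' \
    "$data_prefix" "$data_prefix" "$save_name"
}

# Show the command for a hypothetical dataset prefix and model name:
charcnn_train_cmd data/demo charcnn-model
```

Once training has produced a model file, that file is what would be passed to dump_charcnn_weights.lua via -model.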

boknilev commented 6 years ago

We could possibly make some pre-trained models available if people would find that useful. @ybisk, what do you think?