Open bensussman opened 6 years ago
The charCNN models were trained with seq2seq-attn, please see the instructions there. Once a model has been trained, you can use this script to inspect the weights. The meanChar models were trained with our modified version of that code, so let me know if you require that as well.
Thanks @boknilev. I wasn't sure what the entry point was.
Given your advice I downloaded the pre-trained models here: https://drive.google.com/file/d/0BzhmYioWLRn_aEVnd0ZNcWd0Y2c/view and ran:
th dump_charcnn_weights.lua -model trained_models/en-to-de-model.t7
I ran this on my GPU machine with CUDA installed. It ran for just under 30 minutes, logged nothing to stdout or stderr, wrote nothing to disk, and exited with exit code 1. Presumably that is not the expected behavior. Any advice on how to debug this?
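For a silent failure like this, it can help to capture both output streams and the exit code explicitly, so that nothing is lost to buffering or a closed terminal. A minimal sketch, assuming a POSIX shell; run_logged is a hypothetical helper name and is not part of the repository:

```shell
# Hypothetical helper: run a command, send stdout and stderr to a log file,
# and print the command's exit code afterwards.
run_logged() {
  rl_log="$1"
  shift
  rl_status=0
  # Redirect both streams to the log; remember a non-zero exit status.
  "$@" >"$rl_log" 2>&1 || rl_status=$?
  echo "exit code: $rl_status (log: $rl_log)"
  return "$rl_status"
}

# Example invocation (assumes th is on PATH and the model file exists):
#   run_logged dump.log th dump_charcnn_weights.lua -model trained_models/en-to-de-model.t7
```

If the log file is still empty after a failing run, the script probably exited before printing anything, which at least narrows down where to look.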
I think these models don't have the charCNN part. You'll need to train a model with the character options from seq2seq-attn: -use_chars_enc 1 and -cudnn 1 (cudnn is not a must, but dump_charcnn_weights.lua expects it).
We could possibly make some pre-trained models available if people find that useful. @ybisk, what do you think?
I am trying to recreate your results from "Synthetic and Natural Noise Both Break Neural Machine Translation" and run the model on my own systems.
Is there any documentation about how to run the model, and what setup is needed? I am running it on Ubuntu 16.04 with CUDA and Torch already installed (using the 9.0-devel-ubuntu16.04 Docker image found here: https://hub.docker.com/r/nvidia/cuda/tags/).
So far, some things I've found to be required:
The only main method is in extract_states.lua, but running
th extract_states.lua
breaks on the line
require 'models.lua'
Thanks in advance for your help!