Error when trying to start training

bertanunez commented 6 years ago

When I run the configuration file I get this error and I don't know how to solve it. Could you help me?

Traceback (most recent call last):
  File "/nmtpytorch/pytorch/bin/nmtpy", line 6, in <module>
    exec(compile(open(__file__).read(), __file__, 'exec'))
  File "/nmtpytorch/bin/nmtpy", line 120, in <module>
    model = getattr(models, opts.train['model_type'])(opts=opts, logger=log)
  File "/nmtpytorch/nmtpytorch/models/nmt.py", line 44, in __init__
    self.vocabs[lang] = Vocabulary(opts.vocabulary[lang])
  File "nmtpytorch/nmtpytorch/vocabulary.py", line 29, in __init__
    self._map = json.load(open(self.vocab))
  File "/usr/lib/python3.5/json/__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "/usr/lib/python3.5/json/__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "/usr/lib/python3.5/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib/python3.5/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

ozancaglayan commented 6 years ago

Hello,

Which configuration did you use? Can you send the [vocabulary] section of it?

bertanunez commented 6 years ago

[vocabulary]
en: ~/nmtpytorch/dataset-master/scripts/subword-nmt/vocab.en
fr: ~/nmtpytorch/dataset-master/scripts/subword-nmt/vocab.fr

vocab.en and vocab.fr are the vocabulary files created with learn_joint_bpe_and_vocab.py from the subword-nmt repository.

ozancaglayan commented 6 years ago

Once you have applied subword segmentation to your files, you need to run nmtpy-build-vocab to create the nmtpytorch-specific vocabulary files. So the vocabulary files in our case are the ones produced by nmtpy-build-vocab, not the subword-nmt vocab files.
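For anyone hitting the same JSONDecodeError: the traceback shows the vocabulary file being read with json.load, so a plain-text vocab file fails on its very first character. Below is a minimal sketch for checking a file before training. The helper name check_vocab_file is hypothetical and not part of nmtpytorch; it only tests whether the file parses as a JSON object, not whether its contents are what nmtpytorch expects.

```python
import json
import os


def check_vocab_file(path):
    """Return (ok, message) saying whether `path` parses as a JSON object.

    Hypothetical helper for illustration; not part of nmtpytorch.
    """
    # Expand "~" ourselves, since open() does not do it.
    full_path = os.path.expanduser(path)
    try:
        with open(full_path, encoding="utf-8") as handle:
            data = json.load(handle)
    except json.JSONDecodeError as exc:
        # A plain-text vocab (one "token count" pair per line) ends up here.
        return False, "not valid JSON: {}".format(exc)
    except (OSError, UnicodeDecodeError) as exc:
        return False, "cannot read file: {}".format(exc)
    if not isinstance(data, dict):
        return False, "valid JSON, but not an object"
    return True, "JSON object with {} entries".format(len(data))
```

Calling check_vocab_file on each path listed in the [vocabulary] section should return ok == False for a plain-text subword vocab file and ok == True for a JSON one.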

bertanunez commented 6 years ago

Thank you Ozan!

Now I have another problem using nmtpy-build-vocab. What are the file (sentence files) arguments?

ozancaglayan commented 6 years ago

Those are the training data files for which you want to generate the vocabulary files.

Assume you applied BPE to your training data and now have 2 files: train.bpe.en and train.bpe.de.

Calling nmtpy-build-vocab train.bpe.en train.bpe.de will produce 2 vocabulary files (in .json format) that you should give to the NMT model in the configuration file.
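To illustrate the general idea of turning tokenized training sentences into a JSON vocabulary, here is a rough sketch. It is not the nmtpy-build-vocab implementation: the function name build_vocab, the special tokens, and the token-to-index layout are all assumptions made for illustration, and the real tool's output format may differ. Use the files produced by nmtpy-build-vocab for actual training.

```python
import json
from collections import Counter


def build_vocab(lines, specials=("<pad>", "<bos>", "<eos>", "<unk>")):
    """Map whitespace-separated tokens to integer ids, most frequent first.

    Illustrative only: the special tokens and layout are assumptions,
    not the format written by nmtpy-build-vocab.
    """
    counts = Counter(tok for line in lines for tok in line.split())
    # Reserve the first ids for the (assumed) special tokens.
    vocab = {tok: idx for idx, tok in enumerate(specials)}
    # Sort by descending frequency, breaking ties alphabetically.
    for tok, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        vocab.setdefault(tok, len(vocab))
    return vocab


if __name__ == "__main__":
    sample = ["a b a", "b a c"]
    print(json.dumps(build_vocab(sample), ensure_ascii=False, indent=2))
```

The point of the sketch is only that the result is a JSON mapping built from the training sentences, which is why the segmented training files, rather than the subword vocab files, are the inputs.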

bertanunez commented 6 years ago

If I get an error related to CUDA, does that mean I have installed something incorrectly? I get this error:

  File "nmtpytorch/pytorch/lib/python3.5/site-packages/torch/cuda/__init__.py", line 55, in _check_driver
    raise AssertionError("Torch not compiled with CUDA enabled")
AssertionError: Torch not compiled with CUDA enabled

ozancaglayan commented 6 years ago

Yes, you probably have a problem in your installation :/

Why do you have a pytorch folder inside nmtpytorch?

I suggest you install Anaconda with Python 3.6 and then get the pip package of PyTorch from pytorch.org, which bundles the required CUDA libraries. Make sure to pick the correct CUDA version; this depends on the NVIDIA driver installed on your system.
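A quick way to confirm whether the installed PyTorch build has CUDA support is sketched below. The function name cuda_report is hypothetical; torch.cuda.is_available() and torch.version.cuda are regular PyTorch attributes, though the latter is read with getattr here in case an older release lacks it.

```python
def cuda_report():
    """Summarize whether PyTorch is installed and whether it can use CUDA.

    Hypothetical helper for illustration only.
    """
    try:
        import torch
    except ImportError:
        return {"torch_installed": False, "cuda_build": None, "cuda_available": False}
    return {
        "torch_installed": True,
        # CUDA version the package was built against; None for CPU-only builds.
        "cuda_build": getattr(torch.version, "cuda", None),
        # True only if the build has CUDA support and a usable GPU/driver is found.
        "cuda_available": bool(torch.cuda.is_available()),
    }


if __name__ == "__main__":
    print(cuda_report())
```

If cuda_build is None, the installed package is most likely a CPU-only build, which would be consistent with the "Torch not compiled with CUDA enabled" message. If cuda_build is set but cuda_available is False, the driver or GPU is the more likely culprit.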

lium-lst / nmtpytorch

Error when trying to start training #7