Error while training with the following command python main.py

frajos100 commented 6 years ago

Error while training with the following command python main.py.

Eventually, training stops with the following error. How do we resolve it?

[21/05/2018 14:41:05] Evaluating on metric coco
Traceback (most recent call last):
  File "main.py", line 444, in <module>
    train_model(parameters, args.dataset)
  File "main.py", line 167, in train_model
    nmt_model.trainNet(dataset, training_params)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\cnn_model.py", line 829, in trainNet
    self.train(ds, params)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\cnn_model.py", line 1088, in train
    initial_epoch=params['epoch_offset'])
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\engine\training.py", line 1260, in fit_generator
    initial_epoch=initial_epoch)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\engine\training_generator.py", line 229, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\callbacks.py", line 76, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\callbacks.py", line 277, in on_epoch_end
    self.evaluate(epoch, counter_name='epoch')
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\callbacks.py", line 486, in evaluate
    split=s)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\evaluation.py", line 60, in get_cocoscore
    score, = scorer.compute_score(refs, hypo)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\coco-caption\pycocoevalcap\ter\ter.py", line 50, in compute_score
    with open(self.ref_filename, 'w', encoding='utf-8') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/1867076.ref'

lvapeab commented 6 years ago

Hi,

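It seems to be an error while computing the TER metric. I think it is due to the use of Windows: the scorer tries to write to '/tmp/1867076.ref', and a /tmp directory usually does not exist there. I have never tested the software on Windows (and I don't intend to).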

However, computing TER is not required. So, as a workaround, you can edit the file c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\evaluation.py and comment out line 51.

Cheers
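For anyone who would rather keep TER than disable it, here is a minimal sketch of building the temporary file path in a portable way. It is not the code in ter.py: the helper name `make_ref_filename` and the use of the process id as the file stem are assumptions made for illustration. The only thing taken from the traceback is that the scorer opens a `.ref` file under a hard-coded /tmp.

```python
import os
import tempfile


def make_ref_filename(stem, tmp_dir=None):
    """Return a path for a temporary '.ref' file inside an existing directory.

    Hypothetical helper: `stem` stands in for whatever identifier the scorer
    uses to name its files (the traceback only shows a number).
    """
    if tmp_dir is None:
        # gettempdir() picks the platform's temp directory
        # (e.g. /tmp on Linux, the user's Temp folder on Windows).
        tmp_dir = tempfile.gettempdir()
    return os.path.join(tmp_dir, "{}.ref".format(stem))


# Write a reference file the same way the failing line does, but to a
# directory that exists on this platform, then clean up.
ref_filename = make_ref_filename(os.getpid())
with open(ref_filename, "w", encoding="utf-8") as f:
    f.write("a reference sentence\n")
os.remove(ref_filename)
```

Whether the rest of the TER scorer then works on Windows is untested; this only addresses the missing directory.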

frajos100 commented 6 years ago

Thank you, Alvaro. I am now facing another issue. I am using my Windows laptop, which is CPU-only. Would you suggest using a Linux VM with a GPU?

[22/05/2018 12:02:22] Evaluating on metric coco
Traceback (most recent call last):
  File "main.py", line 444, in <module>
    train_model(parameters, args.dataset)
  File "main.py", line 167, in train_model
    nmt_model.trainNet(dataset, training_params)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\cnn_model.py", line 829, in trainNet
    self.train(ds, params)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\cnn_model.py", line 1088, in train
    initial_epoch=params['epoch_offset'])
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\legacy\interfaces.py", line 91, in wrapper
    return func(*args, **kwargs)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\engine\training.py", line 1260, in fit_generator
    initial_epoch=initial_epoch)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\engine\training_generator.py", line 229, in fit_generator
    callbacks.on_epoch_end(epoch, epoch_logs)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras\keras\callbacks.py", line 76, in on_epoch_end
    callback.on_epoch_end(epoch, logs)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\callbacks.py", line 277, in on_epoch_end
    self.evaluate(epoch, counter_name='epoch')
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\callbacks.py", line 486, in evaluate
    split=s)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\keras-wrapper\keras_wrapper\extra\evaluation.py", line 59, in get_cocoscore
    score, = scorer.compute_score(refs, hypo)
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\coco-caption\pycocoevalcap\meteor\meteor.py", line 40, in compute_score
    stat = self._stat(res[i][0], gts[i])
  File "c:\users\f.fernandes\appdata\local\programs\python\python36\scripts\src\coco-caption\pycocoevalcap\meteor\meteor.py", line 69, in _stat
    self.meteor_p.stdin.flush()
OSError: [Errno 22] Invalid argument

lvapeab commented 6 years ago

Hi,

This is a similar issue, this time with the METEOR metric. You should comment out lines 55-56 of the same file.

I strongly recommend using a GPU. Otherwise, training will be prohibitively slow.
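As an alternative to commenting out lines by hand, the two metric failures in this thread could be tolerated with a guard like the one sketched below. This is not the code in evaluation.py: `compute_scores_safely` and the two dummy scorer classes are hypothetical. It assumes that each scorer exposes `compute_score(refs, hypo)` returning a pair (overall score, per-sentence scores), as the tracebacks suggest. Both errors seen here (`FileNotFoundError` and `OSError`) are caught by `except OSError`, since the former is a subclass of the latter.

```python
def compute_scores_safely(scorers, refs, hypo):
    """Run each (scorer, name) pair and skip metrics that raise an OSError."""
    results, skipped = {}, []
    for scorer, name in scorers:
        try:
            # Assumed return shape: (overall score, per-sentence scores).
            score, _ = scorer.compute_score(refs, hypo)
        except OSError as err:
            # Record the failure instead of aborting the whole evaluation.
            skipped.append((name, str(err)))
            continue
        results[name] = score
    return results, skipped


class WorkingScorer:
    """Dummy scorer standing in for a metric that runs fine."""

    def compute_score(self, refs, hypo):
        return 0.5, [0.5]


class BrokenScorer:
    """Dummy scorer that fails the way the TER scorer does above."""

    def compute_score(self, refs, hypo):
        raise FileNotFoundError(2, "No such file or directory")


results, skipped = compute_scores_safely(
    [(WorkingScorer(), "dummy"), (BrokenScorer(), "broken")],
    refs={},
    hypo={},
)
print(results)
print([name for name, _ in skipped])
```

The trade-off is that a skipped metric goes missing from the results without stopping the run, so anything that depends on that metric would need to handle its absence.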

frajos100 commented 6 years ago

Thank you so much. We have a GPU VM that we stopped because, even after 194k steps (2 weeks on a GPU instance), the French-to-English translation quality did not improve (link). I would restart the GPU instance only if I were 60% sure it would help. Since translation quality depends on the dataset, are you aware of anywhere we could download a high-quality French-to-English or Hindi-to-English dataset? Also, could we resume the training with the changed dataset?

lvapeab commented 6 years ago

The overall performance of an NMT (or SMT) system heavily depends on the quantity and quality of the data. You can check WMT'18 to obtain large datasets.

Moreover, the choice of hyperparameters, regularization strategies, etc. is critical in NMT. I recommend reading some tutorials on NMT (https://arxiv.org/pdf/1703.01619.pdf or https://arxiv.org/pdf/1709.07809.pdf).

lvapeab / nmt-keras

Error while training with the following command python main.py #49