Closed: angelo-megna94 closed this issue 4 years ago
Hi @angelo-megna94
I think it might be due to the names of the metrics being lowercased in the latest version of easse.
Can you please add print(scores) just before the combine_metrics() call?
You will most likely need to lowercase the metric names, i.e. combine_metrics(scores['bleu'], scores['sari'], scores['fkgl'], metrics_coefs)
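A minimal sketch of that workaround, assuming recent easse versions return metric names in lowercase ('bleu', 'sari', 'fkgl') where older versions returned them uppercase. The combine_metrics body and the metrics_coefs values below are hypothetical stand-ins, not the actual implementation in access/fairseq/main.py:

```python
def lowercase_keys(scores):
    """Normalize metric names so lookups work for either easse convention."""
    return {name.lower(): value for name, value in scores.items()}

def combine_metrics(bleu, sari, fkgl, coefs):
    # Hypothetical weighted combination of the three metrics.
    return bleu * coefs[0] + sari * coefs[1] + fkgl * coefs[2]

# Old-style uppercase keys, as in the KeyError from the traceback:
scores = lowercase_keys({'BLEU': 40.0, 'SARI': 30.0, 'FKGL': 8.0})
metrics_coefs = [0.0, 1.0, 0.0]  # hypothetical: optimize SARI only
print(combine_metrics(scores['bleu'], scores['sari'], scores['fkgl'],
                      metrics_coefs))  # -> 30.0
```

Normalizing the keys once is more robust than lowercasing each lookup, since it keeps working whichever casing the installed easse version emits.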
Hi @angelo-megna94 ,
Did you solve the issue?
Hi @louismartin ,
Sorry for not getting back to you, I was busy with something else. I'll try everything in the next few days and keep you updated.
Ok perfect thanks, good luck, tell me if it works and then I will fix the bug :)
Hi @louismartin ,
training went fine; lowercasing the metric names was the right fix. Feel free to fix the bug whenever you can.
If it's not too much trouble, I have another question: to get the SARI and BLEU scores I obtained, I have to run evaluate.py, right? If so, should I look at the SARI value or the SARI_LEGACY value?
Thanks
Yes, you need to run evaluate.py. You should modify the file a little bit though: recommended_preprocessors_kwargs needs to be updated with the values that were found during training and printed to stdout if you want to evaluate on TurkCorpus. If you want to evaluate on another dataset, you should find the best recommended_preprocessors_kwargs for that dataset.
SARI_LEGACY is what other people have used up until now, but the new SARI fixes many bugs. You can find the new SARI scores of various systems (ACCESS among others) in Table 3 of this paper: https://arxiv.org/abs/2005.00352
The recommended_preprocessors_kwargs need to be updated with those that were found during training and printed to stdout...
You mean the values shown on the terminal at the end of the training?
For example, it returned: recommended_preprocessors_kwargs = {'LengthRatioPreprocessor': {'target_ratio': 0.98}, 'LevenshteinPreprocessor': {'target_ratio': 0.79}, etc etc}
Yes, exactly those; they are optimized to maximize SARI on TurkCorpus.
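Concretely, the values printed at the end of training would be pasted into evaluate.py roughly like this. This is only a sketch using the two entries the training run reported above; any further preprocessors from the printed dict would be added the same way:

```python
# Values found during training and printed to stdout, pasted into
# evaluate.py's recommended_preprocessors_kwargs (example values from
# the training run discussed in this thread).
recommended_preprocessors_kwargs = {
    'LengthRatioPreprocessor': {'target_ratio': 0.98},
    'LevenshteinPreprocessor': {'target_ratio': 0.79},
}
print(recommended_preprocessors_kwargs['LengthRatioPreprocessor'])
```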
Okay, thanks a lot for the help. In my case I am using a custom dataset that mimics the TurkCorpus structure, so I think I will use the values it returned. That is sufficient for my current studies, even if the SARI is not high.
Ok perfect, good luck. Any more questions, or can I close the issue?
You can close. Have a nice day!
Hi @louismartin , I'm sorry to bother you, but I have a problem at the end of training. Everything goes well up to the last epoch, but at the end it gives me the following error:
File "scripts/train.py", line 53, in <module>
  fairseq_train_and_evaluate(**kwargs)
File "/home/usr/Scrivania/folder/access/access/utils/training.py", line 18, in wrapped_func
  return func(*args, **kwargs)
File "/home/usr/Scrivania/folder/access/access/utils/training.py", line 29, in wrapped_func
  return func(*args, **kwargs)
File "/home/usr/Scrivania/folder/access/access/utils/training.py", line 38, in wrapped_func
  result = func(*args, **kwargs)
File "/home/usr/Scrivania/folder/access/access/utils/training.py", line 50, in wrapped_func
  result = func(*args, **kwargs)
File "/home/usr/Scrivania/folder/access/access/fairseq/main.py", line 125, in fairseq_train_and_evaluate
  parametrization_budget)
File "/home/usr/Scrivania/folder/access/access/fairseq/main.py", line 91, in find_best_parametrization
  recommendation = optimizer.optimize(evaluate_parametrization, verbosity=0)
File "/home/usr/Scrivania/folder/venv/lib/python3.6/site-packages/nevergrad/optimization/base.py", line 543, in optimize
  return self.minimize(objective_function, executor=executor, batch_mode=batch_mode, verbosity=verbosity)
File "/home/usr/Scrivania/folder/venv/lib/python3.6/site-packages/nevergrad/optimization/base.py", line 503, in minimize
  self.tell(x, job.result())
File "/home/usr/Scrivania/folder/venv/lib/python3.6/site-packages/nevergrad/optimization/utils.py", line 150, in result
  self._result = self.func(*self.args, **self.kwargs)
File "/home/usr/Scrivania/folder/access/access/fairseq/main.py", line 62, in evaluate_parametrization
  return combine_metrics(scores['BLEU'], scores['SARI'], scores['FKGL'], metrics_coefs)
KeyError: 'BLEU'
I can't understand why it gives me this error; easse, fairseq, and all the other libraries are up to date.
Can you help me? Thank you in advance.