hplt-project / sacremoses

Python port of Moses tokenizer, truecaser and normalizer
MIT License

NameError: name 'words' is not defined in truecaser #87

Closed: butsugiri closed this issue 4 years ago

butsugiri commented 4 years ago

Hi, thank you for your great work.

I was using sacremoses truecase and encountered the following error:

  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/bin/sacremoses", line 11, in <module>
    sys.exit(cli())
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/click/core.py", line 764, in __call__
    return self.main(*args, **kwargs)
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/click/core.py", line 717, in main
    rv = self.invoke(ctx)
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/click/core.py", line 1137, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/click/core.py", line 956, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/click/core.py", line 555, in invoke
    return callback(*args, **kwargs)
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/sacremoses/cli.py", line 195, in truecase_file
    print(moses.truecase(line, return_str=True), end="\n", file=fout)
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/sacremoses/truecase.py", line 265, in truecase
    tokens = self.split_xml(text)
  File "/home/kiyono/.pyenv/versions/miniconda-3.9.1/envs/chainer4/lib/python3.6/site-packages/sacremoses/truecase.py", line 349, in split_xml
    and len(words) > 0
NameError: name 'words' is not defined

I have not looked into the details of the code, but it seems that the variable words is indeed not defined in the method. Maybe it should be tokens instead?
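For readers unfamiliar with this failure mode, here is a minimal sketch of the kind of bug the traceback points to. It is not the sacremoses source: the functions `split_buggy` and `split_fixed` and their logic are hypothetical, and the sketch assumes, as guessed above, that `words` was meant to be `tokens`. Python resolves a name only when the expression is evaluated, so a stale variable name in a rarely taken branch goes unnoticed until some input reaches it.

```python
def split_buggy(text):
    """Hypothetical stand-in for the failing method, not the real split_xml."""
    tokens = []
    for piece in text.split():
        # `words` is never assigned anywhere. Because `and` short-circuits,
        # len(words) is evaluated only when the piece starts with "<",
        # so the NameError appears only for inputs that contain such a piece.
        if piece.startswith("<") and len(words) > 0:
            tokens.append("|")
        tokens.append(piece)
    return tokens


def split_fixed(text):
    """Same hypothetical logic, referring to the list that actually exists."""
    tokens = []
    for piece in text.split():
        if piece.startswith("<") and len(tokens) > 0:
            tokens.append("|")
        tokens.append(piece)
    return tokens
```

In this sketch `split_buggy("hello world")` runs without complaint, while `split_buggy("a <b>")` raises `NameError: name 'words' is not defined`, which matches the shape of the error in the traceback.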

HaukurPall commented 4 years ago

Hi butsugiri,

I also ran into this issue and attempted to fix it. It seems to work for me, but the pull request with my changes has not been approved. You could try pulling from my fork to fix the issue, or just make the changes yourself.

Hope it helps =)

alvations commented 4 years ago

Thanks @HaukurPall for the fix! And thanks @butsugiri for raising the issue.

The issue is fixed in the latest commit and version.

pip install -U sacremoses==0.0.40
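
To confirm that the upgrade took effect, the installed version can be compared against 0.0.40. This is a minimal sketch, assuming Python 3.8+ for `importlib.metadata`. The helper names `version_tuple` and `has_fix` are hypothetical, and the parsing only handles plain dotted numeric versions such as `0.0.40`.

```python
from importlib.metadata import PackageNotFoundError, version

FIXED_IN = "0.0.40"  # the version named in the comment above


def version_tuple(v):
    """Turn '0.0.40' into (0, 0, 40) so versions compare numerically."""
    return tuple(int(part) for part in v.split("."))


def has_fix(package="sacremoses", minimum=FIXED_IN):
    """Return True if `package` is installed at `minimum` or newer."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # Package is not installed in this environment.
        return False
    return version_tuple(installed) >= version_tuple(minimum)
```

Calling `has_fix()` in the environment where the error occurred reports whether the installed sacremoses is at least 0.0.40.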