Helsinki-NLP / OpusFilter

OpusFilter - Parallel corpus processing toolkit
MIT License

LanguageIDFilter filter error #6

Closed by virgulvirgul 3 years ago

virgulvirgul commented 3 years ago

When using the LanguageIDFilter filter, I got the error below (my configuration follows the traceback):

Traceback (most recent call last):
  File "./bin/opusfilter", line 27, in <module>
    of.execute_steps(overwrite=args.overwrite, last=args.last)
  File "/usr/local/lib/python3.7/dist-packages/opusfilter/opusfilter.py", line 109, in execute_steps
    self.step_functions[step['type']](step['parameters'], overwrite=overwrite)
  File "/usr/local/lib/python3.7/dist-packages/opusfilter/opusfilter.py", line 215, in filter_data
    removed = pairs_gen.n - idx
UnboundLocalError: local variable 'idx' referenced before assignment

steps:
  - type: filter
    parameters:
      src_input: tr2
      tgt_input: en2
      src_output: a.txt
      tgt_output: b.txt
      filters:
          - LanguageIDFilter:
              name: langid
              id_method: cld2
              src_lang: tr
              tgt_lang: en

svirpioj commented 3 years ago

I could reproduce this in the case where the output data is empty, i.e., the filter removes every sentence pair. There's now a fix in the master branch. Could you confirm that it helps?
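
For anyone hitting the same traceback: the pattern behind this kind of UnboundLocalError can be sketched as follows. This is only an illustration, not the actual OpusFilter code or the actual fix; the function names and the `keep` callback are hypothetical. A loop variable is assigned only if the loop body runs at least once, so reading it after a loop over an empty sequence fails unless it is initialised first.

```python
def count_removed_buggy(pairs, keep):
    """Count removed pairs; raises UnboundLocalError if nothing is kept."""
    total = len(pairs)
    kept = (pair for pair in pairs if keep(pair))
    for idx, pair in enumerate(kept, start=1):
        pass  # a real filter step would write `pair` to the output files here
    # `idx` was never assigned if the loop body did not run
    return total - idx


def count_removed_fixed(pairs, keep):
    """Same logic, but the counter is initialised before the loop."""
    total = len(pairs)
    kept = (pair for pair in pairs if keep(pair))
    idx = 0  # default for the case where every pair is filtered out
    for idx, pair in enumerate(kept, start=1):
        pass  # a real filter step would write `pair` to the output files here
    return total - idx


if __name__ == "__main__":
    pairs = [("merhaba", "hello"), ("dünya", "world")]
    # Reject everything, as happens when no pair passes the language check.
    print(count_removed_fixed(pairs, lambda pair: False))
```

With a filter that rejects every pair, `count_removed_buggy` fails with the same kind of error as in the traceback above, while `count_removed_fixed` reports that all pairs were removed.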

virgulvirgul commented 3 years ago

Thank you. The error is fixed now.