rsennrich / subword-nmt

Unsupervised Word Segmentation for Neural Machine Translation and Text Generation
MIT License

Restoring BPE #93

Closed by Oxi84 4 years ago

Oxi84 commented 4 years ago

Hello,

What does sed -r 's/(@@ )|(@@ ?$)//g' mean?

Do you have Python code for this that can be called from a script rather than from the command line?

Thanks

rsennrich commented 4 years ago

That's a simple regular expression that removes all '@@ ' in a line. Trailing '@@' is also removed. Equivalent Python code could look like this:

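line = 'ab@@ def@@'
# Remove every '@@ ' separator, joining the subword units back together
line = line.replace('@@ ', '')
# Remove a trailing '@@' left at the end of the line
if line.endswith('@@'):
    line = line[:-2]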
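
For use from a script, the same sed expression can also be passed to Python's `re` module almost unchanged. The following is only a minimal sketch: the function name `restore_bpe` and the constant `BPE_SEPARATOR` are hypothetical names chosen for illustration and are not part of subword-nmt.

```python
import re

# Hypothetical helper, not part of subword-nmt. The pattern mirrors
# sed -r 's/(@@ )|(@@ ?$)//g': it matches '@@ ' anywhere in the line,
# or '@@' (optionally followed by one space) at the end of the line.
BPE_SEPARATOR = re.compile(r'(@@ )|(@@ ?$)')


def restore_bpe(line):
    """Return the line with BPE separators removed."""
    return BPE_SEPARATOR.sub('', line)


if __name__ == '__main__':
    print(restore_bpe('ab@@ def@@'))
```

Because `$` in Python also matches just before a final newline, this should work on lines read directly from a file without stripping them first.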