jo0704 closed this issue 2 years ago
Hi,
your vocab seems to have been produced by a different script, and is invalid. Specifically, there are lines containing just one symbol (__en__), and it's also not distinguishing between word-internal and word-final merge operations.
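The two problems described above can be checked before running apply_bpe.py. The following is only a rough sketch, not part of subword-nmt: it assumes the file follows the layout that learn_bpe.py normally writes (an optional "#version:" header, then one merge operation per line as two space-separated symbols, with word-final symbols ending in "</w>"). The function name check_bpe_codes and the file name codes.bpe are made up for illustration.

```python
def check_bpe_codes(lines):
    """Scan codes-file lines; return (problems, has_word_final).

    problems: list of (line_number, text) for lines that are not
              exactly two space-separated symbols.
    has_word_final: True if any merge ends in the "</w>" marker
              (assumed here to be the word-final convention).
    """
    problems = []
    has_word_final = False
    for lineno, raw in enumerate(lines, start=1):
        line = raw.strip("\r\n ")
        # Skip the optional version header on the first line.
        if lineno == 1 and line.startswith("#version:"):
            continue
        symbols = line.split(" ")
        if len(symbols) != 2:
            problems.append((lineno, line))
        elif symbols[1].endswith("</w>"):
            has_word_final = True
    return problems, has_word_final


if __name__ == "__main__":
    import sys

    # Hypothetical default path; pass your own file as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "codes.bpe"
    with open(path, encoding="utf-8") as f:
        bad, has_final = check_bpe_codes(f)
    for lineno, text in bad:
        print(f"line {lineno}: expected 2 symbols, got {text!r}")
    if not has_final:
        print("no word-final (</w>) merges found")
```

If this reports single-symbol lines or no word-final merges at all, the file most likely did not come from learn_bpe.py, and regenerating it with that script is probably the simplest fix.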
Hi,
When running apply_bpe.py to segment given texts with the generated vocabulary, I get the following error:
The exact command lines I used:
I've attached the vocab and the train file I'm trying to segment: bpe_vocab.zip
A similar issue was reported here: https://github.com/rsennrich/subword-nmt/issues/46, but the answer there doesn't seem to solve the error in my case.