The original bpe function looks for the bigram with the smallest bpe_ranks value in a word then merge this bigram, and repeating this process in a while loop.
We use a stack to implement the process in a linear O(n) complexity, and the code is also much simpler than before.
See the performance comparison below:
text is from https://www.reedbeta.com/blog/programmers-intro-to-unicode/my_enc.encode represents the encode function using the original bpe fucntion, and my_enc.my_encode represents the encode function using the modified bpe function
The original bpe function looks for the bigram with the smallest
bpe_ranks
value in a word then merge this bigram, and repeating this process in a while loop.We use a stack to implement the process in a linear O(n) complexity, and the code is also much simpler than before.
See the performance comparison below:
text
is from https://www.reedbeta.com/blog/programmers-intro-to-unicode/my_enc.encode
represents the encode function using the original bpe fucntion, andmy_enc.my_encode
represents the encode function using the modified bpe function