openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

implement an efficient bpe function using stack #339

Open copyrightly opened 6 months ago

copyrightly commented 6 months ago

The original bpe function finds the bigram with the smallest bpe_ranks value in a word, merges that bigram, and repeats this process in a while loop.
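For reference, the loop described above can be sketched roughly as follows. This is a simplified, standalone approximation, not the repository's exact code: `get_pairs` mirrors a helper of the same name in GPT-2's encoder, while `bpe_loop` and the toy `toy_ranks` table are hypothetical names and data chosen for illustration.

```python
def get_pairs(word):
    """Return the set of adjacent symbol pairs in a sequence of symbols."""
    return set(zip(word, word[1:]))


def bpe_loop(token, bpe_ranks):
    """Repeatedly merge the lowest-ranked bigram until no known bigram remains."""
    word = tuple(token)
    while len(word) > 1:
        pairs = get_pairs(word)
        # Pick the pair with the smallest rank; unknown pairs rank as infinity.
        bigram = min(pairs, key=lambda p: bpe_ranks.get(p, float("inf")))
        if bigram not in bpe_ranks:
            break
        first, second = bigram
        merged = []
        i = 0
        # Rescan the whole word and merge every occurrence of the chosen bigram.
        while i < len(word):
            if i < len(word) - 1 and word[i] == first and word[i + 1] == second:
                merged.append(first + second)
                i += 2
            else:
                merged.append(word[i])
                i += 1
        word = tuple(merged)
    return list(word)


# Toy merge table (hypothetical): a lower rank means an earlier merge.
toy_ranks = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
result = bpe_loop("lower", toy_ranks)
print(result)
```

Each pass of the outer loop rescans the whole word, so a word that needs many merges is scanned many times.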

We use a stack to implement the process in linear O(n) time, and the code is also much simpler than before.
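The PR's code is not reproduced in this description, so the following is only a guess at the general shape a stack-based merge could take. `bpe_stack` and the toy `toy_ranks` table are hypothetical, and the sketch assumes a greedy left-to-right policy that merges the top two stack entries whenever they form a known pair.

```python
def bpe_stack(token, bpe_ranks):
    """Greedy left-to-right merge using a stack (illustrative assumption only)."""
    stack = []
    for symbol in token:
        stack.append(symbol)
        # Collapse the top two entries for as long as they form a known pair.
        while len(stack) > 1 and (stack[-2], stack[-1]) in bpe_ranks:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
    return stack


# Toy merge table (hypothetical); only membership is checked in this sketch.
toy_ranks = {("l", "o"): 0, ("lo", "w"): 1, ("e", "r"): 2}
stack_result = bpe_stack("lower", toy_ranks)
print(stack_result)
```

Each symbol is pushed once and every merge shrinks the stack by one entry, so the number of push and pop operations grows linearly with the token length. Because this sketch never compares rank values, it is not guaranteed to produce the same merges as a rank-ordered loop for every merge table.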

See the performance comparison below. The input text is taken from https://www.reedbeta.com/blog/programmers-intro-to-unicode/ and encoded two ways: my_enc.encode uses the original bpe function, and my_enc.my_encode uses the modified bpe function.

Screenshot 2024-05-25 at 11 03 48 PM