yethee / tiktoken-php

This is a port of the tiktoken
MIT License

Improved performance of converting text into tokens #10

Closed: yethee closed this 5 months ago

yethee commented 5 months ago

Encoding text is now 2x faster than in version 0.3.0.

Measured on a 39k-character Latin-script text (shown as baconipsum in the screenshot).

image

Benchmark report for version 0.3.0 ![image](https://github.com/yethee/tiktoken-php/assets/559488/d1680fac-25e6-4045-ad7b-b158c6bab099)
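For anyone who wants to sanity-check a speedup like this locally, here is a minimal timing sketch. It is not the benchmark setup behind the screenshots. The helpers `medianMs` and `speedup` are hypothetical names, the input text is generated filler of roughly 39k characters, and the two closures are stand-in workloads rather than real encoders. To compare library versions, swap them for closures that call the encoder from each installed version.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: median wall-clock time, in milliseconds,
 * of $runs calls to $fn.
 */
function medianMs(callable $fn, int $runs = 5): float
{
    $samples = [];
    for ($i = 0; $i < $runs; $i++) {
        $start = hrtime(true);                       // monotonic clock, nanoseconds
        $fn();
        $samples[] = (hrtime(true) - $start) / 1e6;  // nanoseconds to milliseconds
    }
    sort($samples);

    return $samples[intdiv(count($samples), 2)];
}

/**
 * Hypothetical helper: how many times faster the candidate is than the baseline.
 */
function speedup(float $baselineMs, float $candidateMs): float
{
    return $baselineMs / $candidateMs;
}

// Assumed input: roughly 39k characters of Latin-script filler text.
$text = str_repeat('Bacon ipsum dolor amet pork belly brisket. ', 907);

// Stand-in workloads only. Replace each closure body with the encode call
// of the library version you want to measure.
$baseline  = fn () => preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
$candidate = fn () => str_split($text);

$baselineMs  = medianMs($baseline);
$candidateMs = medianMs($candidate);

printf(
    "baseline: %.3f ms, candidate: %.3f ms, speedup: x%.2f\n",
    $baselineMs,
    $candidateMs,
    speedup($baselineMs, max($candidateMs, 1e-6))    // guard against a zero reading
);
```

The median is used instead of the mean so that a single slow run (for example a cold cache) does not skew the result.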

Refs issue #6