openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License
11.03k stars
748 forks
issues
Optimal byte_pair_encode(), 0.6% better COMPRESSION!
#316
Majdoddin
closed
2 days ago
0
ai
#315
MITCHELLNEAL1
closed
1 week ago
0
Add Terminal-Based Visualization Tool for Tokenized Data Points in Tiktoken Tokenizer
#314
LVivona
opened
2 weeks ago
0
Update README.md
#313
SmartManoj
opened
3 weeks ago
0
Support for GPT 4o
#305
jcrupi
closed
3 weeks ago
1
TikToken Tokenizer from scratch ?
#303
IsNoobgrammer
opened
1 month ago
0
I want to modify the code in self._core_bpe.decode_bytes(tokens).decode("utf-8", errors=errors)
#302
FanshuoZeng
closed
1 month ago
1
Unknown encoding gpt2
#301
aryagxr
closed
1 month ago
1
tiktoken 0.7.0 isn't compatible with python 3.11.*
#300
trenton3983
closed
1 month ago
3
Tiktoken educational BPE trainer takes long time to train with vocab size 30k
#299
sagorbrur
opened
1 month ago
2
`o200k_base` pretokenizer - regex error?
#298
AmitMY
closed
1 month ago
2
GPT4o has a basic bug: junk corpus data found in the newest tokens, and testing shows GPT4o talking nonsense and hallucinating
#297
alexhmyang
closed
1 month ago
3
Or
#296
jacob121532
closed
1 month ago
0
gpt-4o tokenizer
#295
nxfi777
closed
1 month ago
1
A character is split into two tokens
#294
kerlion
closed
1 month ago
1
I need tiktoken win32 python3.8 version, can anyone provide it?
#293
loveFeng
closed
1 month ago
1
Combining marks and indic vowel marks within words are being split breaking all indic languages and most languages except English and CJKs
#292
ajaykg
closed
2 months ago
4
Error
#291
pedromothe5
closed
1 month ago
0
Use a custom exception ValueError subclass for the special tokens warning
#290
simonw
opened
2 months ago
0
Custom tokenizer fails to encode despite characters being in mergeable_ranks
#289
afang-story
opened
2 months ago
1
Understanding the intended behaviour of `_encode_bytes`
#288
ashleyholman
opened
2 months ago
0
Exception has occurred: ConnectionError HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url: /encodings/cl100k_base.tiktoken (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x000001F4D42B0EE0>: Failed to establish a new connection: [Errno 11001] getaddrinfo failed'))
#287
anithamudigoudar
opened
2 months ago
0
Tiktoken not installing on a macbook pro with m2 chip
#286
chaudhryna
closed
1 month ago
2
Optimize _byte_pair_merge function in BPE implementation
#284
naveens01
opened
2 months ago
0
How to convert qwen.tiktoken to tokenizer.model
#283
cloudyuyuyu
opened
2 months ago
0
SSLError: HTTPSConnectionPool(host='openaipublic.blob.core.windows.net', port=443): Max retries exceeded with url
#281
sijiashen
opened
2 months ago
6
Using offline: `.tiktoken` file gets deleted automatically on Linux
#279
nkilm
opened
2 months ago
3
Add handling for empty input text in encode method
#277
pratyakshagarwal
closed
1 month ago
1
Encode an empty string gives empty tokens
#276
flexwang
closed
2 months ago
2
K
#275
marseko
closed
2 months ago
0
How to find the token count of a prompt using meta/llama2-70b model
#274
pradeepdev-1995
opened
3 months ago
1
ImportError: cannot import name '_tiktoken' from partially initialized module 'tiktoken'
#273
sazirod
opened
3 months ago
0
minor fix
#272
igeni
closed
3 months ago
0
Remove dependency on requests
#271
tristan-jl
opened
3 months ago
0
Description of repository has a typo
#270
markusheimerl
closed
2 months ago
1
pinging tiktoken URL always fails, pinging api.openai.com always works
#269
distributev
opened
3 months ago
1
Create SECURITY.md
#268
Monkei49
closed
3 months ago
0
Enhanced Stability in Token Length Calculation for Whitespace Handling
#267
hvaria
closed
3 months ago
0
Create devcontainer.json
#266
79075
closed
4 months ago
0
Incorrect tokenization of "Elaborate"
#265
Sternbach-Software
closed
2 months ago
1
Can't install tiktoken==0.4.0 or tiktoken==0.5.1 in Python 3.12
#264
JackObid
closed
4 months ago
2
TikTok-Unlock-master.zip
#263
79075
closed
4 months ago
0
Update README.md
#262
tic-top
closed
4 months ago
0
<|endoftext|>: Why can't ChatGPT recognize it?
#261
Kayce001
closed
2 months ago
1
In-code documentation update
#260
louisbrulenaudet
opened
4 months ago
0
Unable to route GET requests through proxy
#259
aevilevitch
opened
4 months ago
1
Add possessive quantifiers to avoid catastrophic backtracking
#258
paplorinc
opened
4 months ago
0
AttributeError: partially initialized module 'tiktoken' has no attribute 'get_encoding'
#257
TimotejZavski
closed
4 months ago
1
Add turbo 16k model for Azure
#256
kartikagrawal2503
closed
4 months ago
1
Simplify byte_pair_merge
#255
hauntsaninja
closed
4 months ago
1