openai/tiktoken
tiktoken is a fast BPE tokeniser for use with OpenAI's models.
MIT License
12k
stars
818
forks
issues
Add lint workflow
#349
esadek
closed
1 hour ago
2
Remove unused imports
#348
esadek
closed
1 hour ago
2
Python 3.11 wheel aarch64 missing for tiktoken 0.8
#347
hauntsaninja
closed
1 day ago
2
Build wheels for Python 3.13
#346
iisakkirotko
closed
2 days ago
2
Add replace spaces flag
#345
rishabhy
closed
6 days ago
0
Does tiktoken count only input tokens or output tokens as well?
#344
GildeshAbhay
closed
3 days ago
1
Tiktoken Permission denied error
#343
NewGHUser4321
opened
1 week ago
0
on the ipad
#342
torot123
closed
3 days ago
0
Add test ci
#341
arvid220u
closed
1 week ago
0
Is there a new tokenizer for o1 models?
#337
jiadingfang
closed
2 days ago
9
Fix repeated characters handling in BPE tokenization (e.g., 'RR' in 'Strawberry')
#336
Sachleens
opened
1 month ago
0
chatgpt-4o-latest is not yet added
#335
jvlinsta
closed
2 days ago
3
Facing errors in importing the o200k_base
#334
JaynouOliver
closed
3 days ago
9
Leveraging DP for bpe_encode function
#333
lordgavy01
closed
1 month ago
0
https://www.youtube.com/watch?v=8YnyAjkOap8
#332
Anand-her
closed
3 days ago
0
Uses Regex instead of fancy-regex - 6x speedup
#331
Majdoddin
opened
2 months ago
1
ValueError: not enough values to unpack (expected 2, got 1) when tiktoken.get_encoding("cl100k_base")
#330
hzh12345678
closed
1 hour ago
1
fix: add encoding for fine-tuned models based on gpt-4o
#329
hughcrt
opened
2 months ago
0
Counting image tokens for gpt-4o
#328
BleTib
closed
1 hour ago
2
When I was inputting long text into a large model, that is, when the len of the text was 1024*1024, a StackOverflow error occurred.
#327
YangQiangli
opened
2 months ago
0
U
#325
elnryv
closed
2 months ago
0
A
#324
Ayanle127
closed
2 months ago
1
RecursionError: maximum recursion depth exceeded while calling a Python object
#323
Hudrolax
closed
2 months ago
5
Question for everyone: has tiktok now withdrawn tiktok coin?
#322
danielng620
closed
2 months ago
0
Adjust this according to the application
#321
marseko
closed
2 months ago
0
Send from DWG FastView(Android)
#320
marseko
closed
2 months ago
0
Cache for Encoding - Runtime Boosted by 12%
#319
Majdoddin
opened
2 months ago
0
DOC: Add a link toward PyPI tiktoken package.
#318
MaxJPRey
opened
2 months ago
0
[FR] Add `--offline`
#317
NightMachinery
opened
3 months ago
3
Optimal byte_pair_encode(), 6% faster, 0.6% better COMPRESSION
#316
Majdoddin
closed
3 months ago
1
ai
#315
MITCHELLNEAL1
closed
3 months ago
0
Add Terminal-Based Visualization Tool for Tokenized Data Points in Tiktoken Tokenizer
#314
LVivona
opened
3 months ago
0
Update README.md
#313
SmartManoj
closed
1 day ago
1
Support for GPT 4o
#305
jcrupi
closed
3 months ago
1
TikToken Tokenizer from scratch ?
#303
IsNoobgrammer
opened
4 months ago
0
I want to modify the code in self._core_bpe.decode_bytes(tokens).decode("utf-8", errors=errors)
#302
FanshuoZeng
closed
4 months ago
1
Unknown encoding gpt2
#301
aryagxr
closed
4 months ago
1
tiktoken 0.7.0 isn't compatible with python 3.11.*
#300
trenton3983
closed
4 months ago
3
Tiktoken educational BPE trainer takes long time to train with vocab size 30k
#299
sagorbrur
opened
4 months ago
2
`o200k_base` pretokenizer - regex error?
#298
AmitMY
closed
4 months ago
2
GPT4o has a basic bug: junk corpus data found in the newest tokens, and in testing GPT4o talks nonsense and hallucinates
#297
alexhmyang
closed
4 months ago
3
Or
#296
jacob121532
closed
4 months ago
0
gpt-4o tokenizer
#295
nxfi777
closed
4 months ago
1
A character is split into two tokens
#294
kerlion
closed
4 months ago
1
I need tiktoken win32 python3.8 version, can anyone provide it?
#293
loveFeng
closed
4 months ago
1
Combining marks and indic vowel marks within words are being split breaking all indic languages and most languages except English and CJKs
#292
ajaykg
closed
5 months ago
4
Error
#291
pedromothe5
closed
4 months ago
0
Use a custom exception ValueError subclass for the special tokens warning
#290
simonw
opened
5 months ago
0
Custom tokenizer fails to encode despite characters being in mergeable_ranks
#289
afang-story
closed
2 days ago
3
Understanding the intended behaviour of `_encode_bytes`
#288
ashleyholman
opened
5 months ago
0