dmitry-brazhenko/SharpToken
SharpToken is a C# library for tokenizing natural language text. It's based on the tiktoken Python library and designed to be fast and accurate.
https://www.nuget.org/packages/SharpToken
MIT License
214 stars, 14 forks
issues
#46: add support for utf8 input and output (abdulkareemnalband, opened 2 months ago, 2 comments)
#45: SharpToken.GptEncoding.GetEncoding(o200k_base) takes minutes (vovanb, opened 4 months ago, 1 comment)
#44: GPT-4o mini encoding (vovanb, closed 4 months ago, 1 comment)
#43: [duplicate] Support for o200k_base and gpt-4o (omni) model (dmitry-brazhenko, closed 6 months ago, 2 comments)
#42: Gbt4o support (daniell0gda, opened 6 months ago, 0 comments)
#41: Support for o200k_base and gpt-4o (omni) model (winzig, closed 6 months ago, 4 comments)
#40: o200k_base support (omri-suissa-clearmash, closed 6 months ago, 1 comment)
#39: Add gpt-4o and variants of it to ModelToEncodingMapping (splattne, closed 6 months ago, 2 comments)
#38: cohere support (omri-suissa-clearmash, opened 7 months ago, 0 comments)
#37: Add pointer to Microsoft.ML.Tokenizers (ericstj, closed 7 months ago, 0 comments)
#36: Pr 33 (Feature/performance: This PR introduces a high number of performance improvements) (dmitry-brazhenko, closed 8 months ago, 0 comments)
#35: Pipeline fix 2 (dmitry-brazhenko, closed 8 months ago, 0 comments)
#34: Pipelines update (dmitry-brazhenko, closed 8 months ago, 0 comments)
#33: Feature/performance: This PR introduces a high number of performance improvements. (r-Larch, closed 8 months ago, 3 comments)
#32: Anthropic (claude) support (omri-suissa-clearmash, opened 8 months ago, 5 comments)
#31: Added test to run all encoding/decoding subtests at once in parallel … (vwilson, closed 9 months ago, 1 comment)
#30: GptEncoding thread-safe? (omri-suissa-clearmash, closed 9 months ago, 4 comments)
#29: text-embedding-3-small", "cl100k_base (dmitry-brazhenko, closed 9 months ago, 0 comments)
#28: text-embedding-3-large mapping (dmitry-brazhenko, closed 10 months ago, 0 comments)
#27: New embedding models not recognized (thomasdc, closed 10 months ago, 11 comments)
#26: readme fix (dmitry-brazhenko, closed 11 months ago, 0 comments)
#25: .net8 build target framework (dmitry-brazhenko, closed 11 months ago, 0 comments)
#24: Minor fix - name: Write SNK file (dmitry-brazhenko, closed 1 year ago, 0 comments)
#23: Minor changes (Readme + builds fix) (dmitry-brazhenko, closed 1 year ago, 1 comment)
#22: Model prefix encodings (ian-cameron, closed 1 year ago, 0 comments)
#21: Implement MODEL_PREFIX_FOR_ENCODING (ian-cameron, closed 1 year ago, 2 comments)
#20: Add gpt-3.5-turbo-16k model to encoding mapping support (anthonypuppo, closed 1 year ago, 1 comment)
#19: gpt-3.5-turbo-16k Not Supported (sallahbaksh, closed 1 year ago, 1 comment)
#18: testing without Condition="Exists('$(KeyFilePath)')" (dmitry-brazhenko, closed 1 year ago, 0 comments)
#17: Strong name assembly (dmitry-brazhenko, closed 1 year ago, 0 comments)
#16: Strong name assembly (dmitry-brazhenko, closed 1 year ago, 0 comments)
#15: { "gpt-35-turbo", "cl100k_base" }, // Azure deployment name (dmitry-brazhenko, closed 1 year ago, 0 comments)
#14: Build and readme fix (dmitry-brazhenko, closed 1 year ago, 0 comments)
#13: version 1.1.* (dmitry-brazhenko, closed 1 year ago, 0 comments)
#12: Readme fix (dmitry-brazhenko, closed 1 year ago, 0 comments)
#11: Encode speed improvement (dmitry-brazhenko, closed 1 year ago, 0 comments)
#10: outdated lib Newtonsoft (Augustukas, closed 1 year ago, 1 comment)
#9: Decode speedup (dmitry-brazhenko, closed 1 year ago, 0 comments)
#8: One more test plan https://github.com/dmitry-brazhenko/SharpToken/iss… (dmitry-brazhenko, closed 1 year ago, 0 comments)
#7: Incorrect token count with Cyrillic (vermorel, closed 1 year ago, 2 comments)
#6: Missing model name for 3.5 turbo Azure Deployment model (bsstahl, closed 1 year ago, 5 comments)
#5: Made compatibble with netstandard2.0 (dmitry-brazhenko, closed 1 year ago, 0 comments)
#4: Made compatibble with netstandard2.0 (okhosting, closed 1 year ago, 1 comment)
#3: Add Microsoft.SourceLink.GitHub to publish repo info/commit to nuget.org (kzu, closed 1 year ago, 1 comment)
#2: Performance compared to TiktokenSharp, Tokenizer (lofcz, closed 8 months ago, 7 comments)
#1: Added encoding initialization by model name (dmytrostruk, closed 1 year ago, 1 comment)