NMZivkovic/BertTokenizers
Open source project for BERT Tokenizers in C#.
MIT License
83 stars
22 forks
issues
[BUG] BertUncasedBaseTokenizer ran forever with input "SixGe1−xH"
#28
darren-zdc
opened
4 months ago
2
Wrong vocabulary index after white space
#27
wegylexy
opened
7 months ago
0
Word piece tokenizer never exits if a sub-word token doesn't exist
#26
matteocontrini
opened
1 year ago
1
Fixing tokenizers to correctly handle linux line endings (\n)
#25
palenshus
opened
1 year ago
0
Strings with linux line endings break the tokenizer
#24
palenshus
opened
1 year ago
0
Fix wrong naming #22
#23
tsepton
opened
1 year ago
2
Custom vocabulary classes naming error
#22
tsepton
opened
1 year ago
0
Looks for Vocabularies in source dir instead of (e.g.) bin/release/net6/
#21
ctwardy
opened
1 year ago
0
The tokenization for Korean text seems not correct.
#20
terryqj0107
opened
1 year ago
0
support .net462
#19
amitportnoy
opened
1 year ago
1
This does not match behavior of Huggingface's Python version
#18
gevorgter
opened
1 year ago
9
Words surrounded by backwards quotation marks causing inaccurate tokenization results
#17
rghavimi
opened
1 year ago
0
fix unicode of multilingual vocab
#16
zhipenghan
opened
1 year ago
0
Multilingual vocab not code properly
#15
zhipenghan
opened
1 year ago
0
Classes for custom vocabulary
#14
NMZivkovic
closed
2 years ago
0
Support for loading a custom vocab.txt?
#13
BrainSlugs83
closed
2 years ago
3
different behavior: hugging face bert-base-uncased vs. BERT Base Uncased
#12
PaulCalot
closed
2 years ago
3
Updated Readme.md
#11
NMZivkovic
closed
2 years ago
0
Update Readme.md
#10
NMZivkovic
closed
2 years ago
0
Update README.md
#9
NMZivkovic
closed
2 years ago
0
CI/CD Pipeline - Automatically publishing NuGet Package
#8
NMZivkovic
opened
2 years ago
0
Supporting .NET 5 and .NET 6
#7
NMZivkovic
closed
2 years ago
0
Remove unnecessary files
#6
NMZivkovic
closed
2 years ago
0
Migrated to .NET6
#5
NMZivkovic
closed
2 years ago
0
Multilingual model tokenization differs from Python
#4
ADD-eNavarro
closed
2 years ago
3
Fixing tokenizer to select for >=2 instead of 2. Resolves discrepena…
#3
DanMMSFT
closed
2 years ago
0
BertTokenizers in .NET 3.1 and/or .NET 6.0
#2
ADD-eNavarro
closed
2 years ago
2
Issue using the code
#1
bentoo
closed
2 years ago
1