google/sentencepiece
Unsupervised text tokenizer for Neural Network-based text generation.
Apache License 2.0
10.25k
stars
1.17k
forks
issues
coredump when build with CXXFLAG `-Wp,-D_GLIBCXX_ASSERTIONS`
#966
samchugit
closed
8 months ago
4
RuntimeError
#965
fkurushin
closed
9 months ago
1
Merging tokenizers issue
#964
gordicaleksa
closed
9 months ago
4
Official support for Android compilation in Release/Assets
#963
JordiFB
closed
9 months ago
1
Additional external absl fixes
#962
Halmoni100
closed
9 months ago
0
Evaluate Profile-Guided Optimization (PGO)
#961
zamazan4ik
opened
10 months ago
0
Same oov count while using different vocab size
#960
shreyasinghal-17
closed
10 months ago
2
Revert "Bump the github-actions group with 2 updates"
#959
taku910
closed
10 months ago
0
Extract & modify the merge rules from the .model file of a SentencePiece BPE model
#958
kitkhai
closed
10 months ago
1
Bump the github-actions group with 2 updates
#957
dependabot[bot]
closed
10 months ago
0
How to safely extend vocabulary?
#956
ekurtulus
closed
10 months ago
3
Hash-pin Python dependencies in CI/CD release workflows
#955
pnacht
closed
10 months ago
0
Segmentation fault (core dumped)
#954
ivankrylatskoe
closed
8 months ago
2
c++ API compilation problem
#953
wangning7149
closed
10 months ago
1
Refactor spm_encode
#952
vmarkovtsev
closed
10 months ago
2
Encode file remainder
#951
vmarkovtsev
closed
10 months ago
2
Tips for Termux installation
#950
Manamama
opened
10 months ago
7
Error when compiling for AMD with HIP C++ compiler (`hipcc`)
#949
goliaro
closed
10 months ago
0
latency 100ms
#948
eigen2017
closed
10 months ago
1
fix(cmake): fix android build error
#947
chenqy4933
closed
10 months ago
0
ImportError: cannot import name 'SentencePieceProcessor' from 'sentencepiece' (unknown location)
#946
nikonikoni02
closed
11 months ago
2
FileNotFoundError: [WinError 2]
#945
chenzhaobo
closed
10 months ago
2
crash in absl::~Flag when used in python along with PyTorch
#944
mandeeplearning
closed
8 months ago
2
[Question] Is sentence splitting required or optional?
#943
malteos
closed
11 months ago
1
Adding New Argument min_freq to SentencePieceTrainer.train in Python
#942
hhwer
closed
8 months ago
2
Does bpe support num_threads parameter?
#941
Jieni05
closed
11 months ago
2
How to use EncodeAsIds for text that contains `<s>/</s>`
#940
vgoklani
closed
11 months ago
3
Trainer incorrectly concatenating input files
#939
Andrew-Gautier
closed
9 months ago
2
Set minimal permissions for GitHub workflows
#938
pnacht
closed
10 months ago
1
Ensure workflows run with minimal permissions
#937
pnacht
closed
10 months ago
0
Bump the github-actions group with 1 update
#936
dependabot[bot]
closed
10 months ago
0
What is the appropriate vocabulary length setting?
#935
2088208
closed
11 months ago
1
Hash-pin GitHub Actions, add dependabot
#934
pnacht
closed
11 months ago
0
Hash-pin GitHub Actions used in workflows, keep them updated with Dependabot
#933
pnacht
closed
11 months ago
0
Unable to install sentencepiece on Python 3.12
#932
enricogandini
closed
8 months ago
3
[Question] How does encoding work?
#931
99991
closed
1 year ago
2
the value Conflict.
#930
xzdong-2019
closed
11 months ago
3
About the corpus used to train the vocab
#929
tszslovewanpu
closed
1 year ago
2
Seg fault
#928
chiamp
closed
8 months ago
3
A library that conflicts with the use of protobuf in vcpkg
#927
hhxdestiny
opened
1 year ago
1
Fix tests
#926
kuba--
closed
1 year ago
1
No module named 'sentencepiece' even though sentence piece is installed
#925
Andyple
closed
1 year ago
1
A recent EMNLP work to share about task-adaptive tokenization with variable segmentation
#924
lsy641
opened
1 year ago
4
Does 'multi-word' mean to extract pieces like "Hello_world" ?
#923
lsy641
closed
1 year ago
0
Expose Advanced API?
#922
jerinphilip
closed
1 year ago
1
Process small chunk
#921
yiyangh-ps
closed
1 year ago
1
Better External Abseil and Protobuf Linkage Support
#920
Halmoni100
closed
9 months ago
3
C compatible api
#919
nullhook
closed
1 year ago
1
Build shared library on Windows?
#918
findmyway
closed
1 year ago
2
Skipping numbers in tokenization
#917
kpriyankavya
closed
1 year ago
1