OpenPecha / Botok
🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python
https://botok.readthedocs.io/
Apache License 2.0
58 stars
15 forks
issues
Newest
Please update the help documentation!
#108
Tshor
opened
5 months ago
0
Missing English words at the end of the text during sentence tokenization
#107
BLKSerene
opened
11 months ago
0
Make error handling more robust when downloading dialect packs
#106
BLKSerene
opened
11 months ago
1
Remove unnecessary print messages
#105
mikkokotila
closed
1 year ago
0
Splitting མངས་བས་ wrong?
#104
lothelanor
opened
1 year ago
0
fix: create new release manually
#103
10zinten
closed
1 year ago
0
Update test.yml
#102
10zinten
closed
1 year ago
0
Update test.yml
#101
10zinten
closed
1 year ago
0
Revert "add normalization code"
#100
10zinten
closed
1 year ago
0
Create test.yml
#99
10zinten
closed
1 year ago
0
Update publish.yaml
#98
10zinten
closed
1 year ago
0
add normalization code
#97
eroux
closed
1 year ago
0
Update and rename publish.yaml to CI_CD.yaml
#96
10zinten
closed
1 year ago
0
fix(resources): Create bo_punct_position.csv
#95
ngawangtrinley
closed
1 year ago
1
[Feature] Classify all PUNCTs into left and right
#94
10zinten
opened
1 year ago
0
Can we remove "Loading Trie... (1s.)" message
#93
mikkokotila
closed
1 year ago
0
`token.text_unaffixed` failed to add tsek
#92
10zinten
opened
2 years ago
0
Missing pos for PUNCT
#91
10zinten
opened
2 years ago
0
syllable component
#90
kaldan007
opened
2 years ago
0
syllable tokenizer request
#89
ta4tsering
opened
2 years ago
0
importing a custom dictionary
#88
eroux
opened
2 years ago
1
issue with Python 3.9
#87
eroux
opened
2 years ago
0
identifying weak syllables
#86
eroux
opened
2 years ago
1
POS tags ? distinguishing some patterns
#85
eroux
opened
2 years ago
2
fix(sent-tokenizer): normalised sentence is included in sentence tokens
#84
kaldan007
closed
2 years ago
0
Unexpected skip
#82
kaldan007
closed
3 years ago
0
Unexpected syl skip
#81
kaldan007
closed
3 years ago
0
Unexpected skip of syllable while tokenizing.
#80
kaldan007
opened
3 years ago
0
Invalid index in merge rule silently produces uncalled for result.
#79
kaldan007
opened
3 years ago
0
Why VOWELS constant only has one vowel?
#78
forest-jiang
opened
3 years ago
1
Download of dialect packs fails on macOS when running CI
#76
BLKSerene
opened
4 years ago
1
detect any language
#75
ngawangtrinley
opened
4 years ago
0
dict like `get` method for Token object
#74
10zinten
opened
4 years ago
0
understanding custom pipelines
#73
mikkokotila
opened
4 years ago
3
minimal instructions/docstring for Trie
#72
mikkokotila
closed
4 years ago
1
Directory based config
#71
10zinten
closed
4 years ago
2
Multiprocessing tokenization
#70
10zinten
closed
1 year ago
5
Check existence of the latest resource files before downloading
#69
BLKSerene
closed
2 years ago
2
bad segmentation
#68
drupchen
closed
4 years ago
1
AttributeError: 'NoneType' object has no attribute 'append'
#67
eroux
closed
4 years ago
13
batch process files
#66
drupchen
closed
4 years ago
0
Missing lemma for numbers
#65
10zinten
closed
4 years ago
3
multi-threading
#64
mikkokotila
opened
4 years ago
6
Github Actions for CI
#63
mikkokotila
closed
4 years ago
3
labels
#62
mikkokotila
opened
4 years ago
1
statistics performance with tokenizer.list_word_types
#61
mikkokotila
opened
4 years ago
3
from pybo to botok
#60
drupchen
closed
5 years ago
1
Path issue after frozen with PyInstaller on macOS
#59
BLKSerene
closed
5 years ago
3
Tokenizer improvement
#58
drupchen
closed
4 years ago
2
pybo 0.6.0 tokenizer failed for འིའོ
#57
10zinten
closed
5 years ago
5