OpenPecha Botok issues - Githubissues

OpenPecha / Botok

🏷 བོད་ཏོག [pʰøtɔk̚] Tibetan word tokenizer in Python

https://botok.readthedocs.io/

Apache License 2.0

58 stars 15 forks

issues

Please update the help documentation!

#108 Tshor opened 5 months ago
0
Missing English words at the end of the text during sentence tokenization

#107 BLKSerene opened 11 months ago
0
Make error handling more robust when downloading dialect packs

#106 BLKSerene opened 11 months ago
1
Remove unnecessary print messages

#105 mikkokotila closed 1 year ago
0
Splitting མངས་བས་ wrong?

#104 lothelanor opened 1 year ago
0
fix: create new release manually

#103 10zinten closed 1 year ago
0
Update test.yml

#102 10zinten closed 1 year ago
0
Update test.yml

#101 10zinten closed 1 year ago
0
Revert "add normalization code"

#100 10zinten closed 1 year ago
0
Create test.yml

#99 10zinten closed 1 year ago
0
Update publish.yaml

#98 10zinten closed 1 year ago
0
add normalization code

#97 eroux closed 1 year ago
0
Update and rename publish.yaml to CI_CD.yaml

#96 10zinten closed 1 year ago
0
fix(resources): Create bo_punct_position.csv

#95 ngawangtrinley closed 1 year ago
1
[Feature] Classify all PUNCTs into left and right

#94 10zinten opened 1 year ago
0
Can we remove "Loading Trie... (1s.)" message

#93 mikkokotila closed 1 year ago
0
`token.text_unaffixed` failed to add tsek

#92 10zinten opened 2 years ago
0
Missing pos for PUNCT

#91 10zinten opened 2 years ago
0
syllable component

#90 kaldan007 opened 2 years ago
0
syllable tokenizer request

#89 ta4tsering opened 2 years ago
0
importing a custom dictionary

#88 eroux opened 2 years ago
1
issue with Python 3.9

#87 eroux opened 2 years ago
0
identifying weak syllables

#86 eroux opened 2 years ago
1
POS tags ? distinguishing some patterns

#85 eroux opened 2 years ago
2
fix(sent-tokenizer): normalised sentence is included in sentence tokens

#84 kaldan007 closed 2 years ago
0
Unexpected skip

#82 kaldan007 closed 3 years ago
0
Unexpected syl skip

#81 kaldan007 closed 3 years ago
0
Unexpected skip of syllable while tokenizing.

#80 kaldan007 opened 3 years ago
0
Invalid index in merge rule silently produces uncalled for result.

#79 kaldan007 opened 3 years ago
0
Why VOWELS constant only has one vowel?

#78 forest-jiang opened 3 years ago
1
Download of dialect packs fails on macOS when running CI

#76 BLKSerene opened 4 years ago
1
detect any language

#75 ngawangtrinley opened 4 years ago
0
dict like `get` method for Token object

#74 10zinten opened 4 years ago
0
understanding custom pipelines

#73 mikkokotila opened 4 years ago
3
minimal instructions/docstring for Trie

#72 mikkokotila closed 4 years ago
1
Directory based config

#71 10zinten closed 4 years ago
2
Multiprocessing tokenization

#70 10zinten closed 1 year ago
5
Check existence of the latest resource files before downloading

#69 BLKSerene closed 2 years ago
2
bad segmentation

#68 drupchen closed 4 years ago
1
AttributeError: 'NoneType' object has no attribute 'append'

#67 eroux closed 4 years ago
13
batch process files

#66 drupchen closed 4 years ago
0
Missing lemma for numbers

#65 10zinten closed 4 years ago
3
multi-threading

#64 mikkokotila opened 4 years ago
6
Github Actions for CI

#63 mikkokotila closed 4 years ago
3
labels

#62 mikkokotila opened 4 years ago
1
statistics performance with tokenizer.list_word_types

#61 mikkokotila opened 4 years ago
3
from pybo to botok

#60 drupchen closed 5 years ago
1
Path issue after frozen with PyInstaller on macOS

#59 BLKSerene closed 5 years ago
3
Tokenizer improvement

#58 drupchen closed 4 years ago
2
pybo 0.6.0 tokenizer failed for འིའོ

#57 10zinten closed 5 years ago
5