issues
grantjenks/python-wordsegment
English word segmentation, written in pure-Python, and based on a trillion-word corpus.
http://www.grantjenks.com/docs/wordsegment/
Other
365 stars
49 forks
Using with Additional corpus of spelling mistakes.
#39
willwade
opened
9 months ago
3
Bump wheel from 0.29.0 to 0.38.1
#38
dependabot[bot]
opened
1 year ago
0
Words segmenting in one direction, but not another.
#37
vrunhofen
closed
2 years ago
1
Moved the metadata out of `setup.py` into `setup.cfg`.
#36
KOLANICH
opened
3 years ago
2
'helloworld' does not segment as expected
#35
Forest216
closed
3 years ago
3
Add CHUNK_SIZE attribute to customize isegment()
#34
grantjenks
opened
3 years ago
0
RecursionError on segment call
#33
irmo322
opened
3 years ago
6
Support for Other Languages
#32
ykhatami
opened
3 years ago
2
feature_request(mode): preserve all punctuation marks
#31
Kristinita
opened
3 years ago
2
Training on new, modern data.
#30
sevmardi
closed
4 years ago
1
russian language
#29
vinnitu
closed
4 years ago
2
Corpus python
#28
reem1122-sys
closed
4 years ago
1
Support for maintaining original case
#27
esilgard
opened
4 years ago
1
Please allow separation of numbers from text
#26
prabhatM
opened
4 years ago
1
Can I use this from C or C++?
#25
PhilAndrew
closed
4 years ago
1
Correctly merge lowercase and uppercase bigrams
#24
kvakil
opened
5 years ago
0
allow substrings to be ignored if they have digits
#23
davidpaulmcintyre
opened
5 years ago
0
allow substrings to be ignored if they have digits
#22
davidpaulmcintyre
closed
5 years ago
0
`exhilarate` does not segment as expected
#21
mooosu
closed
5 years ago
1
Text with numbers doesn't segment as expected
#20
sgokhales
opened
5 years ago
3
Return a list of the most probable segmentations.
#19
rafaveguim
opened
6 years ago
3
import error
#18
ffxz
closed
6 years ago
2
unigrams
#17
raedaf
closed
6 years ago
1
How to add custom values?
#16
vebsun
closed
6 years ago
1
max() arg is an empty sequence
#15
desh2608
closed
6 years ago
1
License question
#14
kootenpv
closed
6 years ago
7
ZeroDivisionError
#13
wcollins-ebsco
closed
6 years ago
2
Buffering issue in main()
#12
dandelionred
closed
7 years ago
5
Added ability to easier load custom corpuses
#11
bgbg
closed
7 years ago
4
It can process any length of text without any recursion limit
#10
rdkadiwala
closed
7 years ago
5
Bigram doesn't work.
#9
moeseth
closed
7 years ago
3
fixed file open with unicode chars exception UnicodeDecodeError. Words file parsing is now lazy initiated.
#8
wavenator
closed
7 years ago
7
Recursion limit exceeded
#7
ChristosChristofidis
closed
7 years ago
7
Only Old word
#6
aongwachi
closed
8 years ago
4
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 1286: ordinal not in range(128)
#5
chnsh
closed
8 years ago
1
Updates travis to use the newer infrastructure
#4
javierhonduco
closed
9 years ago
0
Do not recreate alphabet in every clean call
#3
javierhonduco
closed
9 years ago
0
Training on new data
#2
jagadeeshraja
closed
9 years ago
2
Prior probability calculation question
#1
badc0re
closed
10 years ago
1