fnl/segtok
Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features.
http://fnl.es/segtok-a-segmentation-and-tokenization-library.html
MIT License
170 stars, 22 forks
issues
#26 bug: split_contractions fails for certain patterns (MattGPT-ai, opened 3 months ago, 0 comments)
#25 Remove license file from data_files (PrimozGodec, closed 2 years ago, 5 comments)
#24 Fix deprecation warnings due to invalid escape sequences. (tirkarthi, closed 4 years ago, 2 comments)
#23 Deprecation warning due to invalid escape sequences (tirkarthi, closed 4 years ago, 1 comment)
#22 package license (oblute, closed 4 years ago, 2 comments)
#21 text to sentences segmentation (Tortoise17, opened 5 years ago, 3 comments)
#20 Word tokenizer does not split apostrophe and apostrophe s (pwichmann, opened 5 years ago, 2 comments)
#19 Date gets segmented (mnishant2, closed 5 years ago, 4 comments)
#18 Extensibility through custom regex or abbreviation lists? (RyanMcCarl, closed 5 years ago, 7 comments)
#17 Issue with sentence separator (arcticOak2, closed 6 years ago, 4 comments)
#16 Over-splitting on quotes with names. (jakepoz, opened 6 years ago, 1 comment)
#15 Broken sentence terminal splice at token boundary in some cases. (gkucsko, closed 6 years ago, 2 comments)
#14 Boundary without a space (ml-pickle, closed 7 years ago, 1 comment)
#13 Sentence over-splitting on consecutive first-name abbreviations (fnl, closed 6 years ago, 2 comments)
#12 Improper split at a first name abbreviation (fnl, closed 5 years ago, 1 comment)
#11 Single-line splitting mode is joining sentence across lines (yucongo, closed 7 years ago, 3 comments)
#10 Improper segmentation with proper names where middle initial is abbreviated (christian-storm, closed 7 years ago, 5 comments)
#9 Failure to split on abbreviations (Klim314, closed 9 years ago, 10 comments)
#8 `web_tokenizer`: Unknown error on ",;" sequence. (geovedi, closed 9 years ago, 3 comments)
#7 tokenizer usage (dineshbvadhia, closed 9 years ago, 1 comment)
#6 segtok in programs (dineshbvadhia, closed 9 years ago, 2 comments)
#5 Currency and percentages (dineshbvadhia, closed 9 years ago, 2 comments)
#4 An issue about segment (lixiangnlp, closed 9 years ago, 1 comment)
#3 Kmike py2 with stdio encoding fix (fnl, closed 9 years ago, 0 comments)
#2 Python 3.3 and 2.7 support (kmike, closed 9 years ago, 4 comments)
#1 Support Python 2.7 (wvengen, closed 9 years ago, 5 comments)