diasks2/pragmatic_segmenter
Pragmatic Segmenter is a rule-based sentence boundary detection gem that works out-of-the-box across many languages.
MIT License
541 stars
54 forks
issues
#78: Fixed issues with catastrophic backtracking when detecting numbered references
#83
mjansing
closed
1 month ago
0
Sentence segmentation is not working as per the golden rule for sentence "At 5 a.m. Mr. Smith went to the bank. He left the bank at 6 P.M. Mr. Smith then went to the store."
#82
spsingh2020
opened
3 months ago
0
Dependency from Unicode creates circular dep that breaks most of my bundle installs
#81
palladius
opened
4 months ago
0
Quotation mark at the beginning of a sentence breaks segmentation
#80
arp
opened
9 months ago
0
#78: Fixed issues with catastrophic backtracking when detecting numbered references
#79
pschijven
closed
3 months ago
0
Catastrophic backtracking in regular expression for numerical references
#78
pschijven
opened
10 months ago
0
Abbreviations at end of sentences + unknown abbreviations
#77
Lightgreen40
closed
7 months ago
0
German abbreviation "z. B." is split when it shouldn't
#76
coezbek
opened
2 years ago
0
SyntaxError in `pragmatic_segmenter-0.3.22/lib/pragmatic_segmenter/list.rb` using jruby-9.3.2.0
#75
ghost
opened
2 years ago
0
Abbreviation for `inches` is segmented, e.g "9 in."
#74
coezbek
opened
2 years ago
0
French à after abbreviation / Min. abbreviations
#73
coezbek
opened
2 years ago
0
Golden rule for telephone numbers with letters?
#72
coezbek
opened
2 years ago
0
Detect paragraph breaks or keep whitespacing?
#71
coezbek
opened
2 years ago
1
Unable to segment text. Ruby 3.
#70
malhotrachetan
closed
3 years ago
1
pragmatic_segmenter installing problem
#69
sevilaybayatli
closed
2 years ago
5
Refactor for Ruby 3.0 compatibility
#68
alextsui05
closed
3 years ago
4
Cannot segment text on Ruby 3
#67
alextsui05
closed
3 years ago
0
pragmatic segmenter not installing
#66
sevilaybayatli
opened
3 years ago
2
French 3 petit point is not handle.
#65
jgcb00
opened
3 years ago
0
replace_parens_in_numbered_list() calls scan_lists() twice
#64
simnalamburt
opened
4 years ago
0
`Washington, D.C.` at end of sentence not segmented.
#63
alankalb
opened
4 years ago
0
How to normalize sentences with different quotation marks?
#62
aristotll
opened
4 years ago
0
Language support
#61
juliusfrost
opened
4 years ago
1
Punctuation removed even with clean turned off
#60
echan00
opened
5 years ago
0
Segmenter modifies the segment
#59
echan00
closed
5 years ago
2
Incorrect segmentation or intended behavior?
#58
echan00
closed
5 years ago
0
How to add additional rules about Chinese?
#57
krongk
opened
5 years ago
0
Infinite Loop
#56
censored--
opened
5 years ago
2
Spanish text not correctly parsed when there is no space after a period
#55
lefman
opened
5 years ago
0
Run as a Service?
#54
jeffrschneider
opened
5 years ago
1
Take advantage of non-breaking spaces
#53
hftf
opened
5 years ago
0
Naming and attribution for port of code
#52
EliotJones
closed
6 years ago
1
add test cases, fix abbreviation behaviour for kazakh. gitignore vscode
#51
EliotJones
closed
6 years ago
1
Feature/file formats
#50
EliotJones
closed
6 years ago
1
Feature/english test cases
#49
EliotJones
closed
6 years ago
1
add viz with test case to the list of common abbreviations
#48
EliotJones
closed
6 years ago
1
Kazakh Segmenter
#47
sevilaybayatli
opened
6 years ago
20
Unexpected sentence break when parentheses immediately follow abbreviation with period
#46
reczy
closed
6 years ago
3
doc_type
#45
djstrong
opened
6 years ago
1
tatar.rb
#44
ftyers
opened
6 years ago
1
Instructions for using on the command line
#43
ftyers
opened
6 years ago
1
Preserving characters between sentences?
#42
cheerfulstoic
closed
6 years ago
6
Text Chunking
#41
Immortalin
opened
6 years ago
0
reducing object allocations
#40
maia
closed
6 years ago
1
wrong segmentation
#39
krsna-sentieo
closed
6 years ago
1
return String instead of PragmaticSegmenter::Text
#38
maia
opened
6 years ago
0
reduce memory usage by reusing segmenter
#37
maia
opened
6 years ago
0
Whitespace getting mangled even with clean turned off
#36
akhudek
opened
6 years ago
10
failed test case if there are book quote marks and exclamatory mark in Chinese sentence
#35
rainchen
closed
6 years ago
3
[WIP] Support danish
#34
mollerhoj
closed
6 years ago
5