Yoctol / strpipe

text preprocessing pipeline

Need a fullwidth -> halfwidth normalizer? #54

Open SoluMilken opened 5 years ago

SoluMilken commented 5 years ago

Which characters should it cover? Numbers? English letters? Anything else?

absolutelyNoWarranty commented 5 years ago

Range U+FF01–FF5E reproduces the characters of ASCII 21 to 7E as fullwidth forms. U+FF00 does not correspond to a fullwidth ASCII 20 (space character), since that role is already fulfilled by U+3000 "ideographic space".
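
Given that range, a normalizer could be little more than a fixed code point shift. Here is a minimal sketch, not an existing strpipe API: the names `FULLWIDTH_OFFSET`, `_FULL_TO_HALF` and `fullwidth_to_halfwidth` are made up for illustration. It maps U+FF01–FF5E down to ASCII 21–7E by subtracting 0xFEE0 and treats U+3000 as a regular space.

```python
# Hypothetical helper, not part of strpipe.
# Distance between a fullwidth form and its ASCII counterpart:
# U+FF01 - U+0021 == 0xFEE0.
FULLWIDTH_OFFSET = 0xFEE0

# Translation table for str.translate: code point -> code point.
# Covers U+FF01..U+FF5E (fullwidth ASCII 21..7E).
_FULL_TO_HALF = {cp: cp - FULLWIDTH_OFFSET for cp in range(0xFF01, 0xFF5F)}
# The ideographic space plays the role of a fullwidth ASCII 20.
_FULL_TO_HALF[0x3000] = 0x0020


def fullwidth_to_halfwidth(text: str) -> str:
    """Replace fullwidth ASCII forms and U+3000 with their ASCII equivalents.

    Every other character (CJK text, halfwidth katakana, ...) is left as is.
    """
    return text.translate(_FULL_TO_HALF)
```

So this covers digits, English letters and ASCII punctuation in one go. `unicodedata.normalize("NFKC", text)` from the standard library also folds these characters to ASCII, but it rewrites many other compatibility characters as well, so whether that is acceptable depends on what the pipeline wants to keep.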