studioego / cjklib

Automatically exported from code.google.com/p/cjklib

BMP-only character domain #10

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Character domains restrict results to a given set of characters.
Currently these are domains like 'GB2312' or 'JISX0208', which
mirror official character sets. Domain 'Unicode' is the maximum domain,
covering all Han characters.

A 'BMP' (Basic Multilingual Plane) character domain would limit the
Unicode domain to code points up to and including 0xFFFF. This is important
for systems that currently don't handle characters beyond the 16-bit boundary.
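
To make the intended restriction concrete, here is a minimal sketch of a BMP filter. It assumes Python 3, where each string element is a full code point. The names `BMP_MAX`, `in_bmp` and `limit_to_bmp` are hypothetical and are not part of cjklib's API or of the attached patch.

```python
# Hypothetical helper names; this is an illustration, not cjklib code.
BMP_MAX = 0xFFFF  # last code point of the Basic Multilingual Plane


def in_bmp(char):
    """Return True if the single character lies in U+0000..U+FFFF."""
    return ord(char) <= BMP_MAX


def limit_to_bmp(chars):
    """Keep only the characters inside the BMP, preserving their order."""
    return [c for c in chars if in_bmp(c)]


# U+4E2D and U+6587 are in the BMP; U+20000 (CJK Extension B) is not.
candidates = ['\u4e2d', '\U00020000', '\u6587']
print(limit_to_bmp(candidates))
```

A real domain would presumably apply the same test on the database side rather than filter results afterwards, but the membership condition would be the same.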

Original issue reported on code.google.com by christop...@gmail.com on 11 Apr 2010 at 2:59

GoogleCodeExporter commented 9 years ago
Here is a patch that implements the BMP domain, but in this form it doesn't
integrate well.

Original comment by christop...@gmail.com on 11 Apr 2010 at 3:00

Attachments: