This library currently converts between strings and sets of codepoints itself, in an attempt to provide an intuitive, easy-to-use API. It may be a better idea to require that the codepoint conversion take place before input reaches this library.
This would allow systems to use a third-party UTF-8 implementation such as utf8. It also neatly avoids the question of which encodings PRECIS deems valid. For example, from RFC 7613:
An entity that prepares a string according to this profile MUST first
map fullwidth and halfwidth characters to their decomposition
mappings (see Unicode Standard Annex #11 [UAX11]). This is necessary
because the PRECIS "HasCompat" category specified in Section 9.17 of
[RFC7564] would otherwise forbid fullwidth and halfwidth characters.
After applying this width-mapping rule, the entity then MUST ensure
that the string consists only of Unicode code points that conform to
the PRECIS IdentifierClass defined in Section 4.2 of [RFC7564]. In
addition, the entity then MUST encode the string as UTF-8 [RFC3629].
(emphasis mine)
See discussion under #1 for more information.
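
To make the proposal concrete, here is a rough sketch in Python of what a codepoint-only boundary could look like. It is not this library's API: `decode_utf8`, `encode_utf8`, and `width_map` are hypothetical names chosen for illustration. Only the width-mapping step from the RFC 7613 passage is shown, and the IdentifierClass check is left out. The decoding and encoding helpers stand in for whatever UTF-8 implementation the caller prefers.

```python
import unicodedata
from typing import Iterable, List


def decode_utf8(data: bytes) -> List[int]:
    """Caller-side step (hypothetical helper): bytes -> code points."""
    return [ord(ch) for ch in data.decode("utf-8")]


def encode_utf8(codepoints: Iterable[int]) -> bytes:
    """Caller-side step (hypothetical helper): code points -> UTF-8 bytes."""
    return "".join(chr(cp) for cp in codepoints).encode("utf-8")


def width_map(codepoints: Iterable[int]) -> List[int]:
    """Library-side step (hypothetical name): replace fullwidth and
    halfwidth code points with their decomposition mappings.

    The function only ever sees integers, so it has no opinion on how
    the caller decoded or will encode the string.
    """
    out: List[int] = []
    for cp in codepoints:
        # decomposition() returns e.g. "<wide> 0041", "<narrow> 30A2",
        # "0041 0300" (canonical), or "" when there is no mapping.
        tag, _, mapping = unicodedata.decomposition(chr(cp)).partition(" ")
        if tag in ("<wide>", "<narrow>"):
            out.extend(int(h, 16) for h in mapping.split())
        else:
            out.append(cp)
    return out


if __name__ == "__main__":
    raw = "\uff21\uff22\uff43".encode("utf-8")  # fullwidth "ABc"
    mapped = width_map(decode_utf8(raw))
    print(encode_utf8(mapped))
```

With this split, the requirement to encode as UTF-8 becomes the caller's responsibility, and the library works purely on code points.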