unicode-rs / unicode-normalization

Unicode Normalization forms according to UAX#15 rules
https://unicode-rs.github.io/unicode-normalization

Does this handle invalid unicode? #102

Closed by kirawi 4 months ago

kirawi commented 4 months ago

This is required for the Unicode Collation tests, which include invalid Unicode strings. I haven't read the algorithm, but is it fine to just do char::from_u32(n).unwrap_or('\u{FFFD}')?

Manishearth commented 4 months ago

None of the APIs accept anything other than str or char, both of which Rust guarantees to be valid Unicode, so there is no way of providing invalid Unicode to this crate. Substituting the replacement character (U+FFFD) is one typically acceptable way of working with that.
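
For illustration, the substitution discussed in this thread could look roughly like the sketch below. It uses only the standard library; the helper name `sanitize_code_points` is hypothetical and not part of this crate, and the sketch assumes the test data arrives as raw `u32` code points.

```rust
/// Hypothetical helper (not part of unicode-normalization): converts raw
/// code points into a `String`, substituting U+FFFD for any value that is
/// not a Unicode scalar value (surrogates, or anything above 0x10FFFF).
fn sanitize_code_points(raw: &[u32]) -> String {
    raw.iter()
        // `char::from_u32` returns `None` for surrogates and out-of-range values.
        .map(|&n| char::from_u32(n).unwrap_or(char::REPLACEMENT_CHARACTER))
        .collect()
}
```

The returned `String` is valid Unicode by construction, so it can then be handed to the crate's normalization methods, which only accept `str` or `char` input. Whether U+FFFD substitution matches what a given conformance test expects is something to check against that test's own documentation.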