Open tats-u opened 1 month ago
You can use our new grapheme actions to count emojis that we added in v1.0.0-beta.1: https://github.com/fabian-hiller/valibot/releases/tag/v1.0.0-beta.1
@fabian-hiller
new grapheme actions
The number of code points (and therefore of UTF-16 code units) per grapheme is unlimited. You should combine maxGraphemes with this issue's maxCodePoints or with maxLength.
https://stackoverflow.com/questions/71011343/maximum-number-of-codepoints-in-a-grapheme-cluster
If you write your backend in Go or Rust, the UTF-32 length is more common than the UTF-16 length (utf8.RuneCountInString(str) or str.chars().count()).
Thank you for your detailed feedback! How would you implement such an action? We also have byte actions like maxBytes, but I'm not sure if this is what you are looking for.
We can implement it based on the existing maxLength. Compare the result of codePointAt at each position with 0x10000, and move the cursor forward by one more UTF-16 code unit if the result is 0x10000 or greater.
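A minimal sketch of that loop, assuming a standalone helper rather than a real Valibot action. The name exceedsCodePoints is hypothetical and only illustrates the cursor logic; it stops as soon as the limit is exceeded.

```typescript
// Hypothetical helper, not part of Valibot's API.
// Returns true if `input` contains more than `max` Unicode code points.
function exceedsCodePoints(input: string, max: number): boolean {
  let count = 0;
  for (let i = 0; i < input.length; i++) {
    // codePointAt returns 0x10000 or more only when a valid surrogate
    // pair starts at index i. `i` is always in bounds here.
    const codePoint = input.codePointAt(i)!;
    if (codePoint >= 0x10000) {
      i++; // skip the low surrogate of the pair
    }
    count++;
    if (count > max) {
      return true; // stop early, no need to scan the rest
    }
  }
  return false;
}
```

A lone surrogate is counted as one code point here, because codePointAt returns the surrogate value itself in that case.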
You can combine maxBytes with other actions too. For example, a password can't be longer than 72 bytes if you hash it with bcrypt. maxBytes is compatible with maxCodePoints or maxLength.
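To show why the byte limit is a separate check, here is a rough sketch that counts UTF-8 bytes by hand. utf8ByteLength and fitsBcrypt are hypothetical names for this example, not Valibot functions, and the 72-byte figure is the bcrypt limit mentioned above.

```typescript
// Hypothetical helper: number of bytes `input` occupies when encoded as UTF-8.
function utf8ByteLength(input: string): number {
  let bytes = 0;
  for (let i = 0; i < input.length; i++) {
    const codePoint = input.codePointAt(i)!;
    if (codePoint < 0x80) {
      bytes += 1;
    } else if (codePoint < 0x800) {
      bytes += 2;
    } else if (codePoint < 0x10000) {
      bytes += 3; // lone surrogates also land here and are counted as 3 bytes
    } else {
      bytes += 4;
      i++; // skip the low surrogate of the pair
    }
  }
  return bytes;
}

// Hypothetical check for the 72-byte bcrypt input limit.
function fitsBcrypt(password: string): boolean {
  return utf8ByteLength(password) <= 72;
}
```

A password of 19 emojis such as U+1F600 is only 19 code points and 38 UTF-16 code units, but it is 76 bytes in UTF-8, so a code point or length limit alone would not catch it.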
Can you provide a code example for the if-statement to check the maximum code points?
The length limit of VARCHAR in some RDBs is counted in UTF-32 code points. maxLength counts UTF-16 code units, so it counts an emoji and some kanji as two.
Password requirements by NIST:
https://pages.nist.gov/800-63-3/sp800-63b.html
This requires that an emoji (not a compound one) or any other 4-byte character be counted as 1 character in a password.
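For illustration, a small comparison of the UTF-16 length and the code point count for the cases mentioned above. countCodePoints is a hypothetical helper written for this example, and the samples use escape sequences so the code does not depend on the file encoding.

```typescript
// Hypothetical helper: counts Unicode code points instead of UTF-16 code units.
function countCodePoints(input: string): number {
  let count = 0;
  for (let i = 0; i < input.length; i++) {
    if (input.codePointAt(i)! >= 0x10000) {
      i++; // a surrogate pair is one code point
    }
    count++;
  }
  return count;
}

const emoji = "\uD83D\uDE00"; // U+1F600
const kanji = "\uD842\uDFB7"; // U+20BB7, a kanji outside the BMP
// Compound emoji: U+1F468, ZWJ, U+1F469, ZWJ, U+1F467
const family = "\uD83D\uDC68\u200D\uD83D\uDC69\u200D\uD83D\uDC67";

console.log(emoji.length, countCodePoints(emoji)); // 2 1
console.log(kanji.length, countCodePoints(kanji)); // 2 1
console.log(family.length, countCodePoints(family)); // 8 5
```

The compound emoji still counts as 5 code points, which is why a grapheme limit and a code point limit answer different questions.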