Tieske / lua-resty-ljsonschema

Pure Lua JSON schema validator for Lua and OpenResty
https://tieske.github.io/lua-resty-ljsonschema/
MIT License

String length computed with the `#` operator is incorrect for strings containing non-ASCII characters #29

Open liverpool8056 opened 2 days ago

liverpool8056 commented 2 days ago

The library uses the # operator to compute string length when validating JSON strings (for example, against schema.minLength and schema.maxLength). This gives the wrong result for strings that contain non-ASCII characters. In Lua, # returns the length of a string in bytes, and UTF-8 is a multibyte encoding, so for a UTF-8 string the visual (character) length is less than what # reports. string.len(s) returns the same byte count. Lua 5.1 and LuaJIT (which OpenResty uses) don't provide a built-in function to count the visual length, since Lua strings are plain byte sequences with no notion of encoding.
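
To make the mismatch concrete, here is a minimal sketch in plain Lua 5.1-compatible code. The helper name utf8_length is hypothetical and not part of the library; it assumes the input is valid UTF-8 and counts code points (not grapheme clusters) by skipping continuation bytes.

```lua
-- Hypothetical helper: counts UTF-8 code points by skipping continuation
-- bytes, which always have the form 10xxxxxx (0x80..0xBF).
-- Assumes the input is valid UTF-8.
local function utf8_length(s)
  local count = 0
  for i = 1, #s do
    local b = s:byte(i)
    if b < 0x80 or b > 0xBF then
      count = count + 1
    end
  end
  return count
end

-- "héllo", with é written as its two UTF-8 bytes (0xC3 0xA9)
local s = "h\195\169llo"

print(#s)              -- byte length: 6
print(string.len(s))   -- same byte length: 6
print(utf8_length(s))  -- code point count: 5
```

With maxLength = 5, a check based on # would reject this five-character string.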

I'm wondering if we could inject a dedicated function as a replacement through validatorlib, something like the following:

-- default: byte length, i.e. the current behaviour
local function default_string_length(s)
  return #s
end

-- a user-supplied custom.str_length overrides the default
local customlib = {
  null = custom and custom.null or default_null,
  array_mt = custom and custom.array_mt or default_array_mt,
  match_pattern = custom and custom.match_pattern or default_match_pattern,
  str_length = custom and custom.str_length or default_string_length
}
local name = custom and custom.name
return generate_main_validator_ctx(schema, custom):as_func(name, validatorlib, customlib)

Tieske commented 2 days ago

Yup, I think this is the right approach.
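
For illustration, a self-contained sketch of how the proposed fallback would behave. It does not load the library: resolve_str_length and count_code_points are hypothetical names, and the str_length option is only the one proposed in this issue, not a confirmed part of the released API. The counter again assumes valid UTF-8 input.

```lua
-- Default from the proposal: byte length.
local function default_string_length(s)
  return #s
end

-- Hypothetical stand-in for the fallback expression in the proposal:
-- use custom.str_length when given, else the byte-length default.
local function resolve_str_length(custom)
  return custom and custom.str_length or default_string_length
end

-- Hypothetical user-supplied counter: counts every byte that is not a
-- UTF-8 continuation byte (0x80..0xBF), i.e. one per code point.
local function count_code_points(s)
  local _, n = s:gsub("[^\128-\191]", "")
  return n
end

local max_length = 5
local value = "h\195\169llo"  -- "héllo", 5 characters, 6 bytes

local byte_len = resolve_str_length(nil)(value)
local char_len = resolve_str_length({ str_length = count_code_points })(value)

print(byte_len <= max_length)  -- false: the default rejects the string
print(char_len <= max_length)  -- true: the custom counter accepts it
```

Keeping the byte-length default means existing callers see no change in behaviour, while callers who know their strings are UTF-8 can opt in to character counting.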