Open Totktonada opened 6 years ago
I'd avoid adding any Tarantool specifics to the utf8 module. Let's expose more classes instead. @Totktonada @Khatskevich, could you please specify which classes of ICU symbols we should additionally expose to make you happy?
@kostja initially asked for the feature in avro-schema, but it seems there is not much need for it now. So I'll unassign myself and Roman.
Proposed to expose it via the new utf8 module.
There are two ways to do so: add `isident` to check a single symbol (to provide a consistent `is*` API) or add a function to check an entire string. Both are okay for us.

We need to forbid some symbols (like the period) in our identifiers, so there are two ways to handle that: add a forbidden-symbols parameter to the identifier_check function (or, more likely, add a separate function), or perform such a check outside, in Lua, using utf8.next (in which case no extra changes are needed in the scope of this issue).
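
For illustration, the "check outside in Lua" option could look roughly like the sketch below. It assumes Tarantool's built-in utf8 module and that utf8.next returns the next byte position together with the code point. The names `has_forbidden_symbol` and `FORBIDDEN` are hypothetical and exist only for this example.

```lua
-- Sketch only: has_forbidden_symbol and FORBIDDEN are hypothetical names,
-- not part of Tarantool's utf8 module.
local utf8 = require('utf8')

-- Code points that must not appear in an identifier (here: the period).
local FORBIDDEN = {
    [string.byte('.')] = true,
}

-- Return true if str contains at least one forbidden code point.
local function has_forbidden_symbol(str)
    -- utf8.next(str, pos) yields the next byte position and the code
    -- point, so it can drive a generic for loop over the whole string.
    for _, code in utf8.next, str do
        if FORBIDDEN[code] then
            return true
        end
    end
    return false
end

print(has_forbidden_symbol('space.name'))
print(has_forbidden_symbol('space_name'))
```

With this approach the set of forbidden symbols stays on the application side, so the identifier check itself would not need an extra parameter.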
There is a concern (@Khatskevich) that we should expose the `identifier` symbol class from Tarantool and should not link it with avro-schema identifiers. We can expose a printable-characters class instead (it is just a question of terminology). We should decide whether we want to support the 'valid identifier' term for use in Tarantool applications / modules.