Lucretiel / kaydle

An alternative implementation of Kat's Document Language, including serde integration
Mozilla Public License 2.0

accept unicode as bare identifier #7

Closed tbmreza closed 1 year ago

tbmreza commented 3 years ago

The tests pass. What remains to be done for me is to figure out which error kinds to make.

Please have a look 🤓.

Lucretiel commented 3 years ago

Thank you for the MR! Unfortunately, I don't think it correctly implements the spec either. KDL identifiers are only concerned with code points, not grapheme clusters:

A bare Identifier is composed of any Unicode codepoint other than [decimal digits or non-identifier characters], followed by any number of Unicode codepoints other than non-identifier characters, [where a non-identifier character is]:

  • Any codepoint with hexadecimal value 0x20 or below.
  • Any codepoint with hexadecimal value higher than 0x10FFFF.
  • Any of \/(){}<>;[]=,"

Additionally, KDL identifiers can't contain any whitespace, though the spec doesn't call that out explicitly (https://github.com/kdl-org/kdl/issues/188).
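
For illustration, the rules quoted above can be sketched as a per-code-point check in Rust. This is only a rough sketch, not kaydle's actual code: the function names are hypothetical, "decimal digits" is assumed to mean ASCII 0-9, the whitespace rule is assumed to cover everything `char::is_whitespace` reports, and other parts of the spec (keywords, numeric-looking identifiers) are left out.

```rust
/// Hypothetical helper: true if `c` may not appear in a bare identifier,
/// following only the rules quoted in this thread.
fn is_non_identifier_char(c: char) -> bool {
    // Code points at or below 0x20. A Rust `char` can never be above
    // 0x10FFFF, so the upper-bound rule needs no explicit check.
    c <= '\u{20}'
        // The explicitly listed punctuation: \/(){}<>;[]=,"
        || matches!(
            c,
            '\\' | '/' | '(' | ')' | '{' | '}' | '<' | '>' | ';' | '[' | ']' | '=' | ',' | '"'
        )
        // Assumption: any Unicode whitespace is also excluded (see the linked issue).
        || c.is_whitespace()
}

/// Hypothetical helper: true if the whole of `s` is a bare identifier.
fn is_bare_identifier(s: &str) -> bool {
    // Iterate over code points (`char`s), not grapheme clusters.
    let mut chars = s.chars();
    match chars.next() {
        // The first code point must additionally not be a decimal digit
        // (assumed here to mean ASCII 0-9).
        Some(first) if !first.is_ascii_digit() && !is_non_identifier_char(first) => {
            chars.all(|c| !is_non_identifier_char(c))
        }
        // Empty input or an invalid first code point.
        _ => false,
    }
}
```

Under these assumptions, `is_bare_identifier("ノード")` is accepted, while `"1abc"` and `"a=b"` are rejected.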

tbmreza commented 3 years ago

whoops.

I'll update my PR as soon as I can 🐱‍💻

tbmreza commented 3 years ago

@Lucretiel Ready for review.

Lucretiel commented 3 years ago

Thanks for the update, will review during the next stream (probably Monday)

Lucretiel commented 1 year ago

Sorry to say I totally lost track of this pull request and ended up implementing it myself. I used a similar design, but took advantage of Chars::as_str to convert the chars iterator back into a string. Thank you for the pull request, though!
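
As a rough illustration of the Chars::as_str idea (not the code that actually landed in kaydle), a parser can walk the input one code point at a time and ask the iterator for the unconsumed remainder whenever it needs a string slice again. The function names below are hypothetical, and the character rules are simplified to the ones quoted earlier in this thread.

```rust
/// Hypothetical helper: true if `c` cannot be part of a bare identifier
/// (simplified to the rules quoted earlier in the thread).
fn ends_identifier(c: char) -> bool {
    c <= '\u{20}'
        || c.is_whitespace()
        || matches!(
            c,
            '\\' | '/' | '(' | ')' | '{' | '}' | '<' | '>' | ';' | '[' | ']' | '=' | ',' | '"'
        )
}

/// Hypothetical helper: split a leading bare identifier off `input`,
/// returning `(identifier, rest)`, or `None` if `input` does not start with one.
fn split_bare_identifier(input: &str) -> Option<(&str, &str)> {
    let mut chars = input.chars();

    // The first code point must not be a digit (assumed ASCII) or a
    // non-identifier character.
    let first = chars.next()?;
    if first.is_ascii_digit() || ends_identifier(first) {
        return None;
    }

    loop {
        // `Chars::as_str` returns the part of the input not yet consumed.
        // Capture it before advancing so the terminating code point stays
        // in the remainder.
        let rest = chars.as_str();
        match chars.next() {
            Some(c) if !ends_identifier(c) => continue,
            _ => {
                // Everything before `rest` is the identifier; the byte
                // offset is always on a code point boundary.
                let len = input.len() - rest.len();
                return Some((&input[..len], rest));
            }
        }
    }
}
```

With this sketch, `split_bare_identifier("node 1")` yields the identifier `"node"` and the remainder `" 1"`. Because the remainder comes straight from the iterator, no manual byte-index bookkeeping is needed while scanning.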