eemeli / message-resource-wg

Developing a standard for Unicode MessageFormat 2 resources

Add a parser for the resource format #13

Closed eemeli closed 1 year ago

eemeli commented 1 year ago

Defines a function with this signature:

/**
 * Parse input into a MessageResource CST.
 * Should never throw an Error.
 *
 * @param source - The full source being parsed
 * @param onError - Error handler, may be called multiple times for bad input.
 */
function parseCST(
  source: string,
  onError: (range: CST.Range, msg: string) => void
): CST.Resource

The resulting CST.Resource corresponds closely to the ABNF, and represents all of its given input. Errors are emitted via a side channel, without stopping the parse.
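As a rough illustration of the side-channel pattern, a caller could gather everything reported through `onError` into an array and inspect it after the parse. The helper `parseCollecting`, the `ParseError` type, and the stand-in `toyParse` below are hypothetical names that are not part of this PR. The sketch is generic over the range type because the shape of `CST.Range` is not shown here.

```typescript
// Hypothetical record of one reported error; R stands in for CST.Range.
type ParseError<R> = { range: R; msg: string };

// Runs any parser with a parseCST-like signature and collects its errors.
// The parser is assumed never to throw, so no try/catch is needed here.
function parseCollecting<R, T>(
  parse: (source: string, onError: (range: R, msg: string) => void) => T,
  source: string
): { result: T; errors: ParseError<R>[] } {
  const errors: ParseError<R>[] = [];
  const result = parse(source, (range, msg) => {
    errors.push({ range, msg });
  });
  return { result, errors };
}

// Stand-in parser used only to exercise the helper; it is NOT parseCST.
// It reports every '!' as an error and keeps going, then returns the
// number of characters it has seen.
function toyParse(
  source: string,
  onError: (range: [number, number], msg: string) => void
): number {
  for (let i = 0; i < source.length; ++i) {
    if (source[i] === '!') onError([i, i + 1], 'Unexpected !');
  }
  return source.length;
}

const { result, errors } = parseCollecting(toyParse, 'a!b!');
console.log(result, errors.length);
```

With the real function, the same helper would presumably be called as `parseCollecting(parseCST, source)`, yielding both the `CST.Resource` and the full list of errors.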

For entry and section identifiers, the full string[] representation is parsed, and errors are emitted for duplicates. Declaring an id after a longer id that starts with it is also an error, i.e. a message a.b always needs to be defined before a.b.c.
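The identifier rule can be sketched as a single pass over the ids in declaration order. This is only an illustration, not the PR's implementation: `checkIdOrder`, `IdError`, and the error messages are hypothetical, and it assumes all ids share one namespace.

```typescript
// Hypothetical error record for an offending identifier.
type IdError = { id: string[]; msg: string };

// Checks ids in declaration order and reports duplicates as well as
// ids declared after a longer id that starts with them.
function checkIdOrder(ids: string[][]): IdError[] {
  const seen = new Set<string>(); // full ids declared so far
  const prefixes = new Set<string>(); // proper prefixes of declared ids
  const errors: IdError[] = [];
  for (const id of ids) {
    // JSON.stringify gives an unambiguous string key for a string[].
    const key = JSON.stringify(id);
    if (seen.has(key)) {
      errors.push({ id, msg: 'Duplicate identifier' });
    } else if (prefixes.has(key)) {
      errors.push({ id, msg: 'Declared after a longer identifier' });
    }
    seen.add(key);
    // Record every proper prefix, so a later shorter id is caught.
    for (let i = 1; i < id.length; ++i) {
      prefixes.add(JSON.stringify(id.slice(0, i)));
    }
  }
  return errors;
}

// a.b before a.b.c is fine; the reverse order reports a.b.
console.log(checkIdOrder([['a', 'b'], ['a', 'b', 'c']]).length);
console.log(checkIdOrder([['a', 'b', 'c'], ['a', 'b']]).length);
```

Like the parser described above, the check records each problem and continues rather than stopping at the first one.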

Values are not parsed yet; that seems like an obvious next step, and I might end up including it in this PR.

Tests are included, with nearly 100% coverage.

eemeli commented 1 year ago

Moving to messageformat/messageformat#396, as that's a better place for this.