cubing / standards

🗃 Cubing Standards — technical specifications outside the current scope of the WCA.
https://standards.cubing.net

Extension syntax and pedantic parsing #3

Open rokicki opened 6 years ago

rokicki commented 6 years ago

I believe SiGN parsers should be pedantic: they should accept the standard and only the standard, and blow up if they see something that is not acceptable by the standard even if it's a reasonable extension to the standard. Not having pedantic parsing raises the spectre of incompatibility.

At the same time, people will want to annotate move sequences with comments, timestamps, narration, etc. For this I propose that we design and adopt a standard extension syntax that can be used for such purposes. The syntax we define should tell parsers how to demarcate an extension section, and ideally provide some sort of initial keyword that lets a parser determine what might be inside the extension. For instance:

Curly braces demarcate extension blocks. An extension block starts with an open curly brace and ends with a close curly brace. Within an extension block, no curly braces may be used. Extension blocks may contain arbitrary whitespace including newlines (in those contexts where we might permit a sequence to contain newlines).

By convention an extension block should start with a short keyword, consisting only of alphanumeric characters, followed by a colon; this sets the type of the extension and permits parsers to determine if they want to try to parse the extension or not.

{timestamp: 100ms} might be a timestamp extension. {comment: .... } might be a comment extension.
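To make the rules above concrete, here is a minimal Python sketch of how a parser might pull extension blocks out of a sequence before handing the rest to a pedantic move parser. Nothing here is part of any spec: the function name `split_extensions`, the return shape, and the choice to raise `ValueError` on stray braces are all assumptions made for illustration.

```python
import re

# Assumed grammar, following the proposal above:
# "{" body "}" where the body contains no curly braces (newlines allowed).
EXTENSION_BLOCK = re.compile(r"\{([^{}]*)\}")
# Conventional prefix: a short alphanumeric keyword followed by a colon.
KEYWORD_PREFIX = re.compile(r"\s*([A-Za-z0-9]+):(.*)", re.DOTALL)


def split_extensions(text):
    """Hypothetical helper: return (moves_text, [(keyword, payload), ...])."""
    extensions = []

    def collect(match):
        body = match.group(1)
        prefixed = KEYWORD_PREFIX.fullmatch(body)
        if prefixed:
            extensions.append((prefixed.group(1), prefixed.group(2).strip()))
        else:
            # No keyword: keep the raw body so the caller can decide what to do.
            extensions.append((None, body.strip()))
        return " "  # leave a separator where the block used to be

    remainder = EXTENSION_BLOCK.sub(collect, text)
    # Pedantic: a leftover brace means an unbalanced or nested block.
    if "{" in remainder or "}" in remainder:
        raise ValueError("unbalanced or nested curly brace")
    return " ".join(remainder.split()), extensions
```

With this sketch, `split_extensions("R U {timestamp: 100ms} R' {comment: easy case} U'")` yields the move text `R U R' U'` plus the pairs `("timestamp", "100ms")` and `("comment", "easy case")`, and a parser that does not recognize a keyword can simply skip that pair.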

By explicitly laying out a syntax for extensions, we permit parsers to be pedantic and interoperable without constraining the possible uses of the standard.

lgarron commented 6 years ago

I think that SiGN should stay a simple format without extensions for block turn puzzles – that's the spirit embodied in its name.

I think having an extension format for LGN makes sense, although I'm not certain it's either necessary or a great idea. One concern I have is that the spec is designed to document valid algs meant for humans to read. I think I want timestamps to look something like @1.2s, and other use cases may also want to use nice, concise syntax. If programs want to communicate more complicated structured data, they should probably use the JSON representation (where we can e.g. specify that programs should just drop move types they don't know). Do you have ideas in mind of extensions that are important to have in the human-serialized format?
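As a rough illustration of the "drop move types they don't know" idea, here is a small Python sketch. The JSON shape is purely hypothetical (a flat list of objects with a `type` field, plus made-up `family` and `ms` fields); the actual JSON representation is not described in this thread.

```python
import json

# Hypothetical: the only node type this particular consumer understands.
KNOWN_TYPES = {"move"}


def drop_unknown(serialized):
    """Parse a JSON list of nodes and keep only those with a known 'type'."""
    nodes = json.loads(serialized)
    return [node for node in nodes if node.get("type") in KNOWN_TYPES]
```

Under those assumptions, a consumer that only knows about moves would quietly ignore a `timestamp` node instead of failing, which is the opposite trade-off from pedantic parsing of the human-readable format.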