Anders429 / simfile

Apache License 2.0

Parsing of `msd`-style tags #16

Closed Anders429 closed 3 years ago

Anders429 commented 3 years ago

Many simfile formats use msd-style tags, which have the format `#TAGNAME:param0:param1:param2;` with an arbitrary number of parameters. These should be parsed uniformly across all of these formats to reduce code duplication.

For the most part, this will be straightforward. Note, however, that dwi has a `#BACKGROUND` tag that unfortunately has a few exceptions to this rule: it requires `;` characters within the tag's parameters.
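For the simple case, a rough sketch of splitting one tag into its name and parameters could look like the following. It ignores escapes, comments, and the dwi `#BACKGROUND` exception, and `split_tag` is a hypothetical name used for illustration, not something that exists in the crate.

```rust
/// Splits a single tag of the form `#TAGNAME:param0:param1;` into its name
/// and its parameters. Hypothetical helper: escapes and comments are
/// deliberately not handled here.
fn split_tag(tag: &str) -> Option<(&str, Vec<&str>)> {
    // Require the leading `#` and the terminating `;`.
    let body = tag.trim().strip_prefix('#')?.strip_suffix(';')?;
    let mut fields = body.split(':');
    // `split` always yields at least one field, which is the tag name.
    let name = fields.next()?;
    Some((name, fields.collect()))
}
```

Empty parameters are kept as empty strings, so `#TITLE:;` yields the name `TITLE` with one empty parameter.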

Anders429 commented 3 years ago

As best I can evaluate, MSD-style tags can be generically specified as follows:

A parameter is any string of characters in which the characters `#`, `:`, `;`, and `\` are escaped. Escaping `/` is optional, and is only needed to avoid starting a comment when two `/` characters appear in sequence.

A parameter list is one or more parameters, separated by a single `:` character and terminated by a `;` character. Note that these parameters can be empty.
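As a sketch of the two definitions above, a parameter list could be read like this. It assumes `\` is the escape character and that an escaped character is kept literally. It is also lenient: an unescaped `#` is treated as ordinary text. `parse_parameter_list` is a hypothetical name.

```rust
/// Reads one parameter list from the start of `input`, up to the first
/// unescaped `;`. Returns the unescaped parameters and the remaining input,
/// or `None` if the list is never terminated.
fn parse_parameter_list(input: &str) -> Option<(Vec<String>, &str)> {
    // A parameter list always has at least one (possibly empty) parameter.
    let mut params = vec![String::new()];
    let mut chars = input.char_indices();
    while let Some((index, c)) = chars.next() {
        match c {
            // Assumption: a backslash makes the next character literal.
            '\\' => {
                let (_, escaped) = chars.next()?;
                params.last_mut()?.push(escaped);
            }
            // An unescaped `:` starts the next parameter.
            ':' => params.push(String::new()),
            // An unescaped `;` terminates the list (`;` is one byte wide).
            ';' => return Some((params, &input[index + 1..])),
            _ => params.last_mut()?.push(c),
        }
    }
    None
}
```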

A tagged parameter list is the character `#` followed by a parameter list.

A comment is the characters `//`, followed by any string of characters, terminated by a newline character `\n`. Comments are no-ops and are not part of any parameter.
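A minimal sketch of dropping comments, again assuming `\` is the escape character so that an escaped `/` never starts a comment. Keeping the terminating newline in the output is a choice made here to preserve line structure; the definition above does not require it. `strip_comments` is a hypothetical name.

```rust
/// Removes `//` comments up to (but not including) the newline that ends
/// them. Escape sequences are copied through untouched.
fn strip_comments(input: &str) -> String {
    let mut output = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    while let Some(c) = chars.next() {
        match c {
            // Copy an escape sequence verbatim so `\/` cannot open a comment.
            '\\' => {
                output.push(c);
                if let Some(escaped) = chars.next() {
                    output.push(escaped);
                }
            }
            // Two sequential slashes: skip to the end of the line.
            '/' if chars.peek() == Some(&'/') => {
                while let Some(&next) = chars.peek() {
                    if next == '\n' {
                        break;
                    }
                    chars.next();
                }
            }
            _ => output.push(c),
        }
    }
    output
}
```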

An msd-style file is a sequence of parameter lists, tagged parameter lists, comments, and whitespace characters, in any order.

Note that a parameter list can have its terminating `;` character elided when it is followed by a newline and a tagged parameter list.

At this point I'm not sure whether tagged parameter lists always have to appear on new lines, however. That seems to be the standard, but the SM5 MSD parser only cares about a tag following a newline for `;` elision. For now, no requirement will be placed on newlines surrounding tagged parameter lists.
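Putting the definitions above together, a whole-file parser might be sketched as follows. This is illustrative only and not the crate's implementation: `Item`, `parse_msd`, and the helper names are hypothetical. It assumes `\` is the escape character. For elision, it assumes a `#` counts as starting a new tag only when nothing but whitespace sits between it and the previous newline, and it trims trailing whitespace from the parameter that was cut short.

```rust
#[derive(Debug, PartialEq)]
enum Item {
    /// `#` followed by a parameter list.
    Tagged(Vec<String>),
    /// A parameter list with no leading `#`.
    Untagged(Vec<String>),
}

fn parse_msd(input: &str) -> Result<Vec<Item>, &'static str> {
    let chars: Vec<char> = input.chars().collect();
    let mut items = Vec::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i].is_whitespace() {
            i += 1;
        } else if chars[i] == '/' && chars.get(i + 1) == Some(&'/') {
            i = skip_comment(&chars, i);
        } else if chars[i] == '#' {
            let (params, next) = parse_list(&chars, i + 1)?;
            items.push(Item::Tagged(params));
            i = next;
        } else {
            let (params, next) = parse_list(&chars, i)?;
            items.push(Item::Untagged(params));
            i = next;
        }
    }
    Ok(items)
}

/// Returns the index of the newline ending the comment, or the input length.
fn skip_comment(chars: &[char], mut i: usize) -> usize {
    while i < chars.len() && chars[i] != '\n' {
        i += 1;
    }
    i
}

/// Parses one parameter list starting at `i`; returns it with the next index.
fn parse_list(chars: &[char], mut i: usize) -> Result<(Vec<String>, usize), &'static str> {
    let mut params = vec![String::new()];
    while i < chars.len() {
        let last = params.len() - 1;
        match chars[i] {
            // Assumption: a backslash makes the next character literal.
            '\\' => {
                let escaped = chars.get(i + 1).ok_or("dangling escape")?;
                params[last].push(*escaped);
                i += 2;
            }
            // Comments are dropped, not stored in the parameter.
            '/' if chars.get(i + 1) == Some(&'/') => i = skip_comment(chars, i),
            ':' => {
                params.push(String::new());
                i += 1;
            }
            ';' => return Ok((params, i + 1)),
            // Elided `;`: a `#` at the start of a line ends this list and is
            // left unconsumed so the caller parses it as the next tag.
            '#' if starts_line(&params[last]) => {
                let trimmed = params[last].trim_end().to_string();
                params[last] = trimmed;
                return Ok((params, i));
            }
            c => {
                params[last].push(c);
                i += 1;
            }
        }
    }
    Err("unterminated parameter list")
}

/// True if only whitespace follows the last newline in `param`.
fn starts_line(param: &str) -> bool {
    match param.rfind('\n') {
        Some(pos) => param[pos..].trim().is_empty(),
        None => false,
    }
}
```

In this sketch the tag name is simply the first parameter of a tagged list, and a list that reaches the end of input without a `;` is reported as an error, which is a choice the definitions above leave open.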

Anders429 commented 3 years ago

Note also that comments should be skipped when parsing, as they cannot be preserved between file formats and therefore should simply be dropped.