tsackton / taelgar

Major update to token parsing and formatting #25

Closed tsackton closed 9 months ago

tsackton commented 9 months ago

This is a major update to token parsing and string formatting. It introduces a new TokenParser class that replaces StringFormatter.

The main function is formatDisplayString, which takes a display string and formats the tokens appropriately, returning the formatted string. A token is a string with a format like: <(prefix)token:filterformat(suffix)>

Prefix and suffix are optional and can contain any character, including whitespace. Token is required and must be letters only (upper and lower case are both fine, but all tokens are converted to lowercase for processing, so endDate and enddate are not distinct). Filterformat is a filter and format string, which can contain any characters except :, (, ), or spaces. It can consist of a single string (a), or two or three strings separated by ;, e.g. (a;b) or (a;b;c).

This processing can be altered if needed / desired, but so far in testing seems to work as expected.
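As a rough illustration of the token grammar described above, here is a minimal sketch of a parser. It is not the TokenParser from this PR: the names TOKEN_PATTERN and parseTokens are hypothetical, and the regular expression additionally assumes that a prefix or suffix never contains a closing parenthesis and that a filterformat never contains < or >.

```javascript
// Hypothetical sketch of the token grammar; not the PR's TokenParser.
// Shape: <(prefix)token:filterformat(suffix)>
// Assumptions: prefix/suffix contain no ")", filterformat contains no "<" or ">".
const TOKEN_PATTERN =
  /<(?:\(([^)]*)\))?([A-Za-z]+)(?::([^:()<>\s]*))?(?:\(([^)]*)\))?>/g;

function parseTokens(displayString) {
  const tokens = [];
  for (const match of displayString.matchAll(TOKEN_PATTERN)) {
    // Optional groups that did not participate are undefined, so defaults apply.
    const [raw, prefix = "", name, filterFormat = "", suffix = ""] = match;
    tokens.push({
      raw,
      prefix,
      // Tokens are lowercased, so endDate and enddate are the same token.
      token: name.toLowerCase(),
      // One, two, or three strings separated by ";" (empty if absent).
      formatParts: filterFormat === "" ? [] : filterFormat.split(";"),
      suffix,
    });
  }
  return tokens;
}
```

For example, parseTokens("<(part of )partof:ty>") yields one entry with prefix "part of ", token "partof", and formatParts ["ty"].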

formatDisplayString also takes a file object (which should have file.name and file.frontmatter properties), a targetDate (the date as of which the string is formatted), and an overrides object, which is appended to (and overwrites) the file metadata. The main uses so far of this are:

Some bugs are not fully resolved. Current issues:

tsackton commented 9 months ago

  • The Mirror of Soul Trapping has a "RangeError: Maximum call stack size exceeded", which probably means infinite recursion somewhere. Vindristjarna has the same error. --> fixed by 4698766
  • Trailing parentheses are being removed somewhere for page dated information and for affiliations, but not for the secondary info line (apparently). --> fixed by eae071e

tsackton commented 9 months ago

Pages with bugs:

Still remaining to address otherwise:

msackton commented 9 months ago

Pages with bugs:

  • The Wave Dancer (Dataview: Failed to execute view '_scripts/view/get_Whereabouts.js'. TypeError: Cannot read properties of null (reading 'split')). Have not found this on other whereabouts; check override formats for errors.
  • Wella Brightmoon (Dataview: Failed to execute view '_scripts/view/get_Affiliations.js'. TypeError: Cannot read properties of undefined (reading 'matchAll')). Arises from some, but not all, leader affiliations; non-leader affiliations seem fine. Also fails on Kaeso, who has a plain array for an affiliation; might have to do with dates?

Both of these are fixed by https://github.com/tsackton/taelgar/commit/847d853bc3a35350e62cb0ab2b2a5c9b14221dde

(The issue with affiliations was the lack of an aNoDates string in metadata.json; the issue with Wave Dancer was that an undefined format string was not handled.)
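Both errors come from calling a string method on a value that is null or undefined. A minimal sketch of a defensive guard follows; it is not the code from the commit above, and the helper names asString, splitFormat, and findTokens are hypothetical.

```javascript
// Hypothetical guards; not the code from the fixing commit.
// Treat anything that is not a string (null, undefined, arrays, ...) as "".
function asString(value) {
  return typeof value === "string" ? value : "";
}

// Split a filterformat such as "a;b;c" without throwing on null/undefined.
function splitFormat(formatString) {
  return asString(formatString)
    .split(";")
    .filter((part) => part.length > 0);
}

// Collect raw <...> tokens without throwing when the display string is missing.
function findTokens(displayString) {
  return [...asString(displayString).matchAll(/<[^<>]+>/g)].map((m) => m[0]);
}
```

With guards like these, a missing format string produces an empty result rather than a TypeError, although silently returning nothing can also hide a missing entry in metadata.json.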

tsackton commented 9 months ago

Casing only happens for the actual token, not the prefix. For example, Battle of Urlich Pass reports "part of The [[Great War]]", which should probably be "Part of The [[Great War]]". The display string here is "<(part of )partof:ty>", so maybe this is a display string bug rather than a code bug (e.g., should it be "<(Part of )partof:ty>"?).

This is fixed, at least for Battle of Urlich Pass, by c4c0d63. Currently, if there are different casing rules for format and firstFormat, format always applies to the prefix. Conceptually it might be more correct to apply firstFormat to the prefix and format to the suffix?

Alternatively, applying casing rules to prefix/suffix text might not be worth the hassle. The main impact is if you have something like "<subtypeof(,)> <(part of)partof:u>" where you don't want "part of" capitalized unless there is no subtype.
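One way to picture the "capitalize only when nothing precedes it" case is to leave prefixes uncased per token and capitalize once, after the pieces of a line are assembled. The sketch below only illustrates that idea and is not what c4c0d63 does; capitalizeFirstLetter and joinPieces are hypothetical names, and the pieces are assumed to be already-formatted token outputs.

```javascript
// Hypothetical sketch: capitalize once per assembled line, not per token.
function capitalizeFirstLetter(text) {
  // A non-global regex replaces only the first ASCII letter it finds.
  return text.replace(/[A-Za-z]/, (c) => c.toUpperCase());
}

// pieces: formatted outputs of each token, "" when a token produced nothing.
function joinPieces(pieces) {
  const line = pieces.filter((piece) => piece !== "").join(" ");
  return capitalizeFirstLetter(line);
}
```

Here joinPieces(["", "part of The [[Great War]]"]) gives "Part of The [[Great War]]", while joinPieces(["battle,", "part of The [[Great War]]"]) gives "Battle, part of The [[Great War]]", so "part of" is capitalized only when there is no subtype. Note that the first letter may fall inside a [[link]], which would change the link text.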