[Twee] Extending the Twee format syntax.

iftechfoundation / twine-specs

Specs related to Twine

[Twee] Extending the Twee format syntax. #1

Closed tmedwards closed 5 years ago

tmedwards commented 7 years ago

There's been talk of extending the Twee format to allow for additional properties to be recorded (notably, the story map coordinates in both Twine 1 & 2) since at least as far back as when Twine 1.4 was still in development.

Let's discuss what extensions we'd like to see and, naturally, what syntax changes would be necessary to accommodate them.

[x] Format name: Twee v3.
[x] File extensions: .tw, .twee.
[x] 100% compatibility with Twee v1 not required.
[x] Retain the readability of Twee v1 as much as possible.
[x] Encode the story and passage metadata as JSON. The story metadata should be pretty-printed—i.e., line-broken and indented. Passage metadata must be inline. Metadata decoding errors should be non-fatal warnings—i.e., discard the metadata and attempt to continue processing.
[x] Story metadata attributes (current):
- [x] ifid: (string, required) Maps to <tw-storydata ifid>.
- [x] format: (string, optional) Maps to <tw-storydata format>.
- [x] format-version: (string, optional) Maps to <tw-storydata format-version>.
- [x] start: (string, optional) Maps to <tw-passagedata name> of the node whose pid matches <tw-storydata startnode>.
- [x] tag-colors: (object of tag(string):color(string) pairs, optional) Indirectly maps to <tw-tag> nodes.
- [x] tag:color pairs: (string:string, optional) Maps to <tw-tag name>:<tw-tag color>.
- [x] zoom: (decimal, optional) Maps to <tw-storydata zoom>.
[x] Passage metadata attributes (current):
- [x] position: (string, optional) Comma separated passage icon positional coordinates—e.g., 600,400.
- [x] size: (string, optional) Comma separated passage icon width and height—e.g., 100,200.
[x] Passage header must be a single line and is composed of:
- [x] Double colon start token (::), which must start the line.
- [x] Name, which must directly follow the start token.
- [x] (optional) Inline tag block, which must directly follow the name.
- [x] (optional) Inline JSON metadata, which must directly follow either the tag block or, if the tag block is omitted, the name.
[x] Compiler passages:
- [x] StoryTitle: The project's name. Maps to <tw-storydata name>.
- [x] Start: The default starting passage. May be overridden by the story metadata or compiler command.
- [x] StoryData: The story metadata passage. Must not be published to <tw-storydata>—i.e., the metadata should exist in one place.
[x] Compiler tags:
- [x] script: Signifies that the passage contents are JavaScript code. Maps to the <script type="text/twine-javascript"> node.
- [x] stylesheet: Signifies that the passage contents are CSS style rules. Maps to the <style type="text/twine-css"> node.
[x] Within passage headers, passage and tag names must escape the optional tag and metadata block opening/closing metacharacters—i.e., [, ], {, }.
- [x] Encoding: The escapement mechanism is to prefix the escaped characters with a backslash—i.e., \. To avoid ambiguity, non-escape backslashes must also be escaped via the same mechanism—i.e., foo\bar yields foo\\bar.
- [x] Decoding: To make decoding more robust, any escaped character within a chunk of encoded text yields the character minus the backslash—e.g., \q yields q.
[x] Author-use area.
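
To illustrate the encoding and decoding rules in the escapement items above, here is a minimal Python sketch. The function names `escape_name` and `unescape_name` are made up for this example, and dropping a lone trailing backslash is an assumption; the list above does not say what should happen in that case.

```python
# Metacharacters of the tag and metadata blocks, plus the backslash itself.
META = "[]{}\\"

def escape_name(text: str) -> str:
    """Prefix every metacharacter and every literal backslash with a backslash."""
    return "".join("\\" + ch if ch in META else ch for ch in text)

def unescape_name(text: str) -> str:
    """Any escaped character yields the character minus the backslash."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Take the next character verbatim. A lone trailing backslash is
            # dropped here (assumption: the list above leaves this undefined).
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)
```

Decoding is deliberately more permissive than encoding, so hand-written escapes such as `\q` still decode to something sensible.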

NOTES: The above task list is not meant to be absolute. There are probably additional items that I'm not thinking of at the moment that should be on the list.
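
As a rough sketch of how the passage header rules in the task list might be applied, the following Python splits a single header line into name, tags, and metadata. It is not taken from any existing compiler: `parse_header` and its helpers are hypothetical names, tags are assumed to be whitespace-separated, and a metadata decoding error is reported with `warnings.warn` and then discarded, following the "non-fatal warnings" item.

```python
import json
import warnings

def _unescape(text):
    # Any escaped character yields the character minus the backslash.
    out, chars = [], iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")
        out.append(ch)
    return "".join(out)

def _find_unescaped(text, target, start=0):
    # Index of the first occurrence of target not preceded by a backslash, or -1.
    i = start
    while i < len(text):
        if text[i] == "\\":
            i += 2
            continue
        if text[i] == target:
            return i
        i += 1
    return -1

def parse_header(line):
    """Split ':: Name [tags] {json}' into its parts; None if not a header."""
    if not line.startswith("::"):
        return None
    rest = line[2:]
    tags, metadata = [], {}
    # The metadata block, if present, runs from the first unescaped '{' to the end.
    meta_open = _find_unescaped(rest, "{")
    head = rest if meta_open == -1 else rest[:meta_open]
    meta_text = "" if meta_open == -1 else rest[meta_open:].strip()
    # The tag block, if present, sits between the name and the metadata.
    tag_open = _find_unescaped(head, "[")
    if tag_open != -1:
        tag_close = _find_unescaped(head, "]", tag_open + 1)
        if tag_close == -1:
            tag_close = len(head)
        tags = [_unescape(t) for t in head[tag_open + 1:tag_close].split()]
        head = head[:tag_open]
    if meta_text:
        try:
            metadata = json.loads(meta_text)
        except json.JSONDecodeError as err:
            # Non-fatal: discard the metadata and keep processing.
            warnings.warn(f"ignoring bad passage metadata: {err}")
    return {"name": _unescape(head.strip()), "tags": tags, "metadata": metadata}
```

For example, `parse_header(':: Do the thing! [some tags] {"position":"600,400"}')` gives the name `Do the thing!`, the tags `some` and `tags`, and a metadata dict with a `position` entry.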

tmedwards commented 7 years ago

I suppose the initial questions must be:

What extensions do we want to see?
- For example, I'd think that story map coordinates are definitely in the running.
Can we support all extensions via one syntax change? It would probably be best if we could keep this simple.
[Devil's advocate] Do we even need to extend the Twee syntax at all?
- For example, we could create additional Twine system tags instead—e.g. coordinates could be done as Twine.x### and Twine.y###. This could quickly become cumbersome, however, and doesn't align well with anything else—e.g. how the data is used internally and/or stored in the data chunks.

klembot commented 7 years ago

I don't do much with Twee anymore. But it seems to me that all we may need is a way to encode key/value pairs for passage metadata.

There are a number of other systems out there that we could use as points of comparison-- Squiffy is one that comes to mind right off the bat.

tmedwards commented 7 years ago

Assuming, by Squiffy, you mean the tool at textadventures.co.uk, I'm unsure how it's relevant. I mean, yes, I get that it's another text-based IF format. Its basic design philosophy is also fairly different from Twee/Twine. It was not my intention to suggest radical changes to the Twee format or its basic philosophy, so I'm unsure how helpful looking at Squiffy would be.

Perhaps I've misunderstood what you meant?

Obviously, any metadata should be sanely formatted, whether that requires encoding or not. Though, I'd think that keeping it as human-readable as possible would be for the best. That's also kind of putting the cart before the horse.

I believe that we need to be able to store two types of metadata: project metadata and passage metadata.

Project Metadata

I think simply enshrining a special passage for the purpose would be sufficient. In fact, we could simply use the StorySettings special passage, which was introduced in Twine 1.4. While it's managed in Twine 1.4 via a dialog, the actual syntax of the passage contents is simple {key}:{value} pairs, one per line. For example:

debug:on

The current Twine 1.4 implementation seems to ignore pairs that it isn't set up to manage, so using StorySettings to store additional metadata should already be compatible with Twine 1.4.
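
For illustration only, here is a small Python sketch of reading {key}:{value} pairs, one per line, in the way described above. `parse_story_settings` is a hypothetical name, and skipping lines without a colon is an assumption rather than documented Twine 1.4 behaviour.

```python
def parse_story_settings(text):
    """Read 'key:value' lines into a dict, skipping anything else."""
    settings = {}
    for line in text.splitlines():
        # Split at the first colon only, so values may contain colons.
        key, sep, value = line.partition(":")
        if not sep or not key.strip():
            continue  # not a pair (assumption: silently ignored)
        settings[key.strip()] = value.strip()
    return settings

# A consumer only looks up the keys it knows about; extra pairs such as
# an IFID ride along without getting in the way.
settings = parse_story_settings("debug:on\nifid: D674C58C-DEFA-4F70-B7A2-27742230C0FC")
debug_enabled = settings.get("debug") == "on"
```

The IFID value in the usage lines is just a placeholder UUID.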

As a personal anecdote: the current version of Tweego stores requisite Twine 2 project metadata (e.g. IFID) within StorySettings. I chose to use it since doing so didn't require changes to the Twee syntax and it was an existing special passage whose purpose was specifically storing compiler and/or story format metadata.

Passage Metadata

I think we should extend the passage header syntax to include an optional metadata section, because shoehorning metadata into the existing tag syntax doesn't seem like a smart way to go about it. Especially, if it's metadata which would be fairly common—e.g. story map coordinates.

Twine 1.4 and compatible compilers currently do use special tags to handle some metadata. I'm thinking specifically of the Twine.{thing} tags—e.g. Twine.image to denote image passages. These tags are virtually all special cases, however, and not something you see on every passage.

I still think we need to determine what passage metadata we are likely to want. I say this because there's no reason to come up with a system which can handle anything if we end up only using positional metadata.

For example, if we only need positional metadata, then key/value pairs are probably overkill:

/* Overkill, if we only need positional metadata. */
:: Do the thing! [some tags] <x=250 y=400>

/* About the right amount of kill, if we only need positional metadata. */
:: Do the thing! [some tags] <250,400>

klembot commented 7 years ago

By mentioning Squiffy, my point was just that it can't hurt to take a look at the landscape and see how other people have solved this problem. I also wonder if we can borrow from YAML, since that's probably the most human-readable format I've seen to encode data.

Agree on the two types of metadata. I think we want to come up with a spec for arbitrary data and leave it as that instead of amending it as we go, because inevitably we'll want new things over time, so we may as well solve that problem once. (One big one that I can think of is that the next version of Twine will allow passages of different sizes.)

One question that comes to mind is, do we want to allow only simple values for metadata? e.g. would we someday want to encode arrays, or heaven forfend, nest structures?

I forget, do we have an official way to comment Twee source code? It would be nice if whatever we come up with can be ignored by compilers that don't know about it, but I don't recall deciding on one.

tmedwards commented 7 years ago

Okay, yes, looking elsewhere definitely couldn't hurt. Squiffy is something of a red herring though, which is what threw me—as in, I don't think they've done anything similar, at least not for passage metadata.

I cannot for the life of me think of a good use case for anything other than simple values, say like the JavaScript primitive types. Maybe lists at the outside, but I'm struggling with what we might need that for. All I can think of is YAGNI—that's not really an argument though, I realize.

There is no official comment syntax, for the file as a whole or per-passage, baked into the format, no. There probably never seemed a need since Twee was merely a container for story format passages, which have their own comments. Of course, "official" here means, at best, as implemented by the official Twee compiler, since there is no official specification at present—I say, at best, because the twee compiler really hasn't been a going concern in a while, so other compilers likely have more mindshare.

There is something of an ad-hoc way to comment a Twee file as a whole. AFAIK, most Twee compilers ignore anything before the first passage header, so comments could be placed there. For example:

This file may contain peanuts.

:: Snoopy [dog peanuts]
Meh....

EDIT / PS: The only thing all Twee compilers agree on, specifically, is the passage header—all three parts of it: sigil, passage name, and optional tag section. Ignoring anything before the passage header is mostly inertia, not something that's done for a specific reason.
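
To make the "ignore anything before the first passage header" behaviour concrete, here is a minimal Python sketch. It is not how any particular compiler does it; `split_twee` is a hypothetical name, and a header is assumed to be any line starting with `::`.

```python
def split_twee(source):
    """Return (prologue_lines, passages); the prologue is text before the first header."""
    prologue, passages, current = [], [], None
    for line in source.splitlines():
        if line.startswith("::"):
            # A new passage header starts a new passage.
            current = {"header": line, "body": []}
            passages.append(current)
        elif current is None:
            # Still before the first header: de facto comment area.
            prologue.append(line)
        else:
            current["body"].append(line)
    return prologue, passages

prologue, passages = split_twee(
    "This file may contain peanuts.\n\n:: Snoopy [dog peanuts]\nMeh....\n"
)
```

A compiler built this way would simply drop `prologue`, which is why text placed there works as a comment today.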

greyelf commented 7 years ago

re: story map coordinates

Instead of adding this data to each Passage within the twee file itself, would it make more sense to create a companion json file (eg. *.map) to store that data in? My reasoning is:
a. The data is currently meaningless to the twee domain itself and is only really needed at the time of either: converting a twee project back to an application project; or creating a visual story map using some other method.
b. Having the data in the twee file itself would allow an Author to easily corrupt/delete it.
c. It allows easier access to the data to any third-party story map utility.

tmedwards commented 6 years ago

@greyelf I think having separate files is simply asking for problems. We have to consider the lowest common denominator here user wise.

Note: Markdown doesn't do lettered lists, so the following is an in-order reply to your list.

For decompiled Twine projects, it's true that coordinates would not be necessary to compile the resultant Twee source back into HTML. That said, if a user wants to import the recompiled HTML into Twine, the coordinates will come in handy, so….
Having it in a separate map file isn't really better. The map file could easily be lost. The user could rename a passage, but forget to also do so in the map file. Et cetera.
How? The map file would need to be in a format the utility understood. The map coordinates are pretty much useless by themselves, so the utility would still need to know how to parse Twee files. If the utility knows how to parse Twee files, then there's no point in using a different format. If there's no point in using a different format, then the Twee sources could be read just as easily as separate map files.

@klembot Have you had any other thoughts about this? I'd like to support coordinates soon in Tweego and I'd rather not just fiat something, but needs must.

Additionally. I'm planning to support colored tags via the StorySettings special passage, as I do with the IFID (mentioned above), though I haven't yet decided how to encode them. Any thoughts on that?

tmedwards commented 6 years ago

@Dan-Q @klembot @greyelf Paging Dr. Detroit.

greyelf commented 6 years ago

re: Passage Metadata Would it be a good idea to use a format similar to JSON, which may make it easier to parse & extend as needed? It may also be easier for the end-user to understand, and (if the story format Developers add an associated API) for them to use within the story itself at run-time.

:: Do the thing! [some tags] {x: 250, y: 400}

videlais commented 6 years ago

Differences:

Looking across the Tweego, Twee2, and Twee (Twine 1) implementations detailed in this documentation work, the major differences seem to be in the following areas:

Coordinates (currently only Twee2 but is being considered for Tweego) [See on-going discussion over syntax]
Including other files. Tweego ignores StoryIncludes. Twee2 supports StoryIncludes for compiled files and also uses the @includes keyword. Twee (1) uses StoryIncludes.

(There are also differences in what tags are supported and how they are acted upon, but I consider that totally up to the tool developer's purview to add or support.)

Suggested Coordinate Syntax:

Twee2 uses less-than and greater-than signs (e.g. <123,456>). tmedwards seems to also like this as a "good amount of kill."

Greyelf has suggested the use of curly brackets (e.g. {x: 250, y: 400}).

Additional Issues:

klembot mentioned supporting the new passage size options in the format as well.

My opinion on that is to use whatever isn't used for the coordinate system as the "size" area, if that is also wanted.

Will Twine 2 ever support Twee again?

Part of this discussion is on how Tweego and Twee2 implement the "standard" way of doing Twee code. I can't remember if this was already answered, but I'd think an issue to discuss is also if Twine 2 will ever support exporting or importing Twee code. If it won't, then, well, the conversations around coordinates and even more so passage size support aren't as important, I'd think.

greyelf commented 6 years ago

If the 'size' of the Passage boxes needs to be tracked, do the (tag) colour assignments also need to be tracked?

tmedwards commented 6 years ago

Twee notation support in Twine 2

@videlais Even if Twine 2 never supports importing Twee notation—n.b. I think it should—this is still important for interoperability between Twee compilers and Twine 2.

For better or worse, people are already moving projects back and forth between compilers—generally, between Twine 2 and something else—so we're already living in that brave new world.

`StoryIncludes` special passage

The StoryIncludes was always a, somewhat hacky, way for Twine v1.4 to allow one project to include another, since Twine 1 projects are singular files.

Twee v1.4 supports it largely because it and Twine v1.4 share most of the same back-end code.

As a guess, Twee2 supports it for compatibility reasons—I assume DanQ actually preferred his @includes syntax.

I decided not to support it in Tweego for two reasons: it adds a layer of complexity in regards to path resolution and it's completely unnecessary for a compiler which allows projects to span multiple files and directories. I could support it for compatibility's sake, but I think we're better off without it, and transitioning projects away from it is dead simple.

Passage Metadata

There's already been an expressed desire to support more kinds of passage metadata than simply coordinates. Beyond that, if we're going to standardize an extension to Twee's passage metadata format, then we really should make it forward looking so we don't need to do this again—that's the major problem with Twee2's coordinate extension.

@videlais I actually do not like Twee2's extension. I said, if we only need positional metadata then that would be fine. I agree with CK that it would be better if we came up with something which allowed arbitrary passage metadata.

@greyelf Using a structured data format—e.g. JSON—would be nice, but I'm a bit leery of something which could be easily broken by users—though perhaps I'm being overly cautious. On the story format side, there's no need for an additional API since passage metadata is encoded as HTML content attributes—i.e. there's already an API provided by the DOM at the basic level (beyond whatever story formats themselves provide; e.g. most give easy access to tags).

At the moment, I think I'd prefer to see something similar to HTML content attributes, since that would mirror what Twine 2 does fairly well. For example:

:: Do the thing! [some tags] {position="250,400" size="2"}

Parsing that could be problematic, however, so a structured data format which already has broad library support in various languages—e.g. JSON—might be a better idea. For example:

:: Do the thing! [some tags] {"position":"250,400","size":"2"}

Regardless of the format that's ultimately chosen, I think that we do want to mirror Twine 2's metadata as much as possible.
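
As a hedged sketch of what "mirroring Twine 2's metadata" could look like on the compiler side, the Python below decodes the inline JSON from the example above and copies the keys it recognises onto `<tw-passagedata>` content attributes. `passage_attributes` is a hypothetical helper, and copying only `position` and `size` is an assumption for this example.

```python
import json
from html import escape

def passage_attributes(name, tags, metadata_json, pid):
    """Build the attribute string for a <tw-passagedata> node."""
    meta = json.loads(metadata_json)
    attrs = {"pid": str(pid), "name": name, "tags": " ".join(tags)}
    # Only copy the metadata keys this sketch knows about.
    for key in ("position", "size"):
        if key in meta:
            attrs[key] = str(meta[key])
    return " ".join(f'{k}="{escape(v, quote=True)}"' for k, v in attrs.items())

attrs = passage_attributes(
    "Do the thing!", ["some", "tags"], '{"position":"250,400","size":"2"}', 1
)
```

Because the JSON values are already strings in the same shape as the HTML attributes, the mapping is close to a straight copy.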

Story Metadata (was: tag colors)

@greyelf Though they're story metadata, I'd say that yes tag colors should be supported—in fact, they're on Tweego's TODO list.

My gut feeling is that we could/should include them in the StorySettings special passage[1] key/value pairs, though the actual mechanics of encoding them is up for debate.

For those unfamiliar with Tweego, I decided to use the StorySettings special passage for story metadata for a couple reasons: it's already the standard story metadata mechanism via Twine/Twee v1.4 and entries which are unknown to Twine/Twee v1.4 are ignored.

videlais commented 6 years ago

Twee notation support in Twine 2

@tmedwards: I agree. And that's kinda my point, albeit in an indirect way. People are already translating back and forth, so it behooves us to figure out what we all like and agree on so that other people have a standard way of translating Twee between things.

I don't know if it is directly @klembot's call or not, but I'd like to push that as something to put into Twine 2 at some point. Luckily, and something I'd already been using for work in the Twine Cookbook, there exists Entweedle. I could see something like that as an optional story format choice that works as it does now and simply shows the passages as Twee code.

Moving away from `StoryIncludes` special passage

I also agree. I like the idea of @includes as something to think about to solve the same issue in a better way that could be an optional part of the standard.

Passage Metadata

I agree about the more JSON approach. Treating the metadata as something which could have any number of optional things like coordinates, colors, sizes, or other data would work well for this. Plus, like tmedwards wrote, there exists a large number of JSON-parsing tools and libraries.

Where do we `Start`?

While I'm thinking about it, one of the issues with converting out of Twine 2 into Twee is that the 'start' passage is not marked as such. Right now, Entweedle is a one-way (out of Twine 2) process. If people are ever going to translate into Twine 2, a decision has to be made about what that looks like. In the Twine 1 era, that was the Start passage.

If we want to go with the assumed process that looks like Twine 1 but uses more of Twine 2's approach, that could simply be the first passage in the Twee content. Whatever is first in the file, that's the 'start' passage. It gets around the "Should we use a tag or metadata issue?" question through simply assuming the first one is always first and that any others are 'read' into and out of the storydata/store-area/passage HTML content in whatever order they appear from there.

greyelf commented 6 years ago

Where do we Start?

There are a number of issues with assuming that the startnode will be the first passage within a TWEE file, two of them being:
a. This is only guaranteed to be true for an unmodified export of a Twine 2 project.
b. It would require the re-write of all existing cookbook recipes (1)

TweeGo currently uses the StorySettings special passage to set the story's IFID, which like startnode is another story level value that needs to be tracked.

I suggest using a cascading set of rules to determine the 'launch' passage during an import:
a. Check the StorySettings special passage.
b. Check for a Start passage.
c. Assume the first non-special tagged passage.

(1) which have the StoryTitle special passage first and that is generally followed by the optional Story JavaScript and Story Stylesheet related special passages.
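
The cascading rules suggested above could be sketched in Python roughly as follows. This is only an illustration: `pick_start` is a hypothetical name, the `start` key in StorySettings is an assumed spelling, and the sets of "special" passage names and tags are examples rather than a complete list.

```python
# Example sets only; a real importer would need the full lists.
SPECIAL_NAMES = {"StoryTitle", "StorySettings", "StoryIncludes"}
SPECIAL_TAGS = {"script", "stylesheet"}

def pick_start(passages, settings):
    """Choose the launch passage: settings entry, then Start, then first ordinary passage."""
    names = [p["name"] for p in passages]
    # a. An explicit entry in StorySettings (hypothetical 'start' key).
    wanted = settings.get("start")
    if wanted in names:
        return wanted
    # b. A passage literally named Start.
    if "Start" in names:
        return "Start"
    # c. The first passage that is neither a special passage nor specially tagged.
    for p in passages:
        if p["name"] not in SPECIAL_NAMES and not SPECIAL_TAGS & set(p["tags"]):
            return p["name"]
    return None
```

Falling through to rule b when the StorySettings entry names a missing passage is a design choice of this sketch; an importer could equally treat that as an error.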

tmedwards commented 6 years ago

RE: Where do we `Start`?

There is actually no guarantee about which node is the starting passage in Twine 2 data chunks. User actions[1] can easily change which node is the starting passage, so whether via an unmodified export or not you cannot assume the starting node.

I have, generally, been of the opinion that enshrining the Start special passage is preferable, since that's the Twine 1/Twee default—frankly, dropping it in favor of PIDs is one of the things I think Twine 2 got absolutely wrong. Modern Twee notation compilers (Twee2 and Tweego) do allow you to select a different starting passage, but it's hard-coded in Twine 1/Twee, so that's a hard requirement if you were attempting to move your project to Twine 1—I'm assuming here that either no one would actually choose to use Twee (the compiler) or anyone who does probably isn't interested in interoperability with Twine 2.

That said. An entry in the StorySettings special passage would likely be friendlier when moving from Twine 2 to a Twee notation compiler—though, leading/trailing whitespace could be an issue, since Twine 2 considers such whitespace to be significant while, IIRC, Twine 1/Twee do not—I know Tweego does not and assume Twee2 is similar. That's a somewhat generic issue, however, as some (all?) Twee notation compilers and Twine 1 either forbid or ignore leading/trailing whitespace in passage names while Twine 2 currently allows it.

I don't just mean simply selecting a new passage as the starting passage—though that's part of it. Passage renaming, duplication, or various issues which require that passages be recreated can all cause changes in the node ID (PID) for a given passage.

videlais commented 5 years ago

`Start`

If I'm understanding everything here, it reads as if the issue is with Twine 2. Reining in how it treats whitespace would bring it in line with the existing compilers and the assumptions of Twine 1. Putting the starting passage in the StorySettings special passage could potentially solve the issue of compatibility.

`StoryIncludes`, `@include`, or something else

Tweego doesn't support StoryIncludes (because it doesn't need to). Twee2 supports StoryIncludes AND @include.

Since I assume there won't be movement to adopt StoryIncludes by Tweego, how about talking through what a @include-directive might look like? In Twee2, it acts to merge the contents of another Twee file into the current one. Potentially, this would push Twine 2 in a major direction of supporting Twee directly, but it would also be an easier way to "package" things as a Twee file. (This would be unnecessary in Tweego, of course, but it would put both Twee2 and Tweego as the standard Twine 2 could then adopt.)

greyelf commented 5 years ago

I foresee having the ability to add an @include directive to the contents of a Twine 2.x Passage as a quick journey to "why doesn't it work" hell, especially if the end-user wants to use a Relative URL.

I would argue against adding such a directive to the twee standard if it means also needing to add support for such to the Twine 2.x application.

mcdemarco commented 5 years ago

I like @greyelf’s idea of a separate JSON file for Twine 2 metadata, but I would make it a special passage instead, or just a really large chunk of JSON inside the StorySettings.

I use a JSON StorySetting key:value pair for keeping settings for my proofing format DotGraph in my stories, e.g.:

dotgraph: {"rotation": "LR", etc.}

It’s been a bit fiddly and I wouldn’t recommend it for any text the user might actually want to edit.

That being said, I wanted to repeat some things I’ve said elsewhere, for the record:

Of the choices, I prefer the optional <10,30> notation for coordinates; if size is wanted it could be added as an (also preferably optional) third axis <10,30,size>. I don’t like the idea of putting coordinates into a long JSON string at the end of the passage title row, and certainly not if doing so for the sole purpose of some nebulous future functionality. I agree that YAGNI; we have had only one new feature of this sort in the entire history of Twine (passage sizes—coordinates always existed but in happier times twee just left them out) and no one suggested a potential new passage setting the last time I asked.
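
Purely as an illustration of the optional `<10,30>` notation with an optional third component, here is a short Python sketch. `split_coordinates` and its regular expression are hypothetical, and treating the third field as an opaque size token is an assumption, since no concrete size syntax has been agreed.

```python
import re

# Trailing '<x,y>' or '<x,y,size>' at the end of a passage header line.
COORDS = re.compile(r"\s*<\s*(-?\d+)\s*,\s*(-?\d+)\s*(?:,\s*([^>\s]+)\s*)?>\s*$")

def split_coordinates(header):
    """Return (header_without_coordinates, coordinates_or_None)."""
    match = COORDS.search(header)
    if not match:
        return header, None
    x, y, size = match.groups()
    return header[:match.start()], {"x": int(x), "y": int(y), "size": size}
```

A header without the block is returned unchanged, so the notation stays optional for files that never had coordinates.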

Still, it would do no harm to specify that the passage title row may include a currently-unused JSON object at the end, if it will settle the issue.

mcdemarco commented 5 years ago

Combining @klembot's suggestion of YAML and @tmedwards' observation that most Twee compilers ignore any text before the first passage sigil, another potential location/approach for story metadata is in a YAML header for the Twee file. I think I prefer that to both JSON and the untrammeled growth of StorySettings.

videlais commented 5 years ago

Based on a conversation with @mcdemarco and my own thoughts on it today, how about splitting Twee into versions? That is, Twee as it stands now, where it more-or-less works across tools, becomes Version 1.

Going forward, we start on a Version 2. Instead of trying to merge parts together, we can all work together to make a completely new version.

I feel like, at least for me, I was trying to find a way to make things work with Twine 1, but I think it might make more sense to work with Tweego and Twee2 to make a good, new standard we all like that then Twine 2 can use. Instead of anchoring ourselves to Twine 1, we can make a standard that is forward-facing and then, if needed, make tools to convert between them.

greyelf commented 5 years ago

I agree with the idea of splitting the Twee spec into two versions.

Current / As generated by Twine 1.x
New improved with bells & whistles.

@videlais I am confused by your "I was trying to find a way to make things work with Twine 1..." statement because it sounds like Twine 1.x is a road block to defining Version 1 of the TWEE spec.

I regularly work-on / debug other people's Twine 1.x plus SugarCube based projects, and I do that by first exporting them to TWEE and then use TweeGo to build the WIP story HTML file (after I have added the required ifid in a StorySettings passage).

The only potentially inconvenient part is the cutting-n-pasting that may be required to get the new passage content back into the Twine 1.x TWS project if the Author has actually laid out the passage in a meaningful way, which isn't always the case.

mcdemarco commented 5 years ago

I guess I wouldn't want to go so far with the changes that a [single] Twee 1 file wasn't still, technically, a Twee 2 file and in particular readable by Twine 2 (should it start importing twee at all). I don't see any particular reason for a fresh Twee 2 file to be importable into Twine 1, considering that it's a fairly corner case that other tools could address.

Edit: I don't think we need to start out supporting the import/multi-file functionality of Twine/Twee 1; that seems like a bigger chunk of work than speccing the plain-text format per se.

videlais commented 5 years ago

@greyelf: That's an excellent point I hadn't considered. I think, in my mind, I was placing Twine 1 along with Tweego and Twee2. That is, trying to find a way to make something to work between them first and then building outward. It might make more sense to make something that can read Twee Version 1 but that exports Twee Version 2 by default.

Basically, only needing to change Tweego and Twee2 code to support a new import/export version. Twine 1 would continue to export the older version and, moving forward, future tools would use the new format by default. It'd be easier, I'd think, to add a new command-line flag for Twee versions than to try to work through adding Twine 1 in the mix.

@mcdemarco: That's my thinking, too. Let's spec-out a new format. Twine 1 Twee is the base. It won't be changed at this point. If we started with "What did we learn from years of using Twee?" instead of "How do we hack in some new metadata?" we can put in more complex data handling and encoding that starts with readability in mind instead of trying to figure out how best to use the current system. I mean, let's not reject it outright, but we can use new symbols and be stricter about spacing, say, than trying to do everything Twine 1 Twee is doing.

videlais commented 5 years ago

Does working on a new (but similar!) format make sense to you, @tmedwards? I'm not as well-versed in the issues as I'd like, but it seems like a few of us think making a new format, and removing Twine 1 from the equation (at least for now), would help us move forward.

As the developer for Tweego, do you feel that adding a new format via a flag or other functionality would be easier for you to implement?

@greyelf and @mcdemarco: Do we feel like it is time to start making some examples of what the changes would look like? Like, if we want StoryIncludes to have JSON, what would that look like in practice? If we need to encode coordinates, size, and possibly colors (although that's only a beta thing in Twine 2 at the moment), what would that look like?

tmedwards commented 5 years ago

I didn't want to proliferate yet another Twee notation format—we already have two (one and a half?) with Twee2's coordinate extension—but if that's what it takes to move this forward, then I suppose I can live with that.

If we are going to draft a new format, then we should be careful to keep our concerns separate—i.e., core parts of the format vs. compiler and story format concerns. There's already been some conflation in this discussion which could become problematic down the road.

As to how to trigger processing of the format. My base preference would be automatically by file extension. In other cases, doing either would be fine—either meaning via extension or via the interface.

Structured Data

Using YAML is a complete non-starter as far as I'm concerned—YAML was an early response to XML, but it was most definitely not a good one. It would be an absurd amount of complexity, on both ends, for very little return. My biggest issue, aside from the general complexity, is that I don't think the current specification even covers all of its common corner cases. And you can forget differing implementations as they surely don't cover them, which makes interoperation a huge pain. Then there's all of the human-level ambiguity about what it parses into—both spec issues and general violations of the principle of least surprise. We do not want something which would routinely be biting end-users on the arse.

If we're going to draft a new format, then I'm also against using JSON. It's a fairly good format for its originally intended purpose, which is data interchange; e.g., machines can parse it easily and humans can read and write it relatively easily. It's not so great, however, when used for configuration and metadata, especially metadata that end-users are supposed to edit by hand.

If we want to use a more-human-friendly structured data format for things like story metadata, then something like TOML would be preferable.

videlais commented 5 years ago

I was interested to play with TOML some, so here's an example I created that uses what I personally think of as optional attributes coordinates, size, color, and two required attributes tags and content.

I ran it through the @iarna/toml Node module and was very easily able to parse it and pull out its attributes per "passage" (table) object.

# This is a TOML comment

title = "Twine TOML Example"

["Start"]
"Start".tags = "some, random, tags"
"Start".coordinates = [100, 100]
"Start".size = "large"
"Start".color = "red"
"Start".content = """
<include "Another Passage">
Show this text here"""

["Another Passage"]
"Another Passage".tags = "other, tags"
"Another Passage".content = """
This is another passage!"""

["$%^&*("]
"$%^&*(".tags = "89383737, 353678^^, ##$$()()("
"$%^&*(".coordinates = [10000000, -10]
"$%^&*(".content = """
Gibberish"""

It's not as readable as Twee, but as far as being able to add new metadata in the future, it's far superior. I could also see easily implementing StorySettings as a set of key-value pairs encoded this way.

Comments

Something I really love about this, and Twee does not have, is the ability to include comments. To include comments in Twee now, I've been using HTML comment elements, which is not particularly clean.

Common Corner Cases

What are some corner cases we haven't included yet, @tmedwards?

If we think making a list would help, that could be included here. I can think of some unusual things like trying to parse huge files that also contained complex metadata that could crash a browser parser.

mcdemarco commented 5 years ago

@videlais My original suggestion was to use YAML for the header and story metadata, not to change the general layout of a Twee file (the passage data). Your example of Twee as TOML only shows how much more readable the existing Twee format is (even with the Twee2 position data extension). Any changes that major should no longer be called Twee. I'd propose TweeML, but I don't think changing the twee format in a major way is a good direction to go in; no one has really requested that that I can tell. I thought we were close to pinning down some requested extensions to Twee; maybe discussion of a totally new format should go into a new GitHub issue.

@tmedwards I like the idea of a TOML header for a Twee file, as long as it, like YAML, would be ignored by old Twee processors. It seems more readable than YAML. I agree that JSON is not a particularly good direction to go in, either, but I'm not sure how else to handle requests for unspecified future functionality. TOML could certainly do it for the story metadata, but not for unspecified future passage data.

I was thinking about tossing comments into the JSON object I'd suggested adding to the end of the passage headers. We could put them there raw, instead, after a hash or another as-yet-unused character. We could also allow them in the TOML header. But handling comments in the full file is a big ask; the data is too free-form across story formats. (One man's unused comment character is another man's essential markdown directive, etc.) We'd have to use something we're reasonably certain isn't appearing in stories, like, say, triple colons at the start of a new line.

greyelf commented 5 years ago

and Twee does not have, is the ability to include comments.

I'm a little confused by this statement, or maybe I don't understand the type of comments you mean. Are we talking about which comment formats are supported by TWEE, or which are supported by the Story Formats themselves?

Because I currently use the following two formats of block based comments in my existing TWEE (plus SugarCube) related work-flow. They are imported correctly by the Twine 1.x application and TweeGo also has no issues with compiling files that include them.

:: Start
/% A markup based block comment %/
Some passage content.

/* A standard C / Java / JavaScript block comment. */
Some more passage content.

If we end up changing the Standard Passage Title & Content areas to be TOML based then I strongly suggest leaving the term Twee out of whatever name is given to the new file layout, as I believe that would just lead to confusion when an end-user tries to use some Old Twee they found on the internet within a New 'Twee' file.

tmedwards commented 5 years ago

First. I think we should adopt a Perl-ism as a guiding principle here: Easy things should be easy and hard things should be possible.

@greyelf I could be wrong, however, I believe the general idea is that @videlais would like one comment style to rule them all—i.e., one comment style that could be used regardless of the story format.

@videlais I'm in the same boat as @mcdemarco here. My structured data comments were meant for use as story and passage configuration/metadata, rather than to completely redefine the format.

Even with deciding to work on an extended/new Twee format (working name? TweeNG), rather than shoehorning these extensions into Twee-v1, I think the basic Twee-v1 format is mostly viable. I think we should still probably start with that as a base and work from there.

@mcdemarco I'm all for reserving the area before the first passage in TweeNG files as an author-use chunk, to be ignored by compilers—e.g., for comments or whatever. As for reserving that area as an actual structured data header, I'm unsure what we'd use it for. The only real need we have for non-passage config/metadata now is for the story, which is best handled in one place—e.g., the StorySettings special passage or something along those lines.

As for passage data, I'm unsure why TOML wouldn't be a good fit for that. Using an inline table seems easy enough. For example, a TOML inline table added to the current Twee passage header syntax:

/* <PASSAGE_NAME> <OPTIONAL_TAGS> <OPTIONAL_METADATA_AS_TOML_INLINE_TABLE> */
:: Do the thing! [some tags] {x=250, y=400}

As another example, we could use an inline table that includes everything but the passage name within the table:

:: Do the thing! {tags=['some', 'tags'], x=250, y=400}

I don't particularly like that, however, as it makes using tags more complicated than it needs to be for authors—i.e., it violates the easy things should be easy principle. Passage names and tags will be oft used by authors, so should be easy. Other passage header data will likely be rare, so may be more difficult—and largely ignored by the author; e.g., Twine 2 passage metadata.
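To make the first header form concrete, here is a minimal Python sketch of how a compiler might split such a header into its parts. The regular expression and the `parse_header` name are assumptions made for illustration, not an agreed grammar, and the metadata block is returned as raw text rather than decoded as TOML.

```python
import re

# Hypothetical grammar: ":: name [tags] {inline table}".
# The tags block and the metadata block are both optional.
HEADER_RE = re.compile(
    r"^::\s*(?P<name>.*?)\s*"        # passage name (shortest possible match)
    r"(?:\[(?P<tags>[^\]]*)\]\s*)?"  # optional [space separated tags]
    r"(?P<meta>\{.*\})?\s*$"         # optional {metadata}, kept as raw text
)

def parse_header(line):
    """Return a dict with name, tags, and raw metadata, or None if not a header."""
    match = HEADER_RE.match(line)
    if match is None:
        return None
    tags = (match.group("tags") or "").split()
    return {"name": match.group("name"), "tags": tags, "meta": match.group("meta")}

print(parse_header(":: Do the thing! [some tags] {x=250, y=400}"))
```

Keeping the tags outside the table means an author who only wants a name and tags never has to touch the metadata syntax, which is the easy-things-should-be-easy point made above.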

File Comments

Defining file comments within the format should be straightforward. We simply reserve the area before the first passage in TweeNG files as either:

An author-use/comment chunk that is unprocessed by compilers.
A structured data chunk that is processed by compilers, which could include whatever comment styles are supported by the structured data format. Even if we did nothing with the area now, by forcing the structured data comment style we'd be future proofing it—though, again, I'm unsure what purpose a file data header would serve.

Personally, I think the first option is probably what we want—maybe I'm just lacking imagination here though.

Passage Comments

Defining passage comments within the format would be a huge pain. Outside of the Twee2 compiler, passage data is not, and has never been, processed by compilers, Twine or Twee, and I'm unsure that I'd like to change that—by "processed" I mean that compilers, aside from Twee2, treat passages' content chunks largely as black boxes.

Also. @mcdemarco is entirely correct about the trouble in finding markup we could use. We cannot use anything which would be valid: story format markup, CSS markup, JavaScript code. Complicating this is the fact that both existing and new story formats could introduce markup after the fact that any TweeNG passage comments could break—because compilers would eat them as TweeNG comments when they're actually story format markup; oops. Yes, the issue already exists with the passage header's opening token (::), but the point is that we probably don't want to add to the pile.

As to the triple-colon, I can't say that I'm a fan of extending the passage header token (::)—I think it would be too easy to screw that up. Comments in that form would also be valid passage headers in Twee-v1, which we probably shouldn't break if we're going to keep TweeNG largely similar.

Rather than attempting to add them to the TweeNG format, it might be better to simply get all story formats to agree on a common comment style—since the passage content chunk is story format specific anyway.

I know I'm being all negative Nancy here and I get why TweeNG passage comments would be nice. They're just a hard problem.

mcdemarco commented 5 years ago

@tmedwards The inline TOML for passage metadata sounds good and extensible (specifically, the option where tags are where they are in Twee now, not in the TOML). I was going by @videlais' example and my own impression that a new approach to story metadata was desired; I didn't realize you were suggesting it for passage metadata in particular.

I'm not sure I see a significant difference between putting newly-specced TOML story metadata into a new TOML header for the Twee file and putting the same TOML into the old StorySettings or another passage, except that special passage names seem like a weakness of the existing format and seem to have proliferated a lot more and more confusingly than the special passage tags---not that I'm a big fan of those, either.

One problem with using StorySettings rather than a new header is that story formats may already use StorySettings in their own ways and get confused, so I'd (sadly) recommend a whole new passage name for the purpose such as TweeMetadata or StoryMetadata (if the latter is actually unused).

I'm fine with leaving comments in the realm of the story format.

tmedwards commented 5 years ago

Special Names

@mcdemarco As far as special names are concerned, that ship's pretty well sailed—especially if we want this mostly similar to Twee-v1. They're the mechanism Twee has to differentiate its chunks, which are passages. I can't really see TweeNG being different in that regard since we've decided (?) to keep it similar, because altering that would be a pretty big change.

Besides, only three matter to TweeNG compilers: the StorySettings passage, script tag, and stylesheet tag—and the latter two only to know where to stick those chunks in story formats' data chunks. The proliferation you're referring to comes almost entirely from the story format side and, again, that's because special names are the core mechanism they have to differentiate code. They could do it in other ways, sure, but I can't think of any that would be as easy. Frankly, this seems like a RTFM kind of thing.

`StorySettings`

@mcdemarco You make a good point about story formats which use the StorySettings passage being confused upon receiving TOML, or any other structured data really, rather than their traditional key:value string pairs. That is definitely a concern.

That said, the only story formats I know of that use StorySettings are the Twine v1.4 vanilla story formats. Make of that what you will.

As I see it we have three options here (I'm assuming TOML):

Use a new TOML chunk. I'm going to say a new special passage, because we do not want users to be able to specify multiple copies of this thing that will either have to compete or be merged—either will lead to odd behavior and confusion when authors forget the one in that file they never open.
Use a TOML StorySettings and translate to the old key:value pair style for story data chunks. For most data (the kind currently used by the Twine v1.4 vanilla formats) this should be straightforward, and in the cases where it's not we could skip translation of the value.
Use a TOML StorySettings and do not worry about the Twine v1.4 vanilla story formats—and the two users who use them via Twine v1.4.

Each has its pros/cons. I'm going to assume no one's going to opt for option three. I'm not a fan of option two as it's fiddly and hampers the use of TOML. So, option one then? Something else I'm not thinking of?
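For a sense of how fiddly option two might be, here is a rough Python sketch of the translation step. It assumes the TOML has already been decoded into a dictionary; the `to_legacy_settings` name is hypothetical, and writing booleans as on/off is an assumption about the old key:value style.

```python
def to_legacy_settings(settings):
    """Split decoded settings into legacy key:value lines and skipped keys."""
    lines, skipped = [], []
    for key, value in settings.items():
        if isinstance(value, bool):
            # Assumption: booleans are written as on/off in the old style.
            lines.append(f"{key}:{'on' if value else 'off'}")
        elif isinstance(value, (str, int, float)):
            lines.append(f"{key}:{value}")
        else:
            # Tables and arrays have no obvious key:value form, so skip them.
            skipped.append(key)
    return lines, skipped

lines, skipped = to_legacy_settings({"undo": True, "ifid": "ABC", "tag-colors": {"a": "red"}})
print(lines, skipped)
```

Anything structured ends up in the skipped list, which is the "hampers the use of TOML" problem in miniature.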

mcdemarco commented 5 years ago

@tmedwards Doesn't TweeGo look for ifid in StorySettings? Twee2 looks for a tag to decide where the twee2 settings are (including ifid, though it's named something else), and the most likely passage someone will have put that tag on is StorySettings. DotGraph uses it, too, also in key: value format.

videlais commented 5 years ago

Catching up on things...

TOML

@mcdemarco: I definitely wasn't voicing strongly to jump to a total TOML solution. Sorry for the confusion there. I wanted to play with the format some, and I mocked-up an example to see what a total replacement might look like. It didn't take me too long to build it and test it with a standard parser.

You are absolutely right that it's far less readable as a general format. For encoding key-value pairs, it seems good, but its readability isn't great.

Comments

@greyelf: Comments in Twee. It'd be great if we could include comments in the Twee file itself outside of any particular story format support. I currently use HTML comments for that general purpose, as they will never be rendered by the browser, but I completely understand that's more of an edge case where I make examples with comments.

@tmedwards wrote

Rather than attempting to add them to the TweeNG format, it might be better to simply get all story formats to agree on a common comment style, since the passage content chunk is story format specific anyway.

Yes, I totally agree. As someone who has been making examples for eight years now, that would be fantastic. But, that written, I'm not going to fight too strongly for it. If we can solve these other problems, that would be Good Enough.

StorySettings

Right, tmedwards, that's my understanding as well. Twine 1.4 uses it the most and, it looks like, TweeGo supports the ifid setting. Twee2 doesn't use it.

I think a new passage name is probably the best idea, too. Encoding in TOML seems like a good idea for that.

Passage Metadata

/* <PASSAGE_NAME> <OPTIONAL_TAGS> <OPTIONAL_METADATA_AS_TOML_INLINE_TABLE> */
:: Do the thing! [some tags] {x=250, y=400}

I like this approach the best for the same reasons outlined. People working with the command-line tools would use the passage name and tags. If they wanted the extra metadata, they have the space. And, of course, as stated, we are mostly talking about Twine 2 metadata anyway. No other tool uses this space, but we are opening it up enough for others to do any number of things. They'd just need to test for that data and use it if available.

Example

If I am following everything here, does this example look right? (Example metadata based on current Harlowe 1.2.4 tw-storydata element attributes as created by Twine 2.2.1.)

:: Start [] {x=100, y=100, size=[100, 100], pid=1}
This is the Start passage!
[[Another Passage]]

:: Another Passage [random tags] {x=200, y=100, size=[100, 100], pid=2}
Some content here, probably, I guess.

:: StoryMetadata
ifid = "2B68ECD6-348F-4CF5-96F8-549A512A8128"
name = "The Best Twine Example Evar"
format = "Harlowe"
format-version = "1.2.4"
startnode = "1"
creator = "Twine"
creator-version = "2.2.1"
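As a sanity check on the layout above, here is a small Python sketch that splits a file like this into passages and reads the story metadata pairs. It only understands lines of the form key = "value", so it is a stand-in for a real TOML parser, and the function names are made up for illustration.

```python
import re

# Matches only simple string pairs such as: format = "Harlowe"
PAIR_RE = re.compile(r'^([A-Za-z0-9_-]+)\s*=\s*"(.*)"\s*$')

def split_passages(source):
    """Map each raw header (minus the leading '::') to its content text."""
    passages = {}
    header = None
    for line in source.splitlines():
        if line.startswith("::"):
            header = line[2:].strip()
            passages[header] = []
        elif header is not None:
            passages[header].append(line)
    return {h: "\n".join(body).strip() for h, body in passages.items()}

def read_pairs(text):
    """Collect key = "value" lines; anything else is ignored in this sketch."""
    pairs = {}
    for line in text.splitlines():
        match = PAIR_RE.match(line.strip())
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs

source = """:: Start [] {x=100, y=100}
This is the Start passage!

:: StoryMetadata
ifid = "2B68ECD6-348F-4CF5-96F8-549A512A8128"
format = "Harlowe"
"""
print(read_pairs(split_passages(source)["StoryMetadata"]))
```

The passage content stays a black box here; only the metadata passage is looked inside.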

tmedwards commented 5 years ago

Correction

I previously said that only three special names would matter to TweeNG compilers: the StorySettings passage, script tag, and stylesheet tag. That's obviously not necessarily true.

Currently, the Start and StoryTitle special passages are also used by and very important to Twee compilers, and Twine 1 style story formats. I'm not sure we should break that for TweeNG, since forcing authors writing it from scratch, rather than decompiled from a Twine 2 file, to create whatever the story data passage will be named and add the necessary TOML pairs to it violates the easy things should be easy principle, IMO. Though, maybe doing something like what Tweego does for the IFID could work, which is a message telling the author where/how to add the IFID it just generated for them.

Story (Meta)Data

I didn't recall DotGraph using StorySettings, so there's that too.

That compilers currently use StorySettings with Twee-v1 files is irrelevant as long as TweeNG is supposed to be distinct from them—i.e., TweeNG files will be processed separately from Twee-v1 files. If we're rethinking that decision, then yes, that would be another issue on the pile of reasons to use a new name.

Personally, if we decide upon using a new name—and it looks like we are—then I'd prefer StoryData, to mirror Twine 2's name for the thing—ProjectData might be more on point, but lacks the usual Story in the name. I'm not married to any name in particular though.

@videlais Example

I think an actual example of passage headers would look more like the following:

:: Start {pid="1", position="100,100", size="100,100"}
:: Another Passage [tag1 tag2 tagN] {pid="2", position="200,100", size="100,100"}

I think we can make the tags inline block completely optional, even when specifying a metadata inline block.

I don't see a compelling reason to alter the key names, or the value types, from Twine 2 since we're, basically, just pulling its data along here. We could do so, I suppose, if we thought that was more user-friendly. That said, I don't know that we need to bother with pid—or I should say that Tweego probably wouldn't bother with it.

Your story data section looks fine, if a bit bloated. By that I mean, as one example, the creator and creator-version pairs are entirely superfluous since that's compile time information, rather than something which has meaning in the decompiled form. As another example, I think you can probably already guess that I have issues with Twine 2's startnode—and passage pid—inanity. Essentially, there's a question of how slavish we need and/or want to be here.

As a separate issue, which may be specific to Tweego or just me being the old man telling you damn kids to get off his lawn, I'm also conflicted about the format and format-version pairs. I can see the appeal of having that information within the story data, but doing so kind of impinges upon the command line interface, which leaves me ruffled.

A postscript about Tweego and `StorySettings`

Current versions of Tweego use StorySettings because:

It was an existing project settings special passage that was used to hold both general and compiler settings for the project, so I didn't have to either extend the Twee syntax or add yet-another-special-name that, hopefully, only made sense to Tweego—i.e., the route Twee2 took with its twee2 special tag.
Tweego could safely add new key/value pairs to it without breaking backward compatibility with Twine/Twee v1.4, which would ignore, yet retain, them—i.e., neither Twine/Twee v1.4 nor their vanilla story formats would be broken or confused by the new pairs and they'd survive a round trip through those compilers.

Tweego's not married to the passage or anything.

mcdemarco commented 5 years ago

@tmedwards My understanding of the backwards compatibility expectations for the new Twee is that new Twee readers will still read an old-style Twee file. That includes relatively recent Twee files for Twine2 story formats that may include an ifid inside the StorySettings in the key:value format. Since TweeGo won't do anything useful without it, presumably these things are all over existing Twee files. So requiring only TOML in StorySettings is a non-starter. And having to parse StorySettings for TOML vs. non-TOML is not a road that seems worth going down considering the other options.

Personally, I think it's more convenient for the twee user to put together one StoryData passage (even if it's in TOML) that will more-or-less line up to what the Twine2 GUI puts into non-passage-named spots, than to muck around with the old passage names. But for backwards compatibility and differences of taste, maybe we should officially support both methods of doing many of these metadata things (rather than having two somewhat separate paths of old Twee vs. new Twee processing).

On a different topic, Twee2 supports putting the story format into a Twee settings passage. The main function of this in my experience is to bite you in the behind when you forget it's there, but I can see the point of preserving the information when converting from an html story file to a Twee file.

Including passage IDs is a bigger question; I don't really understand passage ID handling in the GUI and how exactly they might get changed by a trip through it. If the Twine2 GUI will essentially randomize them, we don't want to present them as a sort of stable data that the user might be tempted to edit manually or use in their code. Nor do we want a twee compiler (programmer) to have to deal with making possibly edited or reordered passage ids work in Twine2 if that's a difficult or error-prone problem to solve.

Something related to startnode is worth keeping as an alternative method of identifying the start passage, but not the startnode pid itself (even if we allow passage IDs), because there will be no pid's in de novo twee files.

greyelf commented 5 years ago

@mcdemarco

I don't really understand passage ID handling in the GUI

Simply put PIDs are a sequential series of integers (1-based and with no gaps) used to track the order that Passages were added to the project.

and how exactly they might get changed by a trip through it.

If you delete a passage that is earlier in the PID sequence (not the highest PID) then the PIDs of all passages that are later in the sequence will change (decremented by one) so that the PID sequence remains sequential (with no gaps). This can cause the startnode to become invalid if it referenced a PID that was later in the sequence than the one that was deleted, thus requiring the startnode to also be updated.
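The renumbering described above can be sketched in a few lines of Python. This models only the behaviour as described in this comment (a list of passage names in PID order, with PID equal to position plus one); it is not taken from Twine's source, and `delete_passage` is a hypothetical name.

```python
def delete_passage(passages, startnode, name):
    """Remove a passage and return (remaining passages, updated startnode).

    passages is a list of names in PID order, so PID = index + 1.
    """
    pid = passages.index(name) + 1
    remaining = passages[:pid - 1] + passages[pid:]
    if startnode == pid:
        new_start = None           # the start passage itself was deleted
    elif startnode > pid:
        new_start = startnode - 1  # later PIDs shift down to close the gap
    else:
        new_start = startnode      # earlier PIDs are unaffected
    return remaining, new_start

print(delete_passage(["Intro", "Start", "End"], 2, "Intro"))
```

Deleting the first passage moves the start passage from PID 2 to PID 1, so a PID written into a Twee file before the deletion would now point at the wrong passage.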

It is the need for PIDs to remain sequential with no gaps that is the main issue with tracking them in TWEE.

mcdemarco commented 5 years ago

@greyelf Yes, the pid's don't sound workable for inclusion in the Twee file.

klembot commented 5 years ago

Some additional info in case it comes up later...

PIDs are assigned at publish time, not while a story is being edited. Internally they get GUIDs in the editor but those are never part of a story file. The GUIDs are stable in the web version but not the desktop app (where it builds an internal DB every time it starts up).

My personal feeling is that authors should never have to think about PIDs or GUIDs.

tmedwards commented 5 years ago

@mcdemarco

My understanding of the backwards compatibility expectations for the new Twee is that new Twee readers will still read an old-style Twee file. That includes relatively recent Twee files for Twine2 story formats that may include an ifid inside the StorySettings in the key:value format. Since TweeGo won't do anything useful without it, presumably these things are all over existing Twee files. So requiring only TOML in StorySettings is a non-starter. And having to parse StorySettings for TOML vs. non-TOML is not a road that seems worth going down considering the other options.

There would be no backwards compatibility expectations for TweeNG. That was the whole point of drafting a new format in the first place.

Keeping the two formats broadly similar, because Twee-v1 is pretty good at its job, doesn't mean that TweeNG is simply an extension of Twee-v1. If they are distinct formats, then they MUST be differentiable; i.e., compilers should be able to tell Twee-v1 files from TweeNG files by whatever mechanism. If compilers can do that, then your talking point here becomes irrelevant. If they cannot, then we have a serious problem.

This seems largely pointless anyway, since I'd already agreed that, due to various issues, it probably makes more sense to use another passage for the story metadata. From my previous reply:

Personally, if we decide upon using a new name—and it looks like we are—then I'd prefer StoryData, to mirror Twine 2's name for the thing—ProjectData might be more on point, but lacks the usual Story in the name. I'm not married to any name in particular though.

I'm unsure why you're dredging up some of the possibilities I outlined a few replies ago as our most likely options, when I capitulated on the subject in my last one—especially when one of the options (the first) was to use a separate passage anyway.

[…] But for backwards compatibility and differences of taste, maybe we should officially support both methods of doing many of these metadata things (rather than having two somewhat separate paths of old Twee vs. new Twee processing).

Wait. You want a single extended format? Because that's what you're talking about there. If that's the case, then why are we talking about TweeNG at all? I thought something had finally been decided.

Our options, as I see them, are the same as they've ever been:

Extend Twee-v1 in backwards compatible ways. Let's call it Twee-v1.5 or Twee-v2, you get the idea.
Extend Twee-v1 in backwards incompatible ways and not worry about the fallout with older producers or consumers.
Create a new format distinct from, yet similar to, Twee-v1 by extending it in backwards incompatible ways. Let's call it TweeNG.
Create an entirely new format.

When I first created this issue, I was (and in some ways still am) in favor of option 1. The result of that was over a year's worth of faffing about with very little to show for it. I still think that it's the easiest thing we could do, but getting any kind of consensus on how to extend the format (really just the passage headers) was a Sisyphean task.

I don't think anyone has ever seriously considered option 2—it's mostly there for the sake of completeness.

Somewhat less than three weeks ago, it seemed like we came to a rough consensus that something like option 3 was the best course of action—see my first post on that path from ~3 days ago. We have, I thought, been attempting to roll that particular boulder since.

I don't think option 4 has received much consideration—it's also mostly there for the sake of completeness.

So, where the hell actually are we in this process? Because I do not, apparently, know any more—if I ever did.

PS: I don't want to be the bad guy here, but I am a hairsbreadth away from pulling a Dan-Q and just doing whatever the hell I want with Tweego. I probably should have simply accepted the old maxim about design-by-committee and done that in the first place. My attempt to be inclusive here and not run roughshod over parties with a vested interest in Twee has left me feeling foolish—I do not enjoy feeling that way.

greyelf commented 5 years ago

@klembot

PIDs are assigned at publish time, not while a story is being edited...

I'm unclear what you mean by "at publish time" because if I create a new project and add some passages to it then close that project (without using the Test or Play or Publish... options) the tw-passagedata elements within the saved project HTML file will have PIDs assigned to them.

tmedwards commented 5 years ago

@greyelf Chris can answer for himself and I don't want to second guess him, however, I believe by "publish time" what he really meant was any time the internal story representation is marshaled into an HTML chunk.

For the browser-based version that's on: Play/Test, Publish, and Archive.
For the app-based version that's on: Play/Test, Publish, Archive, and the individual project files.

klembot commented 5 years ago

Correct. The key thing here is that the Twine editor doesn't make any effort to maintain consistency of PIDs between publishes... as far as the editor is concerned, they are there to signal what the start passage is, and that's it.
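Put another way, if PIDs are only assigned when the story is marshaled, a Twee file only needs to carry the start passage's name, and the startnode value can be derived at that point. A minimal Python sketch of that idea follows; `marshal_pids` is a hypothetical name and this is not the editor's actual code.

```python
def marshal_pids(passage_names, start_name):
    """Assign 1-based PIDs in list order and return (pids, startnode)."""
    pids = {name: pid for pid, name in enumerate(passage_names, start=1)}
    return pids, pids[start_name]

# The same story marshaled in two different orders gets different PIDs,
# but the start passage is still found by name each time.
print(marshal_pids(["Another Passage", "Start"], "Start"))
print(marshal_pids(["Start", "Another Passage"], "Start"))
```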

videlais commented 5 years ago

Catching up on things after being sick and traveling...

PIDs

I'm glad PIDs were covered. I wasn't entirely sure how it all worked internally myself, and it's good we have this here. Thank you, @greyelf and @klembot for clearing that up for at least me.

Path Forward

@tmedwards wrote

Create a new format distinct from, yet similar to, Twee-v1 by extending it in backwards incompatible ways. Let's call it TweeNG.

This is what I'm interested in trying to hash out and where I thought we were as well. My feeling, at least as of the last time I was able to track everything here, was that we had agreed on using a StoryData passage to cover metadata stuff (see StoryData section below).

Passage metadata would also be encoded using an inline TOML-like format that covered possible Twine 2 metadata as well as the ability to add more in the future if other tools wanted.

To sort-of answer @mcdemarco: We want the readability of Twee but were not planning for Twine 1.4 compatibility. We know this will break tools which depend on this format, but we are doing the breaking from the past with the hope of greater future interoperability between tools in the Twine ecosystem.

StoryData

I agree on losing creator and creator-version from possible values.

I'd like to keep format and format-version for readability purposes. A personal concern I have is the ability to read the format as a plain-text (-ish) version of a Twine project. Right now, that's how we are using it in the Twine Cookbook, and I think having the format and its version would really help there in quickly identifying the exact version of a particular story format (since there are often multiple official versions supported in just Twine 2 itself, for example.)

StorySettings

If I'm tracking here, TweeGo was using StorySettings because Twine 1.4 did. If we are not planning for backward compatibility, we can lose StorySettings since TweeGo "isn't married to it." ("Lose it," that is, as in allow people to use it, but not make it a required passage.)

`Start` and `Startnode`

@tmedwards has voiced for keeping Start in the specification as the official, well, start passage. There is also considerable history (coming up on at least nine years!) of people using Twee this way due to Twine 1.X output.

Twine 2 supports 'Startnode', which could be an optional metadata entry to make a passage the start passage.

Is there a reconciliation or compromise that makes sense here? Maybe a metadata entry that marked a passage or held an alternative name?
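One possible shape for such a compromise, sketched in Python: an optional entry in the story metadata names the start passage, and a passage called Start is used when that entry is absent. The `start` key name and the `resolve_start` function are assumptions for the sake of the example, not something decided here.

```python
def resolve_start(story_data, passage_names, default="Start"):
    """Pick the start passage: the metadata entry if present, else the default."""
    name = story_data.get("start", default)
    if name not in passage_names:
        raise ValueError(f"start passage {name!r} not found")
    return name

names = ["Start", "Another Passage"]
print(resolve_start({}, names))                            # falls back to Start
print(resolve_start({"start": "Another Passage"}, names))  # metadata wins
```

Authors writing Twee by hand could keep using Start and never add the entry, while files decompiled from Twine 2 could record whichever passage the startnode pointed to.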

Including external files? (was `StoryIncludes` and/or `@include`)

Months ago, @tmedwards reminded us that Twee 1.4 and Twine 1.4 were using StoryIncludes because their code was mostly the same but that it remained "hacky." @greyelf rightly pointed out that using @includes opens up all manner of problems we want to avoid.

As part of this TweeNG conversation, do we want to write into the specification a way to include external files? I like the idea of it, but I can just as easily see us leaving that up to tools to figure out during the compilation process (i.e. TweeGo includes files in the same directory, Twine 2 might one day include packages or something, maybe).

mcdemarco commented 5 years ago

@tmedwards I'm not sure what the decision process is here, nor what decisions have actually been made. I think much of the problem with settling on a format is not having kept track anywhere of any decisions made or points of agreement in general. In this particular case, you seemed to be against TweeNG when you named it that, and insofar as @videlais was in favor of it that seemed to spring from a misunderstanding of what could or couldn't be preserved from the original Twee format. So I suggested moving discussion of an entirely new format to an entirely new issue. No one did so, and so it remained unclear how "new" a format we were or needed to be talking about.

I think this thread is getting hard to follow and a more structured document would be a better approach to identifying what does and doesn't remain to be resolved.

videlais commented 5 years ago

I agree that this thread has gotten confusing, @mcdemarco. To combat that issue, I've tried to provide summaries of what I've seen and how progress is being made. Here is, to my knowledge, where we stand (although I may be wrong):

Resolved:

Create a new but very similar format to existing Twee. The creation of this format MUST do the following:

Retain at least the readability of base Twee
Encode passage metadata
Encode story metadata

Mostly Resolved:

Breaking from Twine 1.4.
- Passage metadata could be encoded using in-line TOML-like formatting.

:: Start {pid="1", position="100,100", size="100,100"}

Story metadata could be encoded using TOML-like formatting inside a passage labeled StoryData which would contain the IFID, title, and possibly more information.

Still Under Discussion

The difference between Start and 'Startnode'. Twine 2 is not going back to Start; that ship has sailed. Assuming we give up on that for this new version of Twee, the 'startnode' (or similar idea) key-value pair could be encoded in the proposed StoryData passage.

As for the decision process, well, it's kinda design-by-committee with an everyone-loses-something compromise system. It can be, at times, a "who is willing to keep kicking this can down the street" process, too. tmedwards opened this thread in October 2017. It's now March 2019.

Twee is part of the Twine ecosystem. I'd personally like it to be part of Twine 2 in the future. People are already using Twee in Twine 2; however, for it to officially be part of the tool, there needs to be a clear path of both exporting (which exists now) and importing (which doesn't). Much of my personal involvement in the last seven months has been in figuring out what a Twee-like format would look like that supported Twine 2-metadata and allowed clear interoperability between visual and command-line tools.

We are much closer to a nobody-is-completely-happy compromise than I have ever seen us at any other point. It's worth trying to figure out the known unresolved parts and then talking through any others. If we want to move to another thread, let's do that. If a specification document would help, I can post what I have written up at some point. This has not been an easy process, and I've also thought more than once about giving it up completely. But we have gotten closer -- and I think we can pull this off.

tmedwards commented 5 years ago

I've added task lists to the first post, hopefully that will help keep everyone on the same page with where we're at. If you see something I've missed, poke me to add it.

I've also added some example representations of what TweeNG files with JSON and TOML encoded metadata could look like (see: json_encoded_metadata.twee & toml_encoded_metadata.twee). They were both generated by decompiling a Twine 2 generated project with Tweego (unreleased and uncommitted code).

Some thoughts since I last posted:

Decide on a name for the story metadata passage.

I'm not in love with StoryData as the name of the story metadata passage. It makes it sound like it's far more interesting than it is. I wasn't in favor of StoryMetadata, but maybe it's a better fit. Other ideas?

Decide on a metadata encoding format.

TOML? After working with it for a while, I've become convinced that TOML may not be the right choice. It does have some strong points: expressive enough for our needs, relatively easy to edit by hand, relatively small. It also has some downsides: tables could be confusing to end-users in various ways, inline tables (for passage metadata) need massaging to make work, not yet stable (v0.5 is the latest). Additionally, most tooling I've seen seems mostly geared around decoding it, so we may not get the best control over the encoding representation—since this is mostly going to be for things the compilers are generating, that could be a problem. Finally, if we ever hope that one day Twine 2 will be able to import TweeNG files, it would be another dependency for Twine 2 to take on.

JSON? Good points: expressive enough for our needs, relatively easy to edit by hand if you indent it out (though still less so than something like TOML), relatively small, stable. Downsides: syntax is less forgiving than we might like. Representations should be fairly controllable—mostly thinking of indenting it out for the story metadata. It's already available to Twine 2, so there's no additional dependency there.

~~YAML? Ha ha, just kidding. No.~~

Decide which attributes are part of the (current) story metadata.

Twine 2 bits and bobs, seems pretty straightforward:

IFID: ✔️ n.b., Treaty of Babel, though Twine 2 story data chunks do not include the usual comment syntax.
Options: ✔️ n.b., Only used currently to signal Test Mode (via debug). @klembot I've assumed that the options content attribute is, in essence, a space delimited list of boolean flags?
Tag colors: ✔️
Zoom level: ❓ Why the hell is an editor setting even part of project files?

Decide which attributes are part of the (current) passage metadata.

PID: ❌
Position: ✔️
Size: ✔️

Decide what happens on a metadata decoding error.

🤷‍♂️ My gut says fatal error, but dropping the metadata and attempting to continue would probably not be the end of the world.

Aside (Tweego): Depending on what's lost, Tweego could still refuse to compile the project. When compiling to a Twine 2 style story format, an IFID is a hard requirement.

Decide on a starting passage mechanism.

Passage named "Start": ✔️ Easy, what existing Twee users are used to.
Attribute in story metadata: ❌ I'd really rather end-users not be required to either create new, when starting from scratch, or edit existing story metadata—especially the former.

videlais commented 5 years ago

StoryData

I don't love StoryData either, but it matches what Snowman and Harlowe create: <tw-storydata>. If we are thinking that attributes of that element are what goes into StoryData, the names would match up. (I know SugarCube doesn't use <tw-storydata>, but there is at least some precedent.)

JSON

I also think TOML has potential for things, but, yeah, JSON is probably the winner if only because it would not add any additional dependencies and makes more sense for the command-line crowd. If you are using a visual tool and exporting

Required versus optional metadata

Looking at the JSON example, I started thinking about required versus optional metadata.

What if we stated that the only required metadata attribute is the IFID? That is, a user can add more if they want, and a tool (like Twine 2) may even use its own, but the only required attribute is the IFID.

:: StoryTitle
JSON Encoded Metadata Example

:: StoryData
{
    "ifid": "D674C58C-DEFA-4F70-B7A2-27742230C0FC"
}

:: Start
The starting passage of this example.

If a tool encounters extra attributes, it ignores them. That way, an export from, say, Twine 2 might have any number of extra metadata it encoded, but when compiled by Tweego, as an example, it would encode a Harlowe or Snowman's values back into <tw-storydata> without doing anything to them -- or just ignore them if the story format wasn't looking for them.
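To make the "only the IFID is required, extras are ignored" idea concrete, here is a minimal sketch in Python. The function name `read_story_data` and the set of known keys are assumptions for illustration (the key names are borrowed from this thread), not part of any agreed specification or existing tool.

```python
import json

# Hypothetical list of attributes a tool understands (names from this thread).
KNOWN_STORY_KEYS = {"ifid", "format", "format-version", "start", "tag-colors", "zoom"}

def read_story_data(body):
    """Decode a StoryData passage body, keeping only known attributes."""
    data = json.loads(body)
    # The IFID is the one attribute that must be present.
    if not isinstance(data, dict) or not isinstance(data.get("ifid"), str):
        raise ValueError("StoryData must be a JSON object with a string 'ifid'")
    # Unknown attributes are silently ignored rather than treated as errors.
    return {key: value for key, value in data.items() if key in KNOWN_STORY_KEYS}

example = """{
    "ifid": "D674C58C-DEFA-4F70-B7A2-27742230C0FC",
    "my-tool-setting": true
}"""
story = read_story_data(example)
```

Here `my-tool-setting` is a made-up extra attribute; it is dropped while the `ifid` survives. A tool that wanted to pass extras through instead would simply skip the final filter.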

Starting Passage

We go with Start. The advice we give, should this go into a formal document, is that a tool should warn the user that other compilers may expect a passage named "Start", and that it is advisable to have one and make it the beginning passage. People can then ignore this warning if they want.

Passage Metadata

Similar to the story metadata, we state that the only requirement is the name of the passage. Tags and additional metadata (encoded in JSON) are optional.

Tags follow the current rules as space-delimited within square brackets. In a formal document, if there is one, we recommend that the tags 'script' and 'stylesheet' be used for JavaScript and CSS respectively, but that, like Twee2, a compiler can add and act on any tags any way it wants.

Passage metadata follows the tag section with the only strict rules of it being valid JSON. Tools are free to encode whatever they want here. Tools could, for example, choose to encode these values into their <tw-passage> elements as attributes or ignore them completely.
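A rough Python sketch of reading a header laid out this way (name, then optional bracketed tags, then optional inline JSON) might look like the following. The function name `parse_header` is hypothetical, and the sketch assumes passage names contain no `[` or `{` characters; escaping rules are not handled.

```python
import json

def parse_header(line):
    """Split ':: Name [tags] {json}' into (name, tags, metadata)."""
    if not line.startswith("::"):
        raise ValueError("a passage header must start with '::'")
    rest = line[2:].strip()
    tags, metadata = [], {}
    # Optional inline JSON metadata comes last on the line.
    if rest.endswith("}") and "{" in rest:
        brace = rest.index("{")
        try:
            decoded = json.loads(rest[brace:])
            metadata = decoded if isinstance(decoded, dict) else {}
        except json.JSONDecodeError:
            metadata = {}  # invalid metadata is dropped, not fatal
        rest = rest[:brace].strip()
    # Optional space-delimited tags sit in square brackets before the metadata.
    if rest.endswith("]") and "[" in rest:
        bracket = rest.rindex("[")
        tags = rest[bracket + 1:-1].split()
        rest = rest[:bracket].strip()
    return rest, tags, metadata

header = parse_header(':: Start [intro forest] {"position":"600,400","size":"100,200"}')
```

Only the passage name is mandatory here, which matches the proposal above: a bare `:: Lonely` line parses to a name with no tags and empty metadata.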

Non-interference Rule

I think I agree with the stop-and-drop rule when it comes to metadata errors. If a tool encounters an error, it stops, sends a warning (to the user or system), and drops the rest of the metadata. It could then continue to try to parse the rest of the file.
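As a sketch of what stop-and-drop could look like in practice, assuming JSON metadata and using Python's standard `warnings` module as a stand-in for whatever reporting a real tool would use (the name `decode_metadata` is made up for this example):

```python
import json
import warnings

def decode_metadata(raw, where):
    """Return decoded metadata, or {} with a warning if decoding fails."""
    try:
        value = json.loads(raw)
    except json.JSONDecodeError as err:
        # Non-fatal: warn, discard the metadata, let the caller carry on.
        warnings.warn(f"{where}: discarding invalid metadata ({err.msg})")
        return {}
    if not isinstance(value, dict):
        warnings.warn(f"{where}: metadata is not a JSON object, discarding it")
        return {}
    return value

good = decode_metadata('{"position": "600,400"}', "passage Start")
bad = decode_metadata('{"position": "600,400"', "passage Start")
```

The caller gets an empty dictionary for the malformed case and can keep parsing the rest of the file. Whether a compiler then refuses to build (for example because the IFID was lost) is a separate decision.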

Something I'd like to add is the non-interference rule. Basically, users and tools are free to add to story or passage metadata. As long as it is valid JSON, the compiler passes it along into the final product (as long as the story format supports it). If, as a user, I want to add hundreds of entries in my story metadata, I'm free to do so. The compiler might register a warning, maybe, but it should pass it along unchanged.

Is this too out-there? Should we be strict about defining metadata?

tmedwards commented 5 years ago

I'm pretty sure I'll get rambly here at some point, so my apologies in advance.

Also, about the examples. In an attempt to get them exported, I forgot to add various fields that we obviously should be carrying over—e.g., the story format. Mea culpa.

StoryData

@videlais I'm unsure where you got either of the ideas that Snowman and Harlowe create the <tw-storydata> chunk or that SugarCube doesn't use it, but both are incorrect. Compilers of Twine 2 style story formats create the entire story data chunk and inject it into story formats; the latter are entirely passive participants in the process. Thus, all compiled HTML files built from Twine 2 style story formats will contain and use the <tw-storydata> chunk, even SugarCube projects.

Maybe you're thinking of Twine 1 story formats, which do contain the outer wrapper of the Twine 1 style story data chunk, so that only the passage contents themselves (e.g., <div tiddler>) are created and injected into the story format.

Back on target. I'm still not loving StoryData, even though I initially suggested it, but absent a better suggestion, and since you've apparently settled on it, let's call it tentatively agreed?

JSON

👍 Agreed.

Required versus optional metadata

What if we stated that the only required metadata attribute is the IFID?

👍 Agreed. I'm okay with only requiring the IFID, as that mirrors Tweego's existing behavior for Twine 2 style story formats.

That is, a user can add more if they want, and a tool (like Twine 2) may even use its own, but the only required attribute is the IFID.

There are a few issues with that:

Neither Twine 2 nor story formats will do anything with the, to them, unknown data, nor will they expose it to users. For story metadata within story formats, users could write custom code to dig out the extra data, but that's more work than simply adding it to JavaScript or a regular passage. More or less ditto for passage metadata.
I don't think we should be pushing new/unknown attributes into Twine 2's story data chunk. (more on that below)
That could be an issue for some strongly typed languages, where your en-/de-coder often takes a struct of some kind with definite fields; unknown fields would be ignored. I'm pretty sure I could make it work in Tweego, though it would probably be ugly.
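The struct-with-definite-fields point can be illustrated even in Python, using a dataclass as a stand-in for a typed struct. This is only a sketch: `PassageMetadata` and `decode_passage_metadata` are hypothetical names, and the two fields are the passage attributes discussed in this thread.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class PassageMetadata:
    # Only the fields the decoder knows about exist on the struct.
    position: Optional[str] = None
    size: Optional[str] = None

def decode_passage_metadata(raw):
    """Decode inline JSON into the fixed struct, dropping unknown fields."""
    data = json.loads(raw)
    known = {f.name for f in fields(PassageMetadata)}
    return PassageMetadata(**{k: v for k, v in data.items() if k in known})

meta = decode_passage_metadata('{"position": "600,400", "mood": "tense"}')
```

The made-up `mood` attribute has nowhere to live on the struct, so it is lost on decode. Carrying it through to the output would need an extra catch-all container, which is the "ugly" part.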

Something related that we haven't really discussed yet is consistent representations. We need to have a consistent representation for both story and passage metadata, otherwise one TweeNG compiler won't understand files created by, or meant to work with, another.

Let's use zoom level as an example since it's simple. Tweego currently parses the zoom content attribute—zoom="0.25"—and converts the value into a float, to match its actual representation in Twine 2, so its JSON representation will be as shown in the example file—"zoom": 0.25. Even though it means nothing to Tweego, it's still parsed into and encoded as an actual number because that's what it's supposed to be. If another compiler chooses not to parse the content attribute at all and simply re-encodes it as a string—e.g., "zoom": "0.25"—which is how it's represented as a content attribute, then the two compilers will produce incompatible metadata representations and be unable to communicate.
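The mismatch is easy to reproduce. In this small Python sketch (the helper name `encode_zoom` is hypothetical), one encoder converts the content attribute to a number while the other re-encodes the attribute string untouched:

```python
import json

def encode_zoom(attr_value):
    """Convert the zoom content attribute (a string) to a JSON number."""
    return json.dumps({"zoom": float(attr_value)})

as_number = encode_zoom("0.25")            # '{"zoom": 0.25}'
as_string = json.dumps({"zoom": "0.25"})   # '{"zoom": "0.25"}'

# A decoder expecting a number sees a different type in the second case.
number_type = type(json.loads(as_number)["zoom"])
string_type = type(json.loads(as_string)["zoom"])
```

Both documents are valid JSON, but a consumer that expects a float gets a `str` from the second one, which is why the types need to be pinned down in the specification.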

We must specify the representations—types mostly—of the metadata.

If a tool encounters extra attributes, it ignores them. […]

👍 Agreed.

[…] That way, an export from, say, Twine 2 might have any number of extra metadata it encoded, but when compiled by Tweego, as an example, it would encode a Harlowe or Snowman's values back into <tw-storydata> without doing anything to them -- or just ignore them if the story format wasn't looking for them.

Ignoring them would, I'd say should/must, be the normal response. The alternative is to require compilers to be constantly looking for unknown metadata just so it can be carried over. That does not strike me as something we want to do.

Starting Passage

👍 That's obviously what I favor, so agreed.

That said, and in all fairness, I did intend to add a starting passage field (e.g., "start":"My starting passage") to the examples as a what-if and simply forgot to code that into Tweego (v2.alpha) for export. The main point of the field would not be to allow end-users to specify the starting passage manually within StoryData, though it could be used for that, but to allow decompiled Twine 2 projects to be immediately recompiled, i.e., without requiring users either to rename their starting passage or to use whatever method their compiler offers to select the starting passage. It's not something I'd favor, but it's probably friendlier, so I figured that I'd mention it as something we should consider.
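One way the two mechanisms could coexist, sketched in Python under the assumption that an optional `start` field overrides the default passage name "Start" (the function name `pick_start` is made up for illustration):

```python
def pick_start(story_data, passage_names):
    """Choose the starting passage: the 'start' field if set, else 'Start'."""
    wanted = story_data.get("start", "Start")
    if wanted not in passage_names:
        raise LookupError(f"starting passage {wanted!r} not found")
    return wanted

# Hand-written project: no metadata needed, the passage is simply named Start.
by_default = pick_start({}, ["Start", "Cave"])

# Decompiled Twine 2 project: the field records the original starting passage.
by_field = pick_start({"start": "My starting passage"}, ["My starting passage", "Cave"])
```

With this arrangement, users starting from scratch never have to touch the story metadata, while a decompiled project recompiles without renaming anything.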

Passage Metadata

Similar to the story metadata, we state that the only requirement is the name of the passage. Tags and additional metadata (encoded in JSON) are optional.

👍 Agreed.

Tags follow the current rules as space-delimited within square brackets. In a formal document, if there is one, we recommend that the tags 'script' and 'stylesheet' be used for JavaScript and CSS respectively, but that, like Twee2, a compiler can add and act on any tags any way it wants.

The script and stylesheet tags should be a requirement, rather than a recommendation, as we need a consistent way to denote script and style sections as a basic/core feature. Compilers should be allowed to offer other ways to include JavaScript and CSS if they wish—obviously Tweego does—but we need a bare minimum that all compilers must support.

Also. AFAIK, the only scripting mechanism Twee2 offers is the script tag—it also offers the coffee tag, but that's used in conjunction with the script tag to signal that the script's source is CoffeeScript rather than JavaScript. Am I missing something?

Passage metadata follows the tag section with the only strict rules of it being valid JSON. Tools are free to encode whatever they want here. […]

I'm not sure how I feel about declaring a free-for-all in the metadata. By itself that doesn't seem like it would be useful.

[…] Tools could, for example, choose to encode these values into their <tw-passage> elements as attributes or ignore them completely.

I'm okay with the latter.

I'm fairly sure that I am not okay with the former. I don't think we should be pushing new/unknown attributes into Twine 2's story data chunk. Twine 2 has, historically, not handled disruptions to its data very well—though that mostly involved missing attributes. Beyond possibly giving Twine 2 fits, there's the matter of round tripping—i.e., if such a story is imported into Twine 2 and it does not keep and re-export/publish the attributes, then that puts us back in the same metadata-leaking boat we're in now, though from the opposite direction.

@klembot What do you think about this?

Non-interference Rule

I think I agree with the stop-and-drop rule when it comes to metadata errors. If a tool encounters an error, it stops, sends a warning (to the user or system), and drops the rest of the metadata. It could then continue to try to parse the rest of the file.

👍 Agreed.

Something I'd like to add is […] users and tools are free to add to story or passage metadata. […] Is this too out-there? Should we be strict about defining metadata?

As noted above, I don't think we should be pushing new/unknown attributes into Twine 2's data chunk, so I do think that we should be strict about defining what metadata will be carried over into the Twine 2 story data chunk. Basically, TweeNG metadata should mirror the Twine 2 metadata.

Perhaps instead of allowing unknown metadata, we could create a feature request on Twine 2's tracker to allow/include extra metadata itself? If eventually added to Twine 2, the new metadata would then be mirrored in an updated version of the TweeNG specification—I mean, we'd need to do that for any new Twine 2 metadata. I'm mostly thinking of passage metadata here—extra story metadata is, more or less, unnecessary as users may already add general data to their projects.

Devil's Advocate sez: Users can already use an object to map passage names and/or tags to whatever data they wish, so the capability to have extra passage data already exists—if not in a user-friendly way. The question becomes then, what would make this user-friendly? I don't think extra passage metadata by itself is the answer, as it's not strictly better than using a centralized mapping. We'd really need something in Twine 2's passage editor UI to back it up. That's not even considering passage tags, which are a lot more useful than most people realize—you can even encode extra data in them, though that falls back into not user-friendly territory.

EDIT: The StoryData passage. I believe it should be a requirement that the StoryData passage must not be output to Twine 2 data chunks—i.e., the data should exist as either a StoryData passage or as part of <tw-storydata>, not both.

videlais commented 5 years ago

`<tw-storydata>`: I was wrong

JSON 👍

Specifications

Here is where I think we are:

Story Metadata Attributes

Required:

ifid 👍

Optional:

start 👍
zoom ❓
tagColors ❓
format ❓
format-version ❓

Passage Metadata

Required

Passage Name 👍

Optional
Tags 👍
Metadata 👍

Tags

Required

script is used for story JavaScript. 👍
stylesheet is used for story CSS. 👍

Passage Metadata

Optional

Position 👍
Size 👍

tmedwards commented 5 years ago

Okay. I updated the first post.
