iftechfoundation / twine-specs

Specs related to Twine

[Twee] Extending the Twee format syntax. #1

Closed tmedwards closed 5 years ago

tmedwards commented 7 years ago

There's been talk of extending the Twee format to allow for additional properties to be recorded—notably the story map coordinates in both Twine 1 & 2—since at least as far back as when Twine 1.4 was still in development.

Let's discuss what extensions we'd like to see and, naturally, what syntax changes would be necessary to accommodate them.


NOTES: The above task list is not meant to be absolute. There are probably additional items that I'm not thinking of at the moment that should be on the list.

tmedwards commented 5 years ago

Okay. I updated the first post again. @greyelf @videlais @mcdemarco

StoryTitle special passage

Something we have yet to touch on is the StoryTitle special passage.

I'm in favor of leaving it as-is for three reasons:

  1. Easy and what existing Twee users are used to.
  2. Easily mapped to Twine 2's <tw-storydata name>—that's what both Tweego and Twee2 do now.
  3. Necessary for Twine 1 story formats. It may be created at compile time, however, so that's not a blocker.

videlais commented 5 years ago

Here are my thoughts on special passages:

Twee2 doesn't use StorySettings and TweeGo only uses it for IFDB, which will now be in StoryData.

I like the idea of it, but it's far more practical to leave this functionality up to the tools themselves. Twee2 has its @include, TweeGo can include files in the same directory, and Twine 2 might eventually have some type of package management.

Special Passage Names

Without all the Twine 1.4 names, that should drop the list to the following:

Required:

Optional:

What haven't we covered?

I don't know what zoom does. Are we keeping that?

tmedwards commented 5 years ago

Special Names

I think we should only specifically call out compiler special passages and tags within the specification.

Story Format Special Passages: StorySubtitle, StoryAuthor, StoryMenu

Those are not, and have never been, processed by Twee compilers—even Twine 1 only knew about them tangentially. They do not need to be part of any Twee spec. They will continue to be part of the story formats that use them, and that's not something this spec has the power to change.

As noted above, I don't think the spec needs to mention them specifically one way or the other. If you really wanted to do so for clarification or something, then I'd suggest simply mentioning that story formats may have their own special passage names—ditto for tags.

Compiler Special Passages: Start, StoryData, StorySettings, StoryTitle

I think there's some confusion here. We've never discussed dropping support for the Twine 1 story formats, which seems to be an assumption you're laboring under. Tweego supports them and they're still used—mostly Jonah. I might consider dropping support in Tweego's next major release if you'd prefer to purge them, but we'd actually need to have that discussion.

For now, I'll be including the necessary Twine 1 compiler passages here.

Start

Default starting passage.

StoryData

Twine 2 metadata.

StoryTitle

Story name.

Since it's required by Twine 1 story formats and it's what Twee users are used to, I think we're better off just requiring it. Automatically mapping it to/from <tw-storydata name> is not hard.
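To illustrate the mapping described above, here is a minimal Python sketch of turning a StoryTitle passage into the name attribute of <tw-storydata>. The function name and the dictionary of passages are hypothetical, and a real <tw-storydata> element carries more attributes than the single one shown; this is not how Tweego or Twee2 implement it.

```python
from html import escape


def storydata_open_tag(passages):
    """Build a <tw-storydata> opening tag from the StoryTitle passage.

    `passages` is an assumed mapping of passage name -> passage text.
    Only the `name` attribute is emitted; other attributes are omitted.
    """
    title = passages["StoryTitle"].strip()
    # Attribute values must be HTML-escaped, including quotes.
    return f'<tw-storydata name="{escape(title, quote=True)}">'


if __name__ == "__main__":
    print(storydata_open_tag({"StoryTitle": "Passage Testing\n"}))
```

Going the other way (HTML to Twee) would unescape the attribute value and write it back as the body of a StoryTitle passage.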

StorySettings

Twine 1 config and metadata. Bit of a queer duck. It's always been part compiler and story format special passage.

StoryIncludes

❌ Agreed. This isn't something the spec needs to include.

What Haven't We Covered?

Format Name

TweeNG works as a handle here, but we need a real name for this thing. Do we just call it Twee v3—because of Twee2's extended format—or something like that?

File Extension(s)

Existing file extensions:

Assuming we call this Twee v3, then: .tw3, .twee3?

I really don't like those though—can't put my finger on why.

videlais commented 5 years ago

StorySubtitle, StoryAuthor, StoryMenu

Agreed about not specifically mentioning them other than that they are used in Twine 1.

Format Name

I guess Twee v3 works? I don't have a better name at the moment.

File Extension(s)

I can't say I'm a fan of .tw3 or .twee3 either. I considered .t3, but it looks like that already exists.

Any thoughts on this, @greyelf or @mcdemarco?

mcdemarco commented 5 years ago

Regarding file extensions and the format name, I would overload Twee/.tw/.twee rather than add to the Twine version confusion by upping anything to 3 [or 2 for that matter]. (While Twee2e is tempting, no one would get it. I also thought of Tweee or eTwee.)

To clarify that one doesn't mean the old twee, one can refer to extended Twee vs. original Twee, but I don't think there are any cases where accidentally feeding a new Twee file into an old Twee compiler or vice versa is going to result in a code explosion so bad we need a whole new name to avoid it.

greyelf commented 5 years ago

re: File Extensions

If it's decided to re-use the current .tw and .twee extensions then the Compilers and the Text Editor Plugins will need to be able to correctly determine which version of 'Twee' is contained within the project's files, so that the project's contents can be correctly validated and processed.

eg. Does the compiler look in StorySettings or StoryData (or both?) for the ifdb, and if the ifdb isn't found which passage does the warning message state the ifdb should be placed in?

mcdemarco commented 5 years ago

Another extension possibility: .twe

@greyelf I think we can safely treat the IFDB as new functionality, since intermediate versions of twee did not agree on where to put it. (Twee2 doesn't use StorySettings per se but a settings tag on its settings passages, and names its settings differently.) So when telling people where to put the IFDB, I'd tell them the extended twee location. If the compiler wants to be backwards-compatible with a particular intermediate twee style--most likely its own--such as putting the IFDB into StorySettings, it can look there for it, too.

The contents of the file itself signal the version: if there is a StoryData passage or new-style metadata on individual passages, it's new twee. If it doesn't have those, does adding a new twee extension make it any less of an old twee file? One could make the twee "version" an explicit compiler flag to accomplish something in particular, such as stricter parsing of the new version or laxer parsing of an old Twee file.
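As a rough illustration of that content-based check, the Python sketch below flags a file as new-style if it has a StoryData passage or any passage header ending in a brace-delimited block. The function name and the exact rules are assumptions made for this example; it is a heuristic, and a passage whose name genuinely ends in braces would be misclassified.

```python
def looks_like_new_twee(text):
    """Heuristically guess whether `text` uses the extended Twee syntax.

    Assumed rules: a header line starts with "::", and a file counts as
    new-style if it has a StoryData passage or a header that ends with
    a {...} block.
    """
    for line in text.splitlines():
        if not line.startswith("::"):
            continue  # only passage headers are inspected
        header = line[2:].strip()
        # Strip any trailing tag or metadata block to get the bare name.
        name = header.split("[")[0].split("{")[0].strip()
        if name == "StoryData":
            return True
        if header.endswith("}") and "{" in header:
            return True
    return False
```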

We don't absolutely need to spec an extension if we can't come up with one that we like. Twee2, despite recommending .tw2, does not check the extension; you can feed it any text file containing twee code. The extension is more about user convenience--maybe in the Twee2 case to remember there are incompatible Twee2 extensions in your file that won't work with the old Twee or in TweeGo. But as an official, mostly backwards-compatible extension of twee, I don't know that the spec really needs to send any warnings like that via a new extension.

tmedwards commented 5 years ago

Format Name

With a shrug by @videlais and no objections, I'm going to start using Twee v3 from now on.

File Extension(s)

The "just a Twee-v1 extension" ship has already been sunk—lost with all hands. Please let the ghosts rest in peace.

Distinct formats whose syntaxes do not clearly disambiguate them—and Twee-v1 and Twee-v3 are not distinct enough—should have distinct file extensions. For example:

:: Foo {}
Twee-v1 passage named "Foo {}".

:: Bar {}
TweeNG passage named "Bar" with an empty metadata block.

While that is a bit of a contrived example, passage names like that have been seen in the wild—which reminds me of something else we need to cover—so it's still relevant.
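The ambiguity can be reproduced with two toy header parsers in Python. Both regular expressions are assumptions written for this illustration, not the grammar of any existing compiler: one treats everything after the sigil (minus an optional tag block) as the name, the other also recognizes a trailing metadata block.

```python
import re

# Hypothetical v1-style header: name plus optional [tags].
V1_HEADER = re.compile(r"^::\s*(?P<name>.*?)\s*(?:\[(?P<tags>[^\]]*)\])?\s*$")

# Hypothetical v3-style header: name, optional [tags], optional {metadata}.
V3_HEADER = re.compile(
    r"^::\s*(?P<name>.*?)\s*(?:\[(?P<tags>[^\]]*)\])?\s*(?P<meta>\{.*\})?\s*$"
)


def parse_v1(line):
    """Return the passage name as a v1-style reader would see it."""
    m = V1_HEADER.match(line)
    return m.group("name") if m else None


def parse_v3(line):
    """Return (name, metadata text) as a v3-style reader would see it."""
    m = V3_HEADER.match(line)
    return (m.group("name"), m.group("meta")) if m else None


if __name__ == "__main__":
    print(parse_v1(":: Foo {}"))  # name includes the braces
    print(parse_v3(":: Foo {}"))  # braces become an empty metadata block
```

The same bytes yield two different passage names, which is why the header text alone cannot say which format was intended.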

Compilers should not need to analyze every Twee file within a project to guess what they happen to be.

Also. To touch on Greyelf's point. AFAIK, some (many/most?) modern programming/text editors do not allow/offer file analysis to determine the language. Thus, without a distinct file extension, editors won't be able to select the correct language definition file. Now, the worst that could happen in this case is probably improper identification and styling of syntax elements, but it is a consideration.

Addressing recently suggested extensions:

Here are some additional extension suggestions for consideration: (details based on searches)

Of those, .twn gives me the warmest fuzzies—though .tw3 has also grown on me a bit for some reason.

/all-the-pokes @klembot @greyelf @videlais @mcdemarco

Specific Replies

@greyelf

[…] Does the compiler look in StorySettings or StoryData (or both?) for the ifdb, and if the ifdb isn't found which passage does the warning message state the ifdb should be placed in?

StoryData, as specified (by Twee v3).

Only Tweego used StorySettings to hold the IFID and its next version—i.e., the first to support Twee v3—will use StoryData. Since I'd rather not grandfather the use of StorySettings to hold Twine 2 metadata, I'm going to have to migrate users away from it—not entirely sure how I'm going to handle that yet, but it's not going to be fun.
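A minimal Python sketch of that lookup follows, assuming the StoryData passage body is a JSON object and that the identifier lives under an "ifid" key. The key name, the function name, and the warning text are assumptions for illustration and are not taken from Tweego.

```python
import json


def find_ifid(passages):
    """Look for the IFID in StoryData only.

    `passages` is an assumed mapping of passage name -> passage text.
    Returns (ifid, warning); exactly one of the two is None.
    """
    raw = passages.get("StoryData")
    if raw is not None:
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            data = None  # malformed StoryData is treated as missing
        if isinstance(data, dict) and data.get("ifid"):
            return data["ifid"], None
    # The warning always points at StoryData, never StorySettings.
    return None, "IFID not found; add it to the StoryData passage."
```

A compiler that wanted a migration path could additionally check StorySettings before emitting the warning, while still directing authors to StoryData.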

Valid Passage And Tag Names

We should probably spec out either:

For example, the following headers would be ambiguous:

:: Apple ][ [nice job breaking it]
Issue: Ambiguity between passage name and tag block.
Cause: Unescaped tag block metacharacters contained within passage name.

:: Oh Noes! [foo.bar baz:qaz oh[noes]]
Issue: Ambiguity within tag block.
Cause: Unescaped tag block metacharacters contained within tag name.

As above, things of this nature have been seen in the wild, so this is not academic.

Twee v3 File Author-Use Area?

We've talked about this before, but never made a definitive statement.

Twee compilers have historically always ignored the area before the first passage header within Twee files, if any, but it's always been a grey area. I propose enshrining the practice and defining the area as author-use within the specification.

videlais commented 5 years ago

File Extension(s)

When in doubt, go unique. .tw3 (.twee3)

Valid Passage And Tag Names

I can think of two ways to go about this. Either forbid only open and close square brackets because of the complications, or go much stricter and restrict names to a limited set of special characters.

A problem I can anticipate is that Twine 2 currently allows some really crazy things. For example, consider this bizarre example that is currently allowed (and was exported into Twee).

:: StoryTitle
Passage Testing

:: !@#$%^&*(){}[]|\/<>,.;;"'`~[!@#$%^&*(){}[] ◀️ 🕒 ⚠️ 💤 [[]]]][[]]]]
Why? Just, why?

Twee v3 File Author-Use Area

This would be in JSON or otherwise enclosed somehow, right? Something that clues the compiler to ignore the "::" sigil until closed, I assume.

mcdemarco commented 5 years ago

The "just a Twee-v1 extension" ship has already been sunk—lost with all hands. Please let the ghosts rest in peace.

I still haven't seen the breaking changes that would prevent processing a (non-includes, non-extreme-corner-case passage title) Twee 1 file with a Twee 3 processor, so as far as I can tell the new twee is still just an extension of the original Twee. Thus, I don't see the purpose of changing the extensions on files that are already extended twee files. People usually make file extension changes for more significant changes to the format.

But I think the notion that the extension matters at all is largely a Windows thing, so if people want a new, confusing extension for Windows, so be it. I expect other tools to go on ignoring it, and code editors to conflate the two (or three) extensions because there aren't significant differences to handle there, either.

greyelf commented 5 years ago

Twee v3 File Author-Use Area

As this area will contain information that won't be processed (by a standard TWEE compiler) I suggest wrapping it within a block comment, as is done in many other languages.

/**
The contents of this multi-line area
is ignored by the compiler and the story format.
{it: "can contain", "whatever information": "you want it to"}
*/

:: Some passage

... I believe doing so makes it more obvious that the contents have no programmatic meaning.

edit: changed my first statement to make it clearer which type of 'processing' I meant.

mcdemarco commented 5 years ago

@greyelf I was thinking of putting some YAML there (now that the TOML has moved out and because I sometimes post-process with pandoc), so requiring wrapping it in a comment would reduce the usefulness of the header space. I wouldn't say it's bad advice if the content is actually just a comment, though.

videlais commented 5 years ago

Twee v3 File Author-Use Area

My vote is for enclosing it somehow. I really like the approach of using it for author notes.

If using YAML did not introduce another dependency, I'd consider voicing support for it, but my thinking is that introducing another format into one that is both Twee and JSON at the moment is asking for trouble.

I'll wait to see what @tmedwards writes, but my current thinking is to work through how to enclose it safely. Past that, if people want to add extra formats in their personal files, they can. The compiler will ignore it anyway.

greyelf commented 5 years ago

The compiler will ignore it anyway.

Currently this is more of a side-effect of how the parsing code responsible for finding the individual passages is implemented than a defined behaviour.

videlais commented 5 years ago

File Extensions:

It seems like .tw3 (.twee3) is the winner for platforms (e.g. Windows) and for editors where it matters.

Passage and tag name validity/escaping details

As I highlighted the other day, Twine 2 allows all manner of crazy inputs. Do we simply state that anything is allowed between the sigil (::) and either the next optional area or a newline? Should we be stricter about that?

And, if we do get stricter, how do we handle projects with un-escaped input?

Twee v3 File Author-Use Area

As it exists right now, and as @greyelf points out, there is that side-effect of having the space. If we do nothing, it remains a place where authors can write whatever they want until a line starts with "::".

I'm okay with writing that into a specification: no defined behaviors and authors can use that space for things like comments, YAML, or whatever best helps them.

tmedwards commented 5 years ago

I'm going to start off with a direct reply to @mcdemarco here—though I encourage everyone to read it. I'm likely going to ramble and repeat myself, so my apologies in advance.

Also, let me be clear upfront on one point. I really do not want a new file extension, I simply think we probably need one. There's an important distinction there. I don't think we have any ideal choices here, so we're looking for the least bad one.

I still haven't seen the breaking changes that would prevent processing a (non-includes, non-extreme-corner-case passage title) Twee 1 file with a Twee 3 processor, so as far as I can tell the new twee is still just an extension of the original Twee.

There should be nothing preventing a Twee v3 compiler from reading a Twee v1 file, since Twee v3 largely is in practice, if not currently defined that way, an extension of Twee v1. I've never said differently, so I'm unclear where you got that.

There are two reasons to clearly differentiate the formats in some way:

  1. So no one feeds a Twee v3 file with all the trimmings—i.e., Twine 2 passage metadata—into a Twee v1 compiler, because that is not going to turn out well. Like it or not, Twine 1.4 is still in use and that's unlikely to change anytime soon. Also, like it or not, a different file extension will help prevent the issue.
  2. The ambiguity that I literally gave an example of in my last reply. This is, in fact, an existing problem with the Twee v1 format—i.e., tag block metacharacters within the passage name—it's simply worse now because we've added a new optional block with its own metacharacters.

The latter is what we're talking about here, but the former should not be forgotten.

Anyway. Is it a corner case? Definitely, though I'm not sure what that has to do with anything—my point being that shooting yourself in the foot only very occasionally is not a reason not to attempt to avoid shooting yourself in the foot at all. I have actually seen authors doing things of that nature in the wild—as I noted before, this is not academic—so I don't think it will be as extreme of a corner case as you're making it out to be. Even if it does turn out to be vanishingly rare, do we really want to just ignore the issue, so that when it eventually does bite some author, who'll have absolutely no clue about why things aren't working, all we can say is oops?

I feel that we should attempt to resolve the issue or it will eventually bite authors in the ass. The options I can see are: (please point out any I've missed)

  1. Ignore the issue and let users blunder into it. It will eventually happen, but it should also only be a very rare occurrence.
  2. To signal which format is in-play for the decoders. Meaning a file extension, because we don't have a header with a magic number/signature—i.e., how binary files which allow different formats within the same container generally work—and I don't think we want to add one.
  3. To either restrict the characters allowed within passage names, specifically to exclude the passage header optional block metacharacters, or define an escape mechanism. Both will break any existing passage headers whose names already contain the metacharacters. The latter would also break compatibility with compilers that aren't updated to understand the escape mechanism—i.e., Twine/Twee 1.4 and, maybe, Twee2.

Note: I didn't list having the decoders do an analysis pass on each Twee file in an attempt to guess which format is in-play before beginning the actual decoding pass, since that's subject to the same ambiguity and isn't notably better than simply taking each passage header as it comes.

None of those options are perfect. I just think that option 2 is the least bad available to us. If you have a better idea, then please share it—I mean that honestly.

We're probably going to need to do a version of option 3 anyway—see the Passage Header Parsing section below. If we do, then by keeping the formats as distinct entities, rather than having v3 simply be a backwards compatible extension of v1, at least we aren't breaking existing files and/or older compilers—well, unless someone forcibly feeds a v3 file either to a v1 compiler or to a v3 compiler as a v1 file, but there's literally nothing we can do about that.

Thus, I don't see the purpose of changing the extensions on files that are already extended twee files.

What "extended twee files" are you talking about? There are no Twee v3 files in existence yet.

If you're referring to files in the Twee2-extended format, then they'd keep the recommended extension. If people were ignoring Twee2's recommendation, then that's not really our problem. I'm assuming Dan-Q made that recommendation for the first reason I listed up at the top of this—i.e., an attempt to prevent people from feeding files containing the Twee2 optional position block into Twee v1 compilers.

But I think the notion that the extension matters at all is largely a Windows thing, so if people want a new, confusing extension for Windows, so be it.

It's not, but I don't have the energy to argue the point. I'm not saying file extensions are the best thing since sliced bread, but they do have a place.

Also. You're sounding a lot like an OS snob with an axe to grind here. I thought we'd finally left that drivel behind for good, but apparently I'm wrong—not the first time, won't be the last.

A personal anecdote. My formative years with computers, and most of my professional career thus far, was spent chiefly on UNIX systems, so I'm certainly aware that file extensions are not the end-all-be-all and aren't absolutely needed in many cases—e.g., most image software, even on Windows, usually scan for a header signature to know what they're decoding, so a missing or incorrect file extension isn't an issue. I say this not for personal aggrandizement, but simply to make clear that my feeling that the least bad option available to us is a new file extension does not come from a Windows-centric perspective, which seems to be what you're implying here.


File Extensions

I can live with .tw3/.twee3.

I'm also open to anyone changing my mind about a new file extension being the least bad option—I'm being honest here, no flippancy included. What I am not open to is leaving a hole that authors can trip into, regardless of how rare that event might actually be. Explain to me how to fill in the hole, without breaking something else, and I'll happily stop beating this drum—my arms are tired.

Passage Header Parsing (was: Valid Passage And Tag Names)

I laid this out poorly when I initially brought this topic up.

What we really need is for compilers to be able to parse the headers in a consistent way that, hopefully, matches author expectations. If we can do that solely with some, hopefully simple, rules for the compilers, then that would be best. If not, then we may also need to lay out rules for names—probably defining a way to escape the metacharacters would do—which is less great, but workable.

Do we simply state that anything is allowed between the sigil (::) and either the next optional area or a newline?

By and large, yes. Compilers should—must, really—be properly encoding anything which needs to be for HTML outputs—and decoding on the flipside—so it's already not an issue on that end. The compilers themselves really shouldn't care what characters make up passage/tag names as long as they can actually parse the header.

There's a tangent here about how you'd actually use passage names that contain square brackets via the standard link markup—e.g., [[…]]—but that's a separate kettle of radioactive sewage.

Regardless. The point I was, ham-fistedly, attempting to bring up was that allowing the metacharacters of the optional blocks within the passage name can make it impossible to tell where the optional blocks start with certainty. Reusing one of my previous examples:

:: Apple ][ [nice job breaking it]

How do you think that should parse? Can you guess how it currently parses in the various compilers?

This issue also concerns the tag names—e.g., the tag block [nice job[breaking] it]; even Tweego can't make sense of that mess.
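To make the "Apple ][" problem concrete, the Python sketch below applies two equally plausible splitting rules to the same header. Both functions are hypothetical and are not meant to describe how Tweego, Twee2, or Twine 1 actually behave.

```python
def split_at_first_bracket(header):
    """Treat the first '[' after the sigil as the start of the tag block."""
    body = header[2:].strip()
    idx = body.find("[")
    if idx == -1:
        return body, ""
    return body[:idx].strip(), body[idx:]


def split_at_last_bracket(header):
    """Treat the last '[' in the header as the start of the tag block."""
    body = header[2:].strip()
    idx = body.rfind("[")
    if idx == -1:
        return body, ""
    return body[:idx].strip(), body[idx:]


if __name__ == "__main__":
    header = ":: Apple ][ [nice job breaking it]"
    print(split_at_first_bracket(header))
    print(split_at_last_bracket(header))
```

The first rule yields a passage named "Apple ]" with a malformed tag block; the second yields "Apple ][" with the tags the author presumably meant. Neither rule is obviously correct without either a restriction on names or an escape mechanism.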

Twee v3 File Author-Use Area

My interest here was in enshrining existing behavior—i.e., that decoders jump to the first thing that parses as a passage header—so that authors could do whatever they wanted with the space before the first passage. AFAIK, all compilers currently do it, but it's an unspecified implementation behavior. I simply wanted to define it as part of the spec, so authors could rely upon it.
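The behavior being enshrined can be sketched in a few lines of Python, assuming that a passage header is any line beginning with the "::" sigil. The function name is hypothetical.

```python
def split_author_area(text):
    """Split a Twee file into (author-use area, passage text).

    Everything before the first line that starts with "::" is returned
    untouched as the author-use area; the rest is left for the decoder.
    """
    lines = text.splitlines(keepends=True)
    for i, line in enumerate(lines):
        if line.startswith("::"):
            return "".join(lines[:i]), "".join(lines[i:])
    return text, ""  # no passage header found at all
```

Because the area is returned verbatim and without a wrapper, an author's own tooling can read YAML, notes, or anything else from it without the compiler having to parse it.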

I'd rather not define a wrapper since that opens many cans of worms—e.g., we'll need to parse the wrapper and care what happens when authors invariably botch it, it would complicate authors using it for structured data as their tooling would need to be able to pull the data out of the wrapper, etc.

mcdemarco commented 5 years ago

Whether or not I'm an OS snob---I don't care much for windows but I use it daily for work---it is the case that other OSes no longer put the focus on file extensions that they used to, and most never did in the case of plain text. Changing the extension without changing the format to an incompatible one may come with some clear benefits on the programming side, but I think it has more, if less obvious, costs on the user end.

For example, in one particular case you brought up, I don't think that changing the file extension is any guarantee that someone won't try to feed an incompatible Twee file into Twine 1. In fact, I doubt it will even lead to any appreciable reduction in the number of twee v3 files fed into Twine 1. (Twine 1 doesn't mention or check the extension or version of "Twee" imports, and all that happens with a twee2 file is that the location data gets interpreted as tags.)

Even if there were legitimate increased dangers to Twine 1 users, at some point we do need to acknowledge that Twine 1 users are doing so at their own risk, well past the sell-by date, and the ass-biting they get for feeding a new Twee file into an antique Twine program is probably only a small part of the problems they will encounter on a road that could have led them there.

Intermediate twee users never had an official version number (whatever Twee2 may have suggested), and are generally competent enough to deal with a somewhat updated format without a new extension. They seem more likely to get annoyed with the requirement to change an extension on a file that has not changed (e.g., in vc), than to fall into some corner-case accident. Both will happen, but since I'm one of the people who will get annoyed more often than having accidents, I'm arguing my side.

Thus, I don't see the purpose of changing the extensions on files that are already extended twee files.

What "extended twee files" are you talking about? There are no Twee v3 files in existence yet.

If you're referring to files in the Twee2-extended format, then they'd keep the recommended extension. If people were ignoring Twee2's recommendation, then that's not really our problem.

All I meant in this case was that every (non-import) old twee file is automatically a twee3 file. Any intermediate twee file without incompatible location values in it is also automatically a twee v3 file (although the IFDB and other odd data might get unexpectedly ignored for misplacement).

As I mentioned before, Twee2 did not require the recommended extension, and people managed to use it without the new extension and without significant confusion. As they frequently do with the various different flavors of markdown, versions of html, javascript, xsl, etc. Changing extensions at this pace over so little actual format change is the unusual choice, and we don't seem to have an unusually good reason for making it.

I apologize for guessing at bad reasons like Windows compatibility, but I was mystified at the inspiration for this choice. Who does change extensions like this, and for plain text formats that are not changing significantly?

greyelf commented 5 years ago

File Extensions and Text Editor Plugins

I understand that currently such TWINE/TWEE related plugins only handle basic syntax highlighting but there should be no technical reason why they couldn't be extended to support more advanced features like code folding, code navigation, IntelliSense, and validation of Passage Header structures / StoryData contents.

To know which plugin(s) to automatically load, the editor needs to know the current file type, and it generally bases that on the file's extension. So the question of whether or not to have a new file extension isn't limited to just the TWEE compiler's ability to process the file(s) correctly.
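As a sketch of extension-based selection, the Python snippet below maps the extensions mentioned in this thread to a format label. The labels and the function name are made up for illustration; real editors use their own registration mechanisms.

```python
from pathlib import Path

# Extensions taken from this discussion; the labels are hypothetical.
FORMATS = {
    ".tw": "twee-v1",
    ".twee": "twee-v1",
    ".tw2": "twee2",
    ".tw3": "twee-v3",
    ".twee3": "twee-v3",
}


def format_for(path):
    """Return the format label for a file, or None if unrecognized."""
    return FORMATS.get(Path(path).suffix.lower())
```

With distinct extensions the lookup needs no file contents at all, which is the property editors rely on when choosing a language definition.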

tmedwards commented 5 years ago

(Twine 1 doesn't mention or check the extension or version of "Twee" imports, and all that happens with a twee2 file is that the location data gets interpreted as tags.)

I suppose I'd have to rate your version of Twine 1.4 as subpar then, as the one I'm using definitely does apply a filter to the file list. As to the second bit, I surely hope that you don't consider that normal or acceptable behavior.

All I meant in this case was that every (non-import) old twee file is automatically a twee3 file. Any intermediate twee file without incompatible location values in it is also automatically a twee v3 file (although the IFDB and other odd data might get unexpectedly ignored for misplacement).

What? Who's expecting that? Why would we force people to change the extension of their existing, presumably working, Twee v1 files so that they'd be classified as Twee v3 files? The chief, and pretty much only, benefit in doing so would be to gain the optional passage metadata block. As far as it goes, who's going to be writing it out by hand and into existing files?

One, or both, of us are seriously confused here.

mcdemarco commented 5 years ago

(Twine 1 doesn't mention or check the extension or version of "Twee" imports, and all that happens with a twee2 file is that the location data gets interpreted as tags.)

I suppose I'd have to rate your version of Twine 1.4 as subpar then, as the one I'm using definitely does apply a filter to the file list. As to the second bit, I surely hope that you don't consider that normal or acceptable behavior.

As far as I know this is the behavior of the publicly available Twine 1.4.2 when you import a Twee file. It is certainly the behavior on my mac. I imported a non-twine file with a non-.tw extension, and Twine 1 wisely complained that there were no passages in there. If you're using some custom Twine 1.4.2, or if the behavior on Windows is different, that's even more of a corner case than the accidental-import-into-Twine-1 case started out as.

And yes, since Twee is a text format, I expect the file chooser to list all text files for me, as it did. It's my problem to decide which one I want to load. This is the normal and acceptable behavior, at least in my OS.

All I meant in this case was that every (non-import) old twee file is automatically a twee3 file. Any intermediate twee file without incompatible location values in it is also automatically a twee v3 file (although the IFDB and other odd data might get unexpectedly ignored for misplacement).

What? Who's expecting that? Why would we force people to change the extension of their existing, presumably working, Twee v1 files so that they'd be classified as Twee v3 files?

Because they want to use Twee v3 compilers on a file that is perfectly acceptable to a Twee v3 compiler. We're proposing a new file extension for twee v3. What is the point of proposing it if people are not supposed, ideally, to use it?

The chief, and pretty much only, benefit in doing so would be to gain the optional passage metadata block. As far as it goes, who's going to be writing it out by hand and into existing files?

The chief advantage is using the latest toolchain without bothering about what is or isn't in their existing files (at least until something turns out to be broken, like a bad character in a passage name, or newly misplaced, like the IFDB code). At that point, yes, they might manually put the metadata where it belongs. Complicating that edit with a filename change is part of the usability cost of extension changes.

greyelf commented 5 years ago

Because they want to use Twee v3 compilers on a file that is perfectly acceptable to a Twee v3 compiler.

But that v1 file isn't perfectly acceptable to a v3 compiler unless it (or one of the other files in the set) contains the required ifdb attribute, and even if that v1 file does contain an ifdb attribute it's unlikely to be located in the correct special passage unless the end-user has done more than simply changed the extension of the file.

mcdemarco commented 5 years ago

Because they want to use Twee v3 compilers on a file that is perfectly acceptable to a Twee v3 compiler.

But that v1 file isn't perfectly acceptable to a v3 compiler unless it (or one of the other files in the set) contains the required ifdb attribute, and even if that v1 file does contain an ifdb attribute it's unlikely to be located in the correct special passage unless the end-user has done more than simply changed the extension of the file.

I don't think most people would feel that new complaints about the IFDB from a compiler merit a change to the file name, especially when TweeGo's old complaints about it didn't. But this case is more about whether the spec requires the IFDB (and thus the passage to wrap it) or not. I'm not sure that question has been explicitly decided/specced.

videlais commented 5 years ago

Passage Header Parsing

I'm proposing escaping metacharacters somehow. In seeing both the examples @tmedwards has used and the one I generated by feeding emoji to Twine 2, I feel we need to make sure authors can label their passages in ways that make sense.

It makes a stronger argument, I think, to propose this new specification with the goals of better encoding passage metadata, story metadata, and the ability to safely escape passage names. Those would, in my opinion, justify a new file extension.

Some Ideas for Escaping:

:: StoryTitle
Passage Testing

:: Start
[[@🧟‍♂️🧟‍♂️]]

:: \@\🧟‍♂️\🧟‍♂️
Double-click this passage to edit it.

:: StoryTitle
Passage Testing

:: "Start"
[[👀]]

:: "👀"
Double-click this passage to edit it.

:: {"name": "StoryTitle"}
Passage Testing

:: {"name": "Start"}
[[@🧟‍♂️🧟‍♂️]]

:: {"name": "\@\🧟‍♂️\🧟‍♂️"}
Double-click this passage to edit it.

As an aside, if we were to use JSON for the passage name escaping, it might make sense to make the whole passage header JSON at that point. It would then match the Story Metadata.

greyelf commented 5 years ago

re: Passage Header Parsing

While I can understand the need for allowing non-Latin characters in Passage Names so that languages other than English can be supported, what purpose is served by allowing meta-characters like 🧟‍♂️and 👀?

E.g., the examples provided by @videlais could easily have been written like so.

:: Start
[[@🧟‍♂️🧟‍♂️|Target Passage Name]]
[[👀|Target Passage Name]]

:: Target Passage Name
Double-click this passage to edit it.

videlais commented 5 years ago

what purpose is served by allowing meta-characters like 🧟‍♂️and 👀?

@greyelf Take that up with @klembot. Those are allowed in Twine 2. You can totally name your passages emoji.

greyelf commented 5 years ago

Have you considered that that's more of a bug / oversight than an actual designed feature, similar to the existence of the Twee 1.x File Author-Use Area?

videlais commented 5 years ago

@greyelf: Yes, but it also doesn't really matter at this point. Twine 2 allowed it. And that's also my point: Twine 2 does and has allowed a wide selection of metacharacters. This is not a new thing, and as far as I know, Twine 2.3 doesn't fix this -- even assuming it is a bug.

Twee v3 needs to support this for full compatibility. Even if Twine 2.3 changed things tomorrow, there would still be stories out there with weird metacharacters. It behooves us to find a good way to escape these characters in a safe way.

tmedwards commented 5 years ago

@mcdemarco

As far as I know this is the behavior of the publicly available Twine 1.4.2 when you import a Twee file.

The Windows release filters the file selection. Users can tell it to show all files, but that is very wisely not the default.

And yes, since Twee is a text format, I expect the file chooser to list all text files for me, as it did.

That's neither user friendly, nor particularly relevant. Twee files are formatted data, not simply random text content. What base format that data happens to be in, text or binary, should be irrelevant.

The vast majority of users should not be getting all files thrown at them via a normal file selection dialog. In software where all files are relevant—e.g., archival software—sure. In a file explorer/finder, also sure. In software that deals with files that are specifically formatted data? Hell, no.

It may be what you like, but it's a damnably stupid thing to do for the vast majority of users who, let's be frank, can barely operate their computers half the time—and I'm talking about users of all of the common desktop operating systems here, macOS very much included. The rise of mobile devices, with their simplified/streamlined UIs, has only made this worse, rather than better.

I don't know if the culprit here is the macOS release of Twine 1 not providing filters or the macOS (10) file selection dialog itself lacking them at all—I haven't used a Mac since my SE/30 died, which used OS<10—but it's a damn silly oversight.

Because they want to use Twee v3 compilers on a file that is perfectly acceptable to a Twee v3 compiler. We're proposing a new file extension for twee v3. What is the point of proposing it if people are not supposed, ideally, to use it?

You are the only one attempting to force a situation where existing files must be converted to the new extension and/or new files must use it.

The chief advantage is using the latest toolchain without bothering about what is or isn't in their existing files (at least until something turns out to be broken, like a bad character in a passage name, or newly misplaced, like the IFDB code). At that point, yes, they might manually put the metadata where it belongs. Complicating that edit with a filename change is part of the usability cost of extension changes.

I can agree with that. I never said the new extension was without cost—pretty sure I've noted, or implied at least, that it did come with costs.

Personally, and I've stated this before, I do not want a new extension. The primary reasons behind that are twofold: having multiple extensions is less convenient and the vast majority of users are probably going to trip over it. I've also stated, multiple times now, that (even keeping those reasons firmly in mind) I still think it's the least bad option available to us.

The only other option is to say to hell with compatibility with Twine/Twee 1.4—and possibly Twee2, if it's never updated. Because no, having them either drop data on the floor or, worse, jamming it into places it does not belong is not acceptable.

Would you like to propose that we consider dropping any and all backwards compatibility considerations? I'm not suggesting that we reopen the table to changing bits we've already reached some kind of grudging consensus on, simply that we wouldn't care about issues with older compilers—essentially that Twee v3 would make no guarantee about using files in the format with older compilers.


@videlais @greyelf @mcdemarco

Passage Header Parsing

I like the backslash mechanism and it's pretty standard.

Also, there seems to be some confusion here. Metacharacters in this context are characters that are part of the format's syntax. In this particular case, I was referring to the opening and closing characters for the optional blocks.

Characters that are not syntactical in the areas in question do not need to be escaped, because they cause no parsing ambiguity—i.e., while kind of silly, emojis are not problematic for parsing.
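To make the distinction concrete, here is a minimal Python sketch that reports which characters in a passage name are syntactically significant in a header. It assumes the metacharacters are exactly the tag and metadata block delimiters named in this comment; the function name and the sample names are hypothetical.

```python
# Delimiters of the optional tag ([...]) and metadata ({...}) blocks.
HEADER_METACHARACTERS = frozenset("[]{}")


def find_header_metacharacters(name: str) -> list[str]:
    """Return the characters in `name` that would be ambiguous in a passage header."""
    return [ch for ch in name if ch in HEADER_METACHARACTERS]


# An emoji name contains no block delimiters, so nothing needs escaping.
print(find_header_metacharacters("👀"))        # []
# A name containing square brackets does need attention.
print(find_header_metacharacters("Room [1]"))  # ['[', ']']
```

Under this reading, any Unicode character outside that small set passes through a header parser untouched.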

klembot commented 5 years ago

Just for the record, Twine 2 allows any Unicode character to be used more or less anywhere in the story. It saves files using UTF-8 encoding.

mcdemarco commented 5 years ago

The vast majority of users should not be getting all files thrown at them via a normal file selection dialog. In software where all files are relevant—e.g., archival software—sure. In a file explorer/finder, also sure. In software that deals with files that are specifically formatted data? Hell, no.

It may be what you like, but it's a damnably stupid thing to do for the vast majority of users who, let's be frank, can barely operate their computers half the time—and I'm talking about users of all of the common desktop operating systems here, macOS very much included. The rise of mobile devices, with their simplified/streamlined UIs, has only made this worse, rather than better.

People who can barely operate their computers will not be using a command line process, so it's hardly relevant that you think they can't pick the correct text file out of a directory of their own text files without the aid of an extension.

Because they want to use Twee v3 compilers on a file that is perfectly acceptable to a Twee v3 compiler. We're proposing a new file extension for twee v3. What is the point of proposing it if people are not supposed, ideally, to use it?

You are the only one attempting to force a situation where existing files must be converted to the new extension and/or new files must use it.

No, the addition of a second (technically, fourth) extension for twee files is what will cause users to convert to it. Presumably they will have to change to it if they actually add any new twee functionality to an old twee file, depending on their compiler. In any case, if you and I cannot agree at this point who will need to rename their files because we added a new extension, then users will certainly be confused.

The only other option is to say to hell with compatibility with Twine/Twee 1.4—and possibly Twee2, if it's never updated. Because no, having them either drop data on the floor or, worse, jamming it into places it does not belong is not acceptable.

Like I already mentioned, a new extension is no guarantee that someone won't try to import the wrong thing into Twine 1---I did it, however I managed it, and I can imagine users just changing the extension back if that's what their OS and compiler happen to demand of them---or into any other incompatible compiler. So yes, I guess I am saying to hell with corner cases about what someone might try to import into Twine 1 in some particular configuration of some particular OS that otherwise might have prevented it.

Would you like to propose that we consider dropping any and all backwards compatibility considerations? I'm not suggesting that we reopen the table to changing bits we've already reached some kind of grudging consensus on, simply that we wouldn't care about issues with older compilers—essentially that Twee v3 would make no guarantee about using files in the format with older compilers.

I don't think so, though I'm not entirely clear on what you're suggesting here. We would have to explicitly spec a breaking change to "force" a v3 compiler to reject a v1 twee file, and even then the compiler could be more forgiving than the spec. Except in particular cases like the proposal for searching for a start passage, we have not tried to spec any compiler behavior, never mind guarantee any.

Either way, I am merely proposing not changing the extension, because it is extremely unusual to change an extension for a plain text format---even somewhat structured ones (most are), even when users might have difficulties because there are various, semi-incompatible, potentially data-dropping versions of the format (e.g., markdown), and even when the format does not declare a version anywhere in it (which we could add, but haven't). And that unusualness can lead to its own confusion: in most cases where there are multiple extensions (markdown again, though the 2 original Twee extensions also qualify), they are normally interchangeable and signify nothing in particular about the file contents.

In addition to being non-standard, the extension change is at best annoying and at worst confusing to the users. That, to my mind, far outweighs the pros that have been mentioned thus far.

tmedwards commented 5 years ago

People who can barely operate their computers will not be using a command line process, […]

My experiences, personally and professionally, do not bear that out at all, and not simply with the command line tools I've written. No, in my experience people will chase whatever has caught their fancy, regardless of how utterly unprepared they are to deal with it.

Hell. All one has to do is look at damn near any help channel/forum to see it writ large.

So yes, I guess I am saying to hell with corner cases about what someone might try to import into Twine 1 in some particular configuration of some particular OS that otherwise might have prevented it. […] I don't think so, though I'm not entirely clear on what you're suggesting here.

You seemed clear enough if the first sentence I quoted there is anything to go by.

@greyelf @videlais Opinions about dropping the new extension and letting old compilers suck it?

videlais commented 5 years ago

Opinions about dropping the new extension and letting old compilers suck it?

I've generally thought that having a new file extension helps the short-term much more than the longer term. There's a psychological aspect to a "different" file type even if the content is very similar. Marking that helps show people that the file is related to a new usage.

Now, will people try to feed the wrong versions of things into programs or with options that don't support them? Yes. Absolutely, yes. They will. This will happen.

Part of what I've tried to articulate is a rhetorical argument for a new type using the three pillars of better passage metadata, better story metadata, and safely escaping passage names. For those who care about working (potentially) between Twine 2 and other tools, these are strong selling points. Being able to move between tools with full compatibility would really help the ecosystem.

I'm not sure Twee2 is going to change. TweeGo will. And, I really hope, so will Twine 2. So, to be honest, I'm for a new type simply because it won't outright break existing tools in the short term. We will need to mark it in the specification, in, say, Twine 2 if it adopts this, and in TweeGo. However, if people are using existing Twee v1, they can continue to do so.

Confusing Users

Will multiple versions potentially confuse users? Yes, probably. Will it confuse users more to keep a single format but change its contents? I think so, yes.

I agree with tmedwards that adding a new file type is the better of bad options. It's not ideal, but it's definitely the better of the bad.

Marking Version in Header Area

mcdemarco has implied that this might help people. I'm not strongly against it. We'd be adding another thing people would have to type (or have a generator program write for them, I guess), but if this would be a good compromise, I'm for including it somehow.

Passage Header Parsing

I'm fine with the backslash escaping. It's pretty standard in other programs and would only affect those who really want to use certain special characters in their passage names. Hopefully, a small subset of the overall user base.

videlais commented 5 years ago

File Extension

I want a new extension, but I'm willing to list it under a category of HIGHLY RECOMMENDED instead of REQUIRED. This would allow parsers and compilers to decide if they want to accept other file extensions or not.

Passage Header Parsing

Escaping of both passage names and their tags for special metacharacters is a necessity. Sounds like backslash escaping would work?

Twee v3 File Author-Use Area

tmedwards commented 5 years ago

With over a week gone by since the last bit of brouhaha, I think it's time to kick the ball again.


I think we've reached consensus on both of the following:

Yes?

File Extension

With no additional comments, I'm going to assume that no one else has an opinion on this matter.

So, where do we stand?

That sound about right?

I don't really see the point in drafting an optional file extension, so why don't we simply drop the idea.

videlais commented 5 years ago

Seconded for dropping file extension.

videlais commented 5 years ago

Passage and tag names must escape the optional tag and metadata block opening/closing metacharacters—i.e., [, ], {, }.

In order to build some testing edge cases, I've written my own implementation of a Twee compiler using the current version of the specification. In working through various parts, I'm interested in how we see this particular part working out in practice.

So, right now, my approach is to read through the name and tags section of each passage header and replace all of the escaped special metacharacter combinations ("\[", "\]", etc.) with their hexadecimal equivalent. Then, when writing out to HTML, flipping them back. It's... workable, but maybe not the best.

I mention all that because one of the things I'm hoping to bring out of this Twee specification discussion is some documentation on possible ways to go about parsing Twee and Story Format files. (Very helpfully, for example, @tmedwards left documentation on how Harlowe does a weird thing and JSON parsing will break.)

Other than me, has anyone else here worked through an approach to this? I'm just curious.

(I'm planning on eventually releasing my own compiler, but probably not for at least another few weeks. By that point, I hope, this specification will be out there for people to reference and use for their own work.)

tmedwards commented 5 years ago

I've got a bad case of the flu—fat lot of good my immunization did /sigh—so I'll apologize in advance for any rambling, confusion, or mistakes on my part.


@videlais

So, right now, my approach is to read through the name and tags section of each passage header and replace all of the escaped special metacharacter combinations ("\[", "\]", etc.) with their hexadecimal equivalent. Then, when writing out to HTML, flipping them back. It's... workable, but maybe not the best.

I'm not sure why you're doing that internally. Unless I'm misunderstanding what you mean, that seems like an odd thing to do. The internal representation of the data should not need to be encoded at all. You only need to de-/en-code at the boundaries. In other words:

Example

The following <tw-passagedata> sample from Twine 2:

<tw-passagedata pid="1" name="{S}napple ][e" tags="[foo] &lt;bar&gt; {baz} xy=zzy &amp;arf &quot;fnord&quot; &#39;zugzug&#39;" position="860,403.5" size="100,100">This was only a test.</tw-passagedata>

Should decode into the internal form: (pseudocode only, not meant to be a structural suggestion)

// NOTE: Using the backquote as the string delimiter.
Passage_Record {
    name     = `{S}napple ][e`,
    tags     = [
        `[foo]`,
        `<bar>`,
        `{baz}`,
        `xy=zzy`,
        `&arf`,
        `"fnord"`,
        `'zugzug'`,
    ],
    position = `860,403.5`,
    size     = `100,100`,
    content  = `This was only a test.`,
}

And should encode to Twee v3 as follows:

:: \{S\}napple \]\[e [\[foo\] <bar> \{baz\} xy=zzy &arf "fnord" 'zugzug'] {"position":"860,403.5","size":"100,100"}
This was only a test.
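The encoding step at the Twee boundary can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the passage metadata is serialized as inline JSON, and literal backslashes are escaped along with the block delimiters (a point the thread settles a few comments later).

```python
import json


def escape_header_text(text: str) -> str:
    """Backslash-escape the backslash itself and the tag/metadata block delimiters."""
    return "".join("\\" + ch if ch in "\\[]{}" else ch for ch in text)


def encode_passage_header(name: str, tags: list[str], metadata: dict[str, str]) -> str:
    """Build a single-line passage header from an unescaped internal record."""
    parts = [":: " + escape_header_text(name)]
    if tags:
        # Tags are space-separated inside one pair of square brackets.
        parts.append("[" + " ".join(escape_header_text(tag) for tag in tags) + "]")
    if metadata:
        # Inline (non-pretty-printed) JSON keeps the header on one line.
        parts.append(json.dumps(metadata, separators=(",", ":"), ensure_ascii=False))
    return " ".join(parts)


header = encode_passage_header(
    "{S}napple ][e",
    ["[foo]", "<bar>", "{baz}", "xy=zzy", "&arf", '"fnord"', "'zugzug'"],
    {"position": "860,403.5", "size": "100,100"},
)
print(header)
```

The internal record stays unescaped throughout; escaping happens only while the header line is being written.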

Notes

  1. The reverse transformation should also hold true.
  2. When writing HTML I'd suggest that compilers:
     a. Only encode what's actually required by the HTML specification. For quoted content attribute values, and I do suggest quoting if you're writing the HTML chunk manually, the HTML specification only requires that the quoting character and ambiguous ampersands be encoded. That said, encoding all of the usual suspects (&, <, >, ", ') would also be fine, since you need to encode the passage contents that way anyway, meaning that you could use a single HTML encoder for both.
     b. Use the commonly used character references (i.e., & as &amp;, < as &lt;, > as &gt;, " as &quot;, ' as &#39;) to head off issues with poorly written tools.
  3. The ordering of the content attributes within the HTML <tw-passagedata> tag and the properties within the Twee JSON chunk should not be taken for granted—i.e., the ordering should not be guaranteed and compilers should not choke on alternate orderings.
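A single encoder of the kind suggested in note 2 might look like the Python sketch below. The mapping uses the five character references listed there; the function name is hypothetical. Python's own `html.escape` is deliberately not used for encoding because it emits `&#x27;` for the apostrophe rather than `&#39;`.

```python
import html

# The five commonly used character references.
HTML_REFERENCES = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
}


def encode_html(text: str) -> str:
    """Encode text for use in a quoted attribute value or as passage content."""
    return "".join(HTML_REFERENCES.get(ch, ch) for ch in text)


raw_tags = "[foo] <bar> {baz} xy=zzy &arf \"fnord\" 'zugzug'"
encoded = encode_html(raw_tags)
print(encoded)

# The standard-library decoder reverses the encoding.
print(html.unescape(encoded) == raw_tags)  # True
```

Because the same function serves attribute values and passage text, a compiler only needs one encoder and one decoder at the HTML boundary.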
videlais commented 5 years ago

Let me see if I can clear up what I'm trying to explain.

I created an "EscapeTesting" example in Twine 2. Published it.

<tw-storydata name="EscapeTesting" startnode="1" creator="Twine" creator-version="2.2.1" ifid="647F3E16-941C-41C2-BD3F-3A92CAB15B5F" zoom="1" format="Harlowe" format-version="1.2.4" options="" hidden><style role="stylesheet" id="twine-user-stylesheet" type="text/twine-css"></style><script role="script" id="twine-user-script" type="text/twine-javascript"></script><tw-passagedata pid="1" name="Start" tags="" position="104.5,103" size="100,100">[[Ano\[th\[er\}\]passage]]</tw-passagedata><tw-passagedata pid="2" name="Ano\[th\[er\}\]passage" tags="\{\}\[\]" position="297.5,101" size="100,100">Double-click this passage to edit it.</tw-passagedata></tw-storydata>

I fed it to my HTML parser and it spat out the following:

:: StoryTitle
EscapeTesting

:: StoryMetadata
{
    "ifid": "647F3E16-941C-41C2-BD3F-3A92CAB15B5F",
    "format": "Harlowe",
    "formatVersion": "1.2.4",
    "zoom": "1"
}

:: Start {"position":[104.5,103],"size":[100,100]}
[[Ano\[th\[er\}\]passage]]

:: Ano\[th\[er\}\]passage [\{\}\[\]] {"position":[297.5,101],"size":[100,100]}
Double-click this passage to edit it.

:: UserStylesheet [style]

:: UserScript [script]

The problem is when I want to read the Twee output back into the program to re-produce the HTML. That's where things get messy.

What I think you are writing is that this is much more of a weird edge case, right? If I was moving from Twee to HTML, I'd be unescaping the special characters. Coming the other way, I'd be escaping them if they showed up -- adding the backslash or, as you mention, doing the conversion.

Here, the already-escaped characters should have been, themselves, escaped somehow. Right?

tmedwards commented 5 years ago

@videlais You have several issues in your example. Some are my fault—I was neither clear nor explicit enough; the escapement section of the opening post has been updated (hopefully, it's better now). Some are your fault—you clearly didn't pay attention to the names and types of some of the attributes/fields; compilers cannot vary on those.

Here, the already-escaped characters should have been, themselves, escaped somehow. Right?

More or less. Instances of the escapement character within raw text also need to be escaped to avoid ambiguity. I'd been inadvertently assuming that was obvious, when it clearly was not; my apologies.


Reusing your example Twine 2 data chunk:

<tw-storydata name="EscapeTesting" startnode="1" creator="Twine" creator-version="2.2.1" ifid="647F3E16-941C-41C2-BD3F-3A92CAB15B5F" zoom="1" format="Harlowe" format-version="1.2.4" options="" hidden>
    <style role="stylesheet" id="twine-user-stylesheet" type="text/twine-css"></style>
    <script role="script" id="twine-user-script" type="text/twine-javascript"></script>
    <tw-passagedata pid="1" name="Start" tags="" position="104.5,103" size="100,100">[[Ano\[th\[er\}\]passage]]</tw-passagedata>
    <tw-passagedata pid="2" name="Ano\[th\[er\}\]passage" tags="\{\}\[\]" position="297.5,101" size="100,100">Double-click this passage to edit it.</tw-passagedata>
</tw-storydata>

It should transform into something† like the following Twee v3 notation‡:

:: StoryTitle
EscapeTesting

:: StoryMetadata
{
    "ifid": "647F3E16-941C-41C2-BD3F-3A92CAB15B5F",
    "format": "Harlowe",
    "format-version": "1.2.4",
    "zoom": "1"
}

:: Start {"position":"104.5,103","size":"100,100"}
[[Ano\[th\[er\}\]passage]]

:: Ano\\\[th\\\[er\\\}\\\]passage [\\\{\\\}\\\[\\\]] {"position":"297.5,101","size":"100,100"}
Double-click this passage to edit it.

† I say "something" because there are things that cannot meaningfully affect (de)compilation, so compilers may vary on them. Examples: how many line breaks go between passages, whether whitespace is added between the passage header mark (::) and the passage name, what the Twine 2 Story JavaScript and Story Stylesheet sections are named within Twee notation, whether empty Twine 2 Story JavaScript and Story Stylesheet sections are written to Twee notation at all, etc.

‡ I wrote that by hand, so this assumes I didn't screw it up. You should get the idea regardless.
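The round trip for names that already contain backslashes can be sketched in Python as follows. The function names are hypothetical, and keeping a trailing lone backslash as-is is an assumption, since the thread does not say how malformed input should be treated.

```python
def escape_header_text(text: str) -> str:
    """Escape the escapement character first-class, alongside the block delimiters."""
    return "".join("\\" + ch if ch in "\\[]{}" else ch for ch in text)


def unescape_header_text(text: str) -> str:
    """Reverse escape_header_text: a backslash makes the next character literal."""
    result = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            # Consume the escaped character; a trailing lone backslash is
            # kept unchanged (an assumption, not something the thread specifies).
            result.append(next(chars, "\\"))
        else:
            result.append(ch)
    return "".join(result)


raw_name = r"Ano\[th\[er\}\]passage"   # the name exactly as stored in the HTML
twee_name = escape_header_text(raw_name)
print(twee_name)                        # Ano\\\[th\\\[er\\\}\\\]passage
print(unescape_header_text(twee_name) == raw_name)  # True
```

Each literal backslash becomes two and each delimiter gains one, which is why the Twee form shows three backslashes before every bracket or brace.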

greyelf commented 5 years ago

I am seriously not looking forward to having to explain to an Author (writing TWEE) that is having issues with a particular passage name or passage tag why they need to manually change it to something that looks like any of the following.

:: \{S\}napple \]\[e [\[foo\] <bar> \{baz\} xy=zzy &arf "fnord" 'zugzug']

:: Ano\\\[th\\\[er\\\}\\\]passage [\\\{\\\}\\\[\\\]]

[[Ano\[th\[er\}\]passage]]

videlais commented 5 years ago

@tmedwards:

Instances of the escapement character within raw text also need to be escaped to avoid ambiguity.

I agree that should be the intended action. Thank you for adding that clarification to the top of the post.

I noticed the issues with position and size. I've now fixed those in my own code. (I'd been writing it from memory without looking at the current specification. Always important to verify!)

I'm planning on running my code against the output of TweeGo and Twee2 to verify (Twee1 and Twee2) things, but it is coming along nicely.

Examples: How many line breaks go between passages, is whitespace added between the passage header mark (::) and the passage name, what the Twine 2 Story JavaScript and Story Stylesheet sections are named within Twee notation, if empty Twine 2 Story JavaScript and Story Stylesheet sections are written to Twee notation at all, etc.

Exactly. I'll have to look at what you and Twee2 do, but if neither of y'all output empty 'style' and 'script'-tagged passages, I'll just go along with that as well.

I think my code will use "UserStylesheet" and "UserScript" because the Twine Cookbook includes them, but if we want to talk about preferences for names, I can incorporate those. I'm up for changing things toward a new common usage that can match in our compilers as well as in the Cookbook itself when I get back to updating things in a few weeks.


@greyelf:

The Twee1 and Twee2 formats share the same related issues. If you mix in extra meta-characters, it can confuse TweeGo and Twee2 in the same ways.

At least in this case, you can point at the specification and blame a document now.

tmedwards commented 5 years ago

@greyelf I don't expect that the vast majority of Twee authors will write anything that would actually require escapement.

This is primarily for the extreme, edge cases in manually written Twee and for Twee decompiled from a Twine 2 source—mostly the latter, so round trips just work regardless of what crap authors stuff into passage and tag names.

Also. If authors don't want to use the escapement mechanism, they could simply change their passage and/or tag name(s) to remove the metacharacters. Either way, this really should not be that difficult to explain.


@videlais

I'm planning on running my code against the output of TweeGo and Twee2 to verify (Twee1 and Twee2) things, but it is coming along nicely.

Tweego's cloud repo has not been updated in quite a while—and it's not going to be until this flu stops kicking my ass—so its current behavior and codebase isn't going to be overly helpful in this regard.

videlais commented 5 years ago

Do we feel we are ready to convert the checklist into a markdown document? Have we covered everything we think needs to be articulated? And, if so, do we want to explicitly state that anything not mentioned is up to a compiler to decide (e.g. spacing between elements)?

greyelf commented 5 years ago

I don't expect that the vast majority of Twee authors will write anything that would actually require escapement.

That's partly because we have been telling them not to use anything but letters or numbers in the passage names. Now that we are officially telling them that they can use whatever characters / glyphs they like, it may be a different story.

mcdemarco commented 5 years ago

I don't expect that the vast majority of Twee authors will write anything that would actually require escapement.

That's partly because we have been telling them not to use anything but letters or numbers in the passage names. Now that we are officially telling them that they can use whatever characters / glyphs they like, it may be a different story.

I missed that message and have been using any fragment of normal prose, including dialog and punctuation. I imagine the GUI makes it even easier to auto-create crazy passage names from links. The issues with tags and link syntax have probably kept square brackets out of Twine passage names so far, though.

tmedwards commented 5 years ago

Rises from flu coma.


@videlais

Do we feel we are ready to convert the checklist into a markdown document? Have we covered everything we think needs to be articulated? And, if so, do we want to explicitly state that anything not mentioned is up to a compiler to decide (e.g. spacing between elements)?

  1. I think we're ready to start the initial draft, at least.
  2. If not, they'll hopefully come up during the draft.
  3. Sounds fine to me.

@greyelf

Now that we are officially telling them that they can use whatever characters / glyphs they like, it may be a different story.

I'm confused as to why you feel that you need to tell them anything significantly different.

The only thing that's changed is that the optional block metacharacters may be used within passage and tag names, provided they're escaped, without breaking parsing.

There are still plenty of good reasons to recommend avoiding those and other characters within passage and tag names. For example, I'd recommend against using any of the link markup separator metacharacters—e.g., |, ->, <-—within names for obvious reasons, even though that would be perfectly legal.

I see no reason not to continue to advise authors to avoid any and all metacharacters.


Falls back into flu coma.

videlais commented 5 years ago

Here is what I have based on the checklist. I'm thinking we should include examples in parts and one full example at the bottom of the document.


Twee3 Specification

Introduction

Twee is the source code of a Twine story. It is the plain-text equivalent of a Twine HTML file containing only passage information.

Twee3 has been designed to represent Twine2-style HTML structures where data is stored in <tw-storydata> and <tw-passagedata> elements.

File Type

It is recommended Twee files have a .tw or .twee file extension.

Notation

Sections of a Twee file are divided into passages.

Passages are divided into header and passage content.

Passage Header

Each header must be a single line and composed of the following:

Tag Block

Tags are enclosed between opening and closing square brackets. Tags are separated by spaces.
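As an illustration of the tag block rule, here is a hedged Python sketch of a parser. It assumes the backslash escaping described in the Name and Tag Escaping section of this draft; the function name is hypothetical, and treating an escaped space as part of a tag is an assumption the draft does not address.

```python
def parse_tag_block(block: str) -> list[str]:
    """Split a '[tag1 tag2]' block into unescaped tag names."""
    if not (block.startswith("[") and block.endswith("]")):
        raise ValueError("tag block must be enclosed in square brackets")

    tags: list[str] = []
    current: list[str] = []
    escaped = False
    for ch in block[1:-1]:
        if escaped:
            current.append(ch)   # escaped character is taken literally
            escaped = False
        elif ch == "\\":
            escaped = True       # next character is literal
        elif ch == " ":
            if current:          # a space ends the current tag, if any
                tags.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tags.append("".join(current))
    return tags


print(parse_tag_block(r"[\[foo\] <bar> \{baz\}]"))  # ['[foo]', '<bar>', '{baz}']
print(parse_tag_block("[style]"))                   # ['style']
```

Scanning character by character, rather than splitting on spaces first, keeps the escape handling in one place.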

Passage Metadata

Metadata should be properly formatted JSON.

Two optional values include

Passage Content

Content can be any valid UTF-8 text.

Special Passages

StoryTitle

The project's name. Maps to <tw-storydata name>

StoryMetadata

Metadata should be properly formatted JSON.

The story metadata should be pretty-printed—i.e., line-broken and indented. Passage metadata must be inline. Metadata decoding errors should be non-fatal warnings—i.e., discard the metadata and attempt to continue processing.
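The formatting and error-handling rules in the paragraph above could be implemented along these lines. This is a sketch only: the function names and the `where` label are hypothetical, and rejecting JSON values that are not objects is an assumption rather than something the draft states.

```python
import json
import warnings


def encode_story_metadata(data: dict) -> str:
    """Story metadata: pretty-printed, i.e. line-broken and indented."""
    return json.dumps(data, indent=4, ensure_ascii=False)


def encode_passage_metadata(data: dict) -> str:
    """Passage metadata: inline, so the header stays on a single line."""
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


def decode_metadata(raw: str, where: str) -> dict:
    """Decode a metadata chunk; on failure, warn and continue with no metadata."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as err:
        warnings.warn(f"{where}: discarding malformed metadata ({err.msg})")
        return {}
    if not isinstance(data, dict):
        # Assumption: only JSON objects are meaningful as metadata.
        warnings.warn(f"{where}: metadata is not a JSON object; discarding it")
        return {}
    return data


print(encode_passage_metadata({"position": "860,403.5", "size": "100,100"}))
print(decode_metadata('{position="104.5,103"}', "passage Start"))  # {} plus a warning
```

Returning an empty dictionary lets the rest of the compile proceed, which matches the non-fatal behaviour the draft asks for.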

Optional Story Metadata Properties:

Start

The default starting passage. May be overridden by the story metadata or compiler command.

Compiler Tags

Name and Tag Escaping

Within passage headers, passage and tag names must escape the optional tag and metadata block opening/closing metacharacters—i.e., [, ], {, }.

specification.md.txt

greyelf commented 5 years ago

I'm confused as to why you feel that you need to tell them anything significantly different.

Up until now the fact that the Author could use metacharacters and such was more of a side-effect of the current implementations (editor applications, story formats, & compilers) than an actively supported feature. I believe this has now changed because the new specification includes clauses for handling such usage, which in turn encourages those implementing it (editor applications, story formats, & compilers) to support such functionality too.

videlais commented 5 years ago

Other than potentially adding some examples to different parts, do we all (@tmedwards, @mcdemarco , and @greyelf) feel that the current specification document is good? Any changes?

Are we ready to present and/or talk about it during the upcoming async Twine group meeting this month?

greyelf commented 5 years ago

I have no issues with the current consensus being forwarded to the Twine group at the next meeting.

I do suggest adding a concise explanation of exactly how an implementer of the specification determines whether a TWEE project is version 1.x or 3.x, especially when that project is made up of more than a single TWEE file. That way it will be clear to them when to enforce/process TWEE v3 features and when not to.