ubermichael / isetools

Tools for parsing data for the Internet Shakespeare Editions
GNU General Public License v2.0

Feature: auto-closing tags #2

Closed (telic closed this issue 2 years ago)

telic commented 9 years ago

Several tags could be simplified by allowing them to be "auto-closing". In particular, the end of these tags can all be quite easily implied, rather than requiring the editor to explicitly close them:

Self-closing tags could also be made auto-closing (L, TLN, QLN, WLN, BR, RULE, SPACE).

ubermichael commented 9 years ago

I'm not sure about parts of this.

Tags that should be empty (L, TLN, QLN, etc.) are already converted to empty tags by one of the validators, so that's essentially done, but it needs testing. Could you open a separate issue for that?

ubermichael commented 9 years ago

The rules for auto-closing will be more complicated than that. All open tags must be closed at EOF, in the appropriate order.

There are probably more complications.
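
A minimal sketch of the EOF rule, assuming the parser yields a flat list of (kind, name) events. The event shape and the function name `close_at_eof` are hypothetical and not part of isetools. Open tags are tracked on a stack and closed innermost-first once the input runs out:

```python
def close_at_eof(events):
    """Return the events plus the end tags needed to close whatever is still open.

    `events` is a list of (kind, name) pairs, where kind is "start" or "end".
    This event shape is an assumption for illustration, not the isetools model.
    """
    open_tags = []  # stack of currently open tag names, outermost first
    out = []
    for kind, name in events:
        out.append((kind, name))
        if kind == "start":
            open_tags.append(name)
        elif kind == "end" and name in open_tags:
            # Remove the most recently opened tag with this name. Tags are
            # matched by name, so editors may close them in any order.
            idx = len(open_tags) - 1 - open_tags[::-1].index(name)
            del open_tags[idx]
    # At EOF, close the innermost open tag first.
    while open_tags:
        out.append(("end", open_tags.pop()))
    return out
```

For example, an input that opens ACT, SCENE, and S and never closes them would get end tags for S, SCENE, and ACT appended, in that order.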

ubermichael commented 9 years ago

Thinking a little more on this, some tags would need to close automatically at the start of some other tag (ACT auto-closes at the start of an ACT tag), while others need to close at the end of some other tag (S auto-closes at the end of ACT). So however we encode this (it should probably be in the schema somewhere), we need to include start, end, and probably any.
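
Roughly how the start/end/any triggers could be applied, as a sketch only. The (kind, name) event shape, the `rules` dictionary, and the function name `apply_autoclose` are all hypothetical, and treating "any" as "matches both start and end" is my assumption. An implied end tag is treated as a real end event, so it can trigger further closes:

```python
def apply_autoclose(events, rules):
    """Insert implied end tags according to auto-close rules.

    `events` is a list of (kind, name) pairs, kind being "start" or "end".
    `rules` maps a tag name to a set of (trigger_kind, trigger_name) pairs,
    where trigger_kind is "start", "end", or "any". Both shapes are
    assumptions for illustration, not the isetools schema.
    """
    open_tags = []  # stack of currently open tag names
    out = []

    def triggered(open_name, kind, name):
        triggers = rules.get(open_name, ())
        return (kind, name) in triggers or ("any", name) in triggers

    def emit(kind, name):
        if kind == "end" and name in open_tags:
            # Take the tag off the stack first so cyclic rules cannot loop.
            idx = len(open_tags) - 1 - open_tags[::-1].index(name)
            del open_tags[idx]
        # Close, innermost first, every open tag that lists this event.
        for open_name in reversed(list(open_tags)):
            if open_name in open_tags and triggered(open_name, kind, name):
                emit("end", open_name)
        out.append((kind, name))
        if kind == "start":
            open_tags.append(name)

    for kind, name in events:
        emit(kind, name)
    return out


# The two examples from this comment, expressed as rules.
EXAMPLE_RULES = {
    "ACT": {("start", "ACT")},  # ACT closes when the next ACT starts
    "S": {("end", "ACT")},      # S closes when its ACT ends
}
```

With these rules, an input of start ACT, start S, start ACT would come out as start ACT, start S, end S, end ACT, start ACT: the second ACT implies the end of the first, which in turn implies the end of the open S.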

ubermichael commented 9 years ago

Maybe something like this:

<tag name="COL" where="oldspelling">
  ... all the current stuff unchanged ...
  <autoclose tag='start:COL'/>
  <autoclose tag='end:PAGE'/>
</tag>

telic commented 9 years ago

Tags that should be empty (L, TLN, QLN, etc.) are already converted to empty tags by one of the validators

Ah, I didn't notice that! I hadn't actually tried it; I just didn't spot anything in the code that did it on casual inspection :)

telic commented 9 years ago

S would have to close before ACT, SCENE, or DIV for example

Why? I could imagine an editor wanting to create divisions that split a speech, but still retain the fact that it is meant as a single speech. I'm not so sure about the act/scene case, but I'd rather let the editor decide what's appropriate. In either case, we shouldn't require editors to close tags in a specific order.

MODE would be better off as a milestone, I think.

I agree; that's what we're doing in the XML serializer. The closing tag is simply a shortcut for <mode t="uncertain"/>.

ubermichael commented 9 years ago

What is the use case for a single speech that starts in one act or scene and ends in another?

telic commented 9 years ago

In the case of a scene break used as a time-cut in the middle of a speech, it could be desirable to treat the start and end of the elided dialogue as a single speech.

I'm just making stuff up though; my real point is that I'm not certain, so I'd rather let the editor decide. I'm not sure what we gain by forcing the speech to end at a scene division.

telic commented 7 years ago

Why was this closed? Did we make a decision not to do this for some reason? It doesn't look like the schema was ever changed to include auto-closing information, nor were any new validators/transformers created to handle the issue.

ubermichael commented 7 years ago

I have no idea why I closed it.