Proposal: Shadow lines

Introduction

This proposal introduces shadow lines, which allow writers to re-use lines across their game without having to create duplicated line text or assets, while preserving a natural flow for writing conversations in which multiple paths include a common line.

Rationale

There are several circumstances in a conversation where a single line might need to appear in more than one location.

In the following example, the player is given three options. Two of the options result in the player character delivering an identical line:

Bartender: What are you having? #line:bartender_greeting

-> Wine
    Player: A glass of your finest! #line:player_wine
-> Beer
    Player: The cheapest swill you have! #line:player_beer_1
-> Don't care
    Player: The cheapest swill you have! #line:player_beer_2

Bartender: Coming right up! #line:bartender_end

Yarn Spinner uses line IDs to find localised text and assets for each line. Line IDs are required to be unique.

In the preceding example, the two instances of "The cheapest swill you have!" have their own unique line IDs, despite having the same text. The following tables will be produced:

String Table

Line ID	Text
`bartender_greeting`	Bartender: What are you having?
`player_wine`	Player: A glass of your finest!
`player_beer_1`	Player: The cheapest swill you have!
`player_beer_2`	Player: The cheapest swill you have!
`bartender_end`	Bartender: Coming right up!

Metadata Table

Line ID	Metadata
(table contains no entries)

Note that the string table entries player_beer_1 and player_beer_2 contain redundant text. If these lines were voice-acted, then a redundant copy of the recorded audio would need to be provided as well.

There is currently no way to tell Yarn Spinner that it should use the same text or assets for both lines, so the easiest way to handle the situation is to accept a full duplication of the lines. This has consequences for storage, and for production: which of the two lines is the 'canonical' version?

Proposed solution

Shadow lines represent an instruction that an existing line from elsewhere in the dialogue should be run in its place.

A shadow line represents a "copy" of another line. Shadow lines don't have explicit line IDs of their own, and don't represent a new entry in a localised string table. This means that situations where the same content needs to be presented in different contexts can be written in a much more natural way, respecting the flow of the conversation without having to contort the structure to avoid duplication.

For example:

Bartender: What are you having? #line:bartender_greeting

-> Wine
    Player: A glass of your finest! #line:player_wine
-> Beer
    Player: The cheapest swill you have! #line:player_beer
-> Don't care
    Player: The cheapest swill you have! #shadow:player_beer

Bartender: Coming right up! #line:bartender_end

In this example, the first instance of the line player_beer ("The cheapest swill you have!") is a normal line. The second instance of the line is a shadowed copy of the line. When the player chooses the option 'Don't care', the text of the line player_beer is run.

When localising this dialogue, or producing a voice-over script, the following tables are generated:

String Table

Line ID	Text
`bartender_greeting`	Bartender: What are you having?
`player_wine`	Player: A glass of your finest!
`player_beer`	Player: The cheapest swill you have!
`bartender_end`	Bartender: Coming right up!

Metadata Table

Line ID	Metadata
(compiler-generated)	`#shadow:player_beer`

Note that only one instance of the line player_beer exists in the string table, and no entry is produced for the shadow line. A single entry in the metadata table is produced for the shadow line.
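The table generation described above can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the compiler's actual implementation: the `Line` record, the `build_tables` function, and the implicit ID `implicit_0` are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Line:
    """Hypothetical stand-in for a compiled line; not a real Yarn Spinner type."""
    line_id: str                      # explicit ID, or an implicit compiler-generated one
    text: str
    shadow_of: Optional[str] = None   # source line ID if this is a shadow line
    tags: List[str] = field(default_factory=list)


def build_tables(lines):
    """Return (string_table, metadata_table) for a list of compiled lines."""
    string_table = {}
    metadata_table = {}
    for line in lines:
        if line.shadow_of is None:
            # Normal lines get a string table entry.
            string_table[line.line_id] = line.text
            if line.tags:
                metadata_table[line.line_id] = list(line.tags)
        else:
            # Shadow lines are left out of the string table; only their
            # metadata (including the #shadow: tag) is recorded.
            metadata_table[line.line_id] = [f"#shadow:{line.shadow_of}", *line.tags]
    return string_table, metadata_table


lines = [
    Line("bartender_greeting", "Bartender: What are you having?"),
    Line("player_wine", "Player: A glass of your finest!"),
    Line("player_beer", "Player: The cheapest swill you have!"),
    Line("implicit_0", "Player: The cheapest swill you have!", shadow_of="player_beer"),
    Line("bartender_end", "Bartender: Coming right up!"),
]
string_table, metadata_table = build_tables(lines)
```

With the bartender example as input, the string table holds four entries and the metadata table holds a single entry for the shadow line.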

Detailed design

The implementation of this feature occurs in the compiler, the importer, and the line provider (on an engine- and localisation-specific basis).

Yarn Spinner Compiler

A shadow line is a line_statement that has a hashtag beginning with the string #shadow:.

The following rules are enforced for shadow lines:

- The #shadow: hashtag must end with an explicit line ID found elsewhere in the program. This referenced line is referred to as the source line.
- The line_statement must either have the same line_formatted_text as the source line, or be blank (that is, it contains only hashtags, with no text prior to the #shadow: hashtag).
  - The line may have other hashtag entries of its own, and is not required to have the same hashtags as the source line.
- Only a single #shadow: hashtag is permitted for a line.
- The shadow line may not have a #line: hashtag of its own.
  - This means that the compiler will always generate an implicit line ID for shadow lines.
  - This also means that it is not possible to make a shadow line reference another shadow line.
  - The automated line tagger will not consider shadow lines for analysis.

If any of the above rules are violated, an error is generated.
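As a rough illustration of how these rules could be checked, here is a minimal Python sketch. It is not the compiler's implementation: `ParsedLine` and `check_shadow_line` are hypothetical names, hashtags are assumed to be stored without the leading `#`, and text equality is approximated by comparing stripped strings rather than parsed line_formatted_text.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ParsedLine:
    """Hypothetical parsed line_statement: its text and its hashtags (without '#')."""
    text: str
    hashtags: List[str]


def check_shadow_line(line: ParsedLine, explicit_lines: Dict[str, str]) -> List[str]:
    """Return the rule violations for a line; an empty list means it is valid.

    explicit_lines maps every explicit #line: ID in the program to its text.
    Shadow lines never appear in it, so a shadow of a shadow is rejected
    as an unknown source line.
    """
    shadow_tags = [t for t in line.hashtags if t.startswith("shadow:")]
    if not shadow_tags:
        return []  # not a shadow line, so none of these rules apply
    if len(shadow_tags) > 1:
        return ["only a single #shadow: hashtag is permitted"]

    errors = []
    if any(t.startswith("line:") for t in line.hashtags):
        errors.append("a shadow line may not have a #line: hashtag")

    source_id = shadow_tags[0][len("shadow:"):]
    if source_id not in explicit_lines:
        errors.append(f"unknown source line '{source_id}'")
    elif line.text.strip() and line.text.strip() != explicit_lines[source_id].strip():
        # Blank text is allowed; non-blank text must match the source line.
        errors.append("shadow line text differs from source line text")
    return errors


explicit = {"player_beer": "Player: The cheapest swill you have!"}
ok = check_shadow_line(ParsedLine("", ["shadow:player_beer"]), explicit)
bad = check_shadow_line(ParsedLine("Player: Water, please.", ["shadow:player_beer"]), explicit)
```

Here `ok` is empty because a blank shadow line is permitted, while `bad` contains one error because the text does not match the source line.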

If a shadow line is blank, then code editing tools should display the text of the source line as an inlay hint, where possible.

`ysc`

Shadow lines are omitted from the generated string table when exporting a compilation result, and are included in the metadata table, following the example given in the Proposed Solution.

Yarn Spinner for Unity

When receiving the compiled result from the compiler, the importer detects any entries in the returned string table that contain a #shadow: hashtag.

The specific behaviour depends on which localisation tool is used.

Yarn Internal Localisation

When importing using Yarn Internal localisation, shadow lines are not included in the string table, but they are included in the metadata table.

When a line provider is instructed to provide localised content for a line, it queries the metadata table for the provided line ID, and determines if it is a shadow line. If it is, it fetches the string and assets for the corresponding source line, instead of for the requested line.
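The lookup described above might look something like the following Python sketch. The function names, the table shapes, and the implicit ID `implicit_0` are assumptions made for illustration, and the second locale holds a placeholder string rather than a real translation.

```python
SHADOW_PREFIX = "#shadow:"


def resolve_source_id(line_id, metadata_table):
    """Return the ID whose string and assets should be fetched for line_id."""
    for tag in metadata_table.get(line_id, []):
        if tag.startswith(SHADOW_PREFIX):
            return tag[len(SHADOW_PREFIX):]
    return line_id  # not a shadow line: use the requested ID unchanged


def localised_text(line_id, locale, string_tables, metadata_table):
    """Look up text in the given locale, following a shadow mapping if present."""
    return string_tables[locale][resolve_source_id(line_id, metadata_table)]


# The shadow mapping is stored once and is shared by every locale.
metadata_table = {"implicit_0": ["#shadow:player_beer"]}
string_tables = {
    "en": {"player_beer": "Player: The cheapest swill you have!"},
    "xx": {"player_beer": "Player: <placeholder translation>"},
}
```

Because the mapping lives in the metadata table rather than in any one string table, each locale only needs an entry for the source line.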

Unity Localisation

Unity Localisation stores metadata in a variety of locations, depending on context. Metadata can be stored on a localised line, a localised string table, or a string table collection. Because we do not store a string table entry for shadow lines, and shadow line mappings are not localised, the appropriate place to store the mapping is on the string table collection.

When importing using Unity Localisation, shadow lines are not considered when populating a localised string table or a localised asset table. Instead, an entry is added to the shared metadata of the importer's destination string table collection. This entry contains the line ID of the shadow line, the line ID of the source line, and the shadow line's metadata.

When a line provider is instructed to provide localised content for a line, it queries the string table collection's shared metadata for a record indicating that the line is a shadow line. If one is found, the corresponding source line's content is fetched instead of the requested line's.
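To make the shape of that shared-metadata record concrete, here is a small Python model of it. This is only a sketch: `ShadowRecord`, `SharedMetadata`, and `fetch_content` are hypothetical names and do not correspond to Unity Localisation types; the record simply carries the three pieces of information listed above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ShadowRecord:
    """Hypothetical shape of one shared-metadata entry."""
    shadow_line_id: str
    source_line_id: str
    metadata: Tuple[str, ...] = ()   # the shadow line's own metadata


class SharedMetadata:
    """Stand-in for the string table collection's shared metadata."""

    def __init__(self, records):
        self._by_shadow_id = {r.shadow_line_id: r for r in records}

    def find(self, line_id):
        """Return the ShadowRecord for line_id, or None if it is a normal line."""
        return self._by_shadow_id.get(line_id)


def fetch_content(line_id, shared, localised_table):
    """Fetch content for line_id, redirecting to the source line for shadows."""
    record = shared.find(line_id)
    key = record.source_line_id if record else line_id
    return localised_table[key]


shared = SharedMetadata([ShadowRecord("implicit_0", "player_beer", ("#shadow:player_beer",))])
table = {"player_beer": "Player: The cheapest swill you have!"}
```

Keeping the shadow line's metadata on the record means it remains available even though the shadow line has no string table entry of its own.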

Other Engines

The overall guidance for non-Unity engines is: when the VM runs a line, its metadata should be fetched and checked for the presence of a #shadow: tag. If one is present, the appropriate source line's text (and any other content) should be fetched. The shadow line's metadata is used, not the source line's.

Backwards Compatibility

This feature is mostly additive, and does not affect the compilation behaviour of most existing scripts. The exception is any scripts that make use of a #shadow: hashtag of their own, which will break.
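One way a project could check whether it is affected is to scan its scripts for existing uses of the hashtag. The following Python sketch is a naive text scan, not a Yarn parser, so it may also match occurrences inside comments or commands; the function name and the sample script are invented for illustration.

```python
import re

# Matches a '#shadow:' hashtag followed by at least one non-space character.
SHADOW_TAG = re.compile(r"#shadow:\S+")


def find_existing_shadow_tags(script_text):
    """Return (line_number, line) pairs for lines that already use #shadow:."""
    return [
        (number, line.strip())
        for number, line in enumerate(script_text.splitlines(), start=1)
        if SHADOW_TAG.search(line)
    ]


# Invented sample script in which '#shadow:' is already used as custom metadata.
sample = """Guard: Halt! #line:guard_halt
Guard: Who goes there? #shadow:dark
Player: Just me. #line:player_reply
"""
hits = find_existing_shadow_tags(sample)
```

Any hits would need their hashtag renamed before the script is compiled with shadow line support.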

Alternatives considered

I've considered two possible alternatives to this proposed solution: do nothing, or allow lines to share line IDs.

Do Nothing; Use Node Structures Instead

One solution to this problem that's currently possible without any changes to Yarn Spinner is to restructure the dialogue to avoid duplicating the line. For example:

title: Bartender_Start
---
Bartender: What are you having? #line:bartender_greeting

-> Wine
    Player: A glass of your finest! #line:player_wine
    <<jump Bartender_Done>>
-> Beer
    <<jump Bartender_Beer>>
-> Don't care
    <<jump Bartender_Beer>>
===
title: Bartender_Beer
---
Player: The cheapest swill you have! #line:player_beer
<<jump Bartender_Done>>
===
title: Bartender_Done
---
Bartender: Coming right up! #line:bartender_end
===

This structure avoids duplicating the line "The cheapest swill you have!", and its line ID, by keeping a single instance of it in a separate node that can be reached via two paths.

This avoids the problems of duplication, but the writer's flow is made more complex by lifting one of the responses out into its own node. This is harder to read, and more cumbersome to write and modify.

Allow Duplicate Line IDs

Another alternative solution would be to allow lines that have identical content to share line IDs.

In the following example, the line ID player_beer is used in two locations:

Bartender: What are you having? #line:bartender_greeting

-> Wine
    Player: A glass of your finest! #line:player_wine
-> Beer
    Player: The cheapest swill you have! #line:player_beer
-> Don't care
    Player: The cheapest swill you have! #line:player_beer

Bartender: Coming right up! #line:bartender_end

This would remove any requirement for storing a lookup table. However, this creates new problems when dealing with metadata, and for analysis tools.

The compiler automatically creates a #lastline hashtag for any line that immediately precedes an options group, which means that any set of lines with the same line ID would have different metadata if any of them differ in whether or not they immediately precede an options group.

Additionally, there are reasons why a user may wish to use different metadata for multiple instances of a line. A line shown at different points in a conversation will likely be presented in different contexts, and any metadata used to control elements like camera shot selection or character animation may differ. A user, therefore, may wish to have different line metadata.

If line IDs remain the only key, the only ways to represent this would be to either enforce no difference between lines (including metadata), or to track any differences in metadata (which brings us back to needing a separate table, negating the benefit of this approach).
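The conflict can be seen in a small Python sketch. It assumes a metadata table keyed by line ID alone; the function name is hypothetical, and the input imagines a case where one occurrence of player_beer immediately precedes an options group (and so receives #lastline) while the other does not.

```python
def build_metadata_by_line_id(lines):
    """Build a metadata table keyed only by line ID.

    lines is an iterable of (line_id, tags) pairs. Returns (table, conflicts),
    where conflicts lists IDs whose occurrences disagree about their metadata.
    """
    table = {}
    conflicts = []
    for line_id, tags in lines:
        tags = frozenset(tags)
        if line_id in table and table[line_id] != tags:
            # Same key, different metadata: one key cannot hold both.
            conflicts.append(line_id)
        else:
            table[line_id] = tags
    return table, conflicts


lines = [
    ("player_beer", {"lastline"}),   # this occurrence precedes an options group
    ("player_beer", set()),          # same ID, different context
]
table, conflicts = build_metadata_by_line_id(lines)
```

Resolving the conflict means either rejecting the second occurrence or storing its metadata under some other key, which is the separate table this alternative was meant to avoid.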

Acknowledgments

@McJones provided design review for this proposal.

not really part of the proposal but something worth considering for the specifics of the implementation, we could create suggestions for untagged lines that are identical "do you want to make this a shadow line of line x?"

That's a very good idea.

That said... short lines like "yes" or "I should go" might get flagged as potential duplicates too often, and we'd need some way to persist a user selection that they shouldn't be grouped. Perhaps a better solution would be to just offer it as a code action rather than a diagnostic.

yeah I was thinking a suggestion like how xcode has those little light bulbs in the margin and if you click them it goes "did you mean this?" Wow an xcode feature coming in handy, truly strange times.

My first suggestion would be to allow duplicate line IDs as well; it's the most intuitive way to get across that duplicate line IDs are allowed, and having the line metadata be exactly the same won't be a problem in most scenarios. I'm making the assumption that most projects don't have line metadata for every line (apart from line IDs), and in the case where the "shadowed" line needs unique line metadata, it's not the biggest deal to convert it into the #shadow:ID syntax.

The VS Code extension can highlight the edge cases where having a duplicate line ID would cause problems, or maybe just suggest the shadow tag as better practice, but in terms of workflow it's much easier to not think about it too hard and just slap on the same line ID when you know it's the same line.

I also want to ask: would "shadow" lines work across Yarn files the same way variables do? I would also imagine there would be confusion about where in execution order the "shadow" lines need to appear in relation to the source line (as in shadow after source, etc.), similar to the questions around declare. Hence my push to allow duplicate line IDs: in most simple scenarios you won't need to think about it.

On reflection I quite like the idea of just allowing duplicate ids, but I think we can take it one step further. While we do still need to know which line is a shadow line of another for metadata purposes that is purely an implementation detail, not a writing one.

So when the compiler encounters multiple lines with the same ID but different metadata it can essentially pick the ones that are canonical and shadow arbitrarily and build up the string table without this ever being visible to the user. Then shadow lines become a specific implementation detail to link metadata to lines, essentially a foreign key, and we don't need the new hashtag to be defined or ever typed. Alternatively could also do this but make all duplicate lines count as shadows and have all of them have to follow the same multistep lookup for the line, then there is no distinction or unique flow for one line but not the others.

I've been thinking about this for a few days and would like to contribute some thoughts as a new member of this community:

Why not jump? The original proposal leaves out an alternative implementation whereby the compiler allows for a line_id:reference tag, and then jump reference to jump directly to a line instead of any arbitrary node.

This brings up an important question that the original proposal doesn't answer however:

Bartender: What are you having? #line:bartender_greeting

-> Wine
    Player: A glass of your finest! #line:player_wine
-> Beer
    Player: The cheapest swill you have! #line:player_beer_1
       -> Another Option
       -> Another One
-> Don't care
    Player: The cheapest swill you have! #line:player_beer_2

Bartender: Coming right up! #line:bartender_end

In that example, what happens if the player chooses Don't care? Do we fully walk down the option line of Another Option? If we do, then this proposal is just a simplified jump syntax, and I think should be reworked to use jump instead.

If we don't, I would be in favor of some other built in command (for example, << insert player_beer_1>>) rather than using the tags as a kind of linting mechanism to the compiler.

> Why not jump? The original proposal leaves out an alternative implementation whereby the compiler allows for a line_id:reference tag, and then jump reference to jump directly to a line instead of any arbitrary node.

A line jump would work in this example, but the flexibility of the "shadow" tags in general allows for duplicated lines of text to appear without the need to redirect the flow. For example, you can have a character that has a catchphrase, and instead of having the line data be duplicated for translation and VO, it could simply be referenced as a copy. With only a line jump you'd have the awkward conundrum of figuring out a way to jump back to the normal flow every time your character speaks their catchphrase, or just not bother and deal with duplicated data.

That isn't to say having a jump to a specific line isn't useful, and it's a requested feature for Yarn (especially to start the dialogue from a specific line); it just isn't useful for this scenario.

> If we don't, I would be in favor of some other built in command (for example, << insert player_beer_1>>) rather than using the tags as a kind of linting mechanism to the compiler.

This was mentioned on the Discord during a discussion explaining the feature, but it hasn't made its way onto the GitHub discussion. Paraphrasing @/desplesda: you could just put the tag in place of the line instead of needing to type out the entire line, and the compiler will automatically fill in the needed text. So:

#shadow:player_beer

Will display as:

Player: The cheapest swill you have!

But the compiler will throw an error if you do manage to type out different text for the shadow line compared to the source line.

> In that example, what happens if the player chooses Don't care? Do we fully walk down the option line of Another Option? If we do, then this proposal is just a simplified jump syntax, and I think should be reworked to use jump instead.

In this scenario, choosing Don't care would display player_beer_2 then bartender_end; it would not go down the branch of Another Option at all. The dialogue flow is essentially:

graph LR;
    bartender_greeting--Wine-->player_wine;
    bartender_greeting--Beer-->player_beer_1;
    player_beer_1--Another Option-->another_option_line;
    player_beer_1--Another One-->another_one_line;
    bartender_greeting--Don't Care-->player_beer_2;
    player_wine-->bartender_end;
    another_option_line-->bartender_end;
    another_one_line-->bartender_end;
    player_beer_2-->bartender_end;

Right, I saw this conversation. In my opinion, if we're driven to allow bare tags, such as:

#shadow:player_beer

Then I think this should be either a command << shadow player_beer>>, or some new syntax. It seems bizarre to allow for a tag which is tagging...nothing? Because the tag isn't really tagging anything -- it's really informing the compiler that you'd like it to insert a line in that location for you as a copy of another line. In that way, it's really a compiler command. Differentiating that syntactically makes sense but isn't required at all, but imo, this isn't really a "tag" anymore!

If we commit that users must repeat text though, and the purpose of the tag is to combine metadata (as in the original spec), then a tag makes sense. Imo, that's a lot less appealing -- the purpose is to avoid repeating text, not to yell at you for mis-copying (though I suppose that's better than nothing!)

YarnSpinnerTool / YarnSpinner

Proposal: Shadow lines #359