mbakeranalecta / sam

Semantic Authoring Markdown

Variable scoping rules #171

Closed mbakeranalecta closed 6 years ago

mbakeranalecta commented 6 years ago

The current string scoping rules say:

Within a local file, strings are local to the scope in which they are defined.

Within a local file, a later definition of a string overrides an earlier definition.

Strings defined in a fragment insert override the strings defined in the fragment.

Within a local file, a local definition of a string overrides a more global definition of a string.

The application layer is entitled to introduce string definitions from outside the local file and to impose its own rules on precedence for such strings.

But this does not make it clear whether a string definition has to occur before it is referenced. The rules say that a later definition overrides an earlier one, but that does not tell us what to do if a string is redefined after it has been referenced, or what to do if the definition of a string comes after its reference.

It is clearly much easier to implement if the rule is simply that we look back up the tree for a definition. Otherwise it gets complicated to say what happens if a string is defined in an outer scope, redefined in an inner scope, then redefined again in the outer scope.

But there are definitely cases in markup languages, Markdown being an instance, in which strings can be defined after they are referenced. (Markdown commonly does this for URLs, for instance.) Of course, Markdown does not allow scoping of variables.

Ease of implementation matters here because, other than in HTML output mode, the SAM parser does not resolve strings. It passes them to the application layer so that the application layer can implement string lookup over a larger data set than the current page. But this demands that the algorithm be easy to implement at the application layer. Simply looking back up the tree for the first matching definition is easy to implement.
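As a rough illustration of why looking up the tree is cheap for an application layer to implement, here is a minimal Python sketch. The `Scope` class, the `resolve` function, and the `external` dictionary are hypothetical names invented for this example and are not part of the SAM parser. The sketch also ignores where inside a scope a definition sits, which is exactly the question this issue leaves open.

```python
from typing import Dict, Optional


class Scope:
    """A block in the document tree that may hold string definitions (hypothetical)."""

    def __init__(self, parent: Optional["Scope"] = None) -> None:
        self.parent = parent
        self.strings: Dict[str, str] = {}

    def define(self, name: str, value: str) -> None:
        # A later definition in the same scope overrides an earlier one.
        self.strings[name] = value


def resolve(scope: Optional[Scope], name: str,
            external: Optional[Dict[str, str]] = None) -> Optional[str]:
    """Return the first definition found walking from `scope` up to the root."""
    while scope is not None:
        if name in scope.strings:
            return scope.strings[name]
        scope = scope.parent
    # Nothing in the local file: fall back to strings supplied from outside.
    return (external or {}).get(name)


root = Scope()
root.define("str", "One")
section = Scope(parent=root)
section.define("str", "Two")
other = Scope(parent=root)

print(resolve(section, "str"))  # Two (local definition wins)
print(resolve(other, "str"))    # One (found in the enclosing scope)
print(resolve(other, "missing", {"missing": "from the application"}))
```

The fallback to `external` stands in for whatever precedence rules the application layer chooses for strings defined outside the local file.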

mbakeranalecta commented 6 years ago

As currently implemented, this input:

            $str = One
            section:(*section.foo) foo
                $str = Two

                This sentence should end with the word for 2: >[$str]

                $str = Four

            section: bar
                This sentence should end with the word for 1: >[$str]

                $str = Six

            >>>[*section.foo]
                $str = Deux

            $str = Five

Gives this output:

foo
This sentence should end with the word for 2: Two

bar
This sentence should end with the word for 1: Six

foo
This sentence should end with the word for 2: Deux

That is, the middle example (section bar) is finding a definition of the string, Six, that comes after the insert in the same scope.

With this markup:

                section:(*section.foo) foo
                    $str = Two

                    This sentence should end with the word for 2: >[$str]

                    $str = Four

                section: bar
                    This sentence should end with the word for 1: >[$str]

                >>>[*section.foo]
                    $str = Deux

                $str = Five

We get:

foo
This sentence should end with the word for 2: Two

bar
This sentence should end with the word for 1: Five

foo
This sentence should end with the word for 2: Deux

Here the lookup is going back up the tree and finding the definition at a higher level but still after the insert.

In other words, the search order is the current scope from top to bottom, then the next higher scope from top to bottom, and so on.

The alternative would be this scope, from the top to the present node, and then the next higher scope from the top to the parent of the current node, and so on, so that no definition can be found after the point at which the insert occurs.
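The two candidate rules can be compared side by side. The following Python sketch models the second markup sample (without the fragment insert) using a hypothetical `Node` class; `whole_scope` and `preceding_only` are illustrative names, not functions from the SAM code base. For the alternative rule, the sketch assumes that the nearest preceding definition wins.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    kind: str  # "block", "def" or "ref"
    name: str = ""
    value: str = ""
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        child.parent = self
        self.children.append(child)
        return child


def whole_scope(ref: Node, name: str) -> Optional[str]:
    """Each enclosing scope is scanned top to bottom; the first match wins."""
    scope = ref.parent
    while scope is not None:
        for child in scope.children:
            if child.kind == "def" and child.name == name:
                return child.value
        scope = scope.parent
    return None


def preceding_only(ref: Node, name: str) -> Optional[str]:
    """Only definitions that precede the reference or one of its ancestors count."""
    node = ref
    while node.parent is not None:
        siblings = node.parent.children
        position = next(i for i, s in enumerate(siblings) if s is node)
        # Assumption: the nearest preceding definition wins.
        for sibling in reversed(siblings[:position]):
            if sibling.kind == "def" and sibling.name == name:
                return sibling.value
        node = node.parent
    return None


root = Node("block")
foo = root.add(Node("block"))
foo.add(Node("def", "str", "Two"))
foo_ref = foo.add(Node("ref", "str"))
foo.add(Node("def", "str", "Four"))
bar = root.add(Node("block"))
bar_ref = bar.add(Node("ref", "str"))
root.add(Node("def", "str", "Five"))

print(whole_scope(foo_ref, "str"), whole_scope(bar_ref, "str"))        # Two Five
print(preceding_only(foo_ref, "str"), preceding_only(bar_ref, "str"))  # Two None
```

Under the first rule the reference in bar resolves to Five, as in the output shown above. Under the alternative it stays unresolved, because no definition precedes it.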

Which is the better rule? Is there value in allowing people to define a string value after the point at which it is used in document order, or does this make the lookup order harder to understand, potentially leading to unintended consequences?

Alternatively, is not allowing the definition of a string after it is referenced an unnecessary and confusing restriction? We should note that there is no restriction that a reference by ID or name (or key) must occur before the insert. Of course, there can only be one object with an ID, and currently there is no defined behavior for resolving duplicate names. (I think the general intent is that names should be unique across the content set, but there is no intent to enforce that, so maybe the same scoping rules should be considered.)

mbakeranalecta commented 6 years ago

An added wrinkle. With the current implementation, the following:

            markup:
                $str = One
                $str = Un
                section:(*section.foo) foo
                    $str = Two

                    This sentence should end with the word for 2: >[$str]

                    $str = Four

                section: bar
                    This sentence should end with the word for 1: >[$str]

                >>>[*section.foo]
                    $str = Deux

                $str = Five

Produces:

foo
This sentence should end with the word for 2: Two

bar
This sentence should end with the word for 1: One

foo
This sentence should end with the word for 2: Deux

The issue here is that the second example (section bar) is resolving to One even though $str was redefined to Un closer to the insert statement.

This would definitely be unexpected. This seems to argue strongly for a rule that says that the definition of a string must come before it is referenced. The alternative would be to search up the tree and then, if no definition was found, search down from the same starting point. This seems too complicated.

mbakeranalecta commented 6 years ago

Need to document the difference between searching up the tree and searching in reverse document order.

Should also note the difference between string and annotation lookup, since annotation lookup is reverse document order. Also explain why they are different. (Annotations are not dependent on document structure.)
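To make the distinction concrete, the sketch below flattens the last markup sample (without the fragment insert) into a list of `(depth, kind, value)` tuples in document order and runs both searches from the reference in section bar. The flat representation, the function names, and the restriction to a single string name are simplifying assumptions made for this illustration, not the parser's actual data model.

```python
from typing import List, Optional, Tuple

Item = Tuple[int, str, str]  # (nesting depth, kind, value)

items: List[Item] = [
    (0, "block", "markup"),
    (1, "def", "One"),
    (1, "def", "Un"),
    (1, "block", "foo"),
    (2, "def", "Two"),
    (2, "ref", ""),
    (2, "def", "Four"),
    (1, "block", "bar"),
    (2, "ref", ""),      # index 8: the reference inside section bar
    (1, "def", "Five"),
]


def up_the_tree(items: List[Item], ref_index: int) -> Optional[str]:
    """Nearest preceding definition among siblings, then among each ancestor's siblings."""
    level = items[ref_index][0]
    for depth, kind, value in reversed(items[:ref_index]):
        if depth > level:
            continue  # inside an earlier sibling block, so out of scope
        level = depth  # same level, or stepping up to the parent
        if kind == "def":
            return value
    return None


def reverse_document_order(items: List[Item], ref_index: int) -> Optional[str]:
    """Nearest preceding definition anywhere, regardless of nesting."""
    for _depth, kind, value in reversed(items[:ref_index]):
        if kind == "def":
            return value
    return None


print(up_the_tree(items, 8))             # Un
print(reverse_document_order(items, 8))  # Four
```

Searching up the tree skips the definitions inside section foo because they belong to a sibling scope. Reverse document order ignores structure and picks up Four, the last definition that appears before the reference.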

mbakeranalecta commented 6 years ago

Searching up the tree is now implemented, and has been since before 3e9b8f6fd8cddf9cbedb25c44ab48323216ce71e.

mbakeranalecta commented 6 years ago

Documented in 0b71073284a0e079833263af0c4cadc38595bab7.