mbakeranalecta / sam

Semantic Authoring Markdown

Should serialization avoid attributes to allow for inserting additional structure? #119

Closed mbakeranalecta closed 6 years ago

mbakeranalecta commented 7 years ago

There are a number of places where the current serialization inserts information in the form of attributes. For instance, a citation uses <citation type='citation' value="Melville, 1851">. Because an attribute value cannot contain elements, this precludes letting the architect use a regular expression (RE) in the schema to break the value up into pieces, such as separating author and date.

This could be done like this:

<citation type="citation">
    <citation-value>
        <author-name>Melville</author-name>
        <year>1851</year>
    </citation-value>
</citation>

The problem with using elements is that they could conflict with elements created by the document itself, but this should not be a problem with citations unless the writer decides to do this:

"""[Melville, 1851]
    citation:
        citation-value:
            author-name: Melville
            year: 1851

But this seems unlikely, and it would still be different because the generated citation would have the type attribute, which the authored version could not have.
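
To make the idea concrete, here is a minimal Python sketch of how a schema-supplied regular expression could split a citation value into child elements. It is not part of SAM: the `CITATION_PATTERN` regex, the `expand_citation` function, and the use of `xml.etree.ElementTree` are all assumptions made for illustration. Only the element names are taken from the example above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern an architect might attach to citations in a schema.
# Group names use underscores because Python group names cannot contain hyphens.
CITATION_PATTERN = re.compile(r"(?P<author_name>[^,]+),\s*(?P<year>\d{4})")


def expand_citation(value: str) -> ET.Element:
    """Build a <citation> element, splitting the value if the pattern matches."""
    citation = ET.Element("citation", {"type": "citation"})
    match = CITATION_PATTERN.fullmatch(value.strip())
    if match is None:
        # No match: keep the value as plain text content.
        citation.text = value
        return citation
    wrapper = ET.SubElement(citation, "citation-value")
    ET.SubElement(wrapper, "author-name").text = match.group("author_name").strip()
    ET.SubElement(wrapper, "year").text = match.group("year")
    return citation


print(ET.tostring(expand_citation("Melville, 1851"), encoding="unicode"))
```

With the value held as element content rather than as an attribute, the expansion has somewhere to put the new structure. An attribute value could only ever hold the unsplit string.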

mbakeranalecta commented 7 years ago

Actually, since there is no way to create arbitrary attributes in SAM, the use of an attribute will always distinguish a SAM generated structure from an explicitly authored one.

mbakeranalecta commented 7 years ago

Actually, we may not need anything so elaborate as in the first instance. Citations are either text or a nameref, idref, or keyref. If they are text then rendering them as:

<citation>Melville 1851</citation> 

is perfectly adequate and fully distinct from the ref versions. It also leaves room for the pattern feature to expand this into:

 <citation>
    <author-name>Melville</author-name>
    <year>1851</year>
 </citation>

The writer can of course do this:

 """[Melville, 1851]
    citation:
        author-name: Melville
        year: 1851

Which will also serialize as:

 <citation>
    <author-name>Melville</author-name>
    <year>1851</year>
 </citation>

But so what? The schema still sees the same structure either way, as does the application layer, so no harm is done.

mbakeranalecta commented 7 years ago

It is not clear to me that moving to elements for Attributes and Annotations gains us anything. The only cases that occur to me are conditions and the specifically attribute.

If the specifically attribute were eligible to be a pattern, then the value of a phrase would be eligible too. This makes some sense. You could do something like:

{12:30}(time) 

And have it produce:

<phrase><annotation type="time"><hour>12</hour><minute>30</minute></annotation></phrase>

But then what do you do with

{half past noon}(time "12:30")

Normally that would serialize to:

<phrase><annotation type="time" specifically="12:30">half past noon</...

You would have to have a completely different serialization of specifically to make this work. (Shortcutting the annotation markup here, since that is a planned option.)

<phrase><time><specifically><hour>12</hour><minute>30</minute></specifically>half past noon</...

But what does that look like if you have nested annotations?

<phrase><time><specifically><hour>12</hour><minute>30</minute></specifically><bold>half past noon</...

One of the features of attributes in XML processing is that they are ignored by default, whereas elements are processed by default, so if you serialize specifically as an element, the programmer has to write a rule to explicitly ignore it. Not the end of the world, but a complication.
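
The difference can be sketched in Python with `xml.etree.ElementTree`, using text extraction as a stand-in for default processing. This is only an analogy for what a stylesheet's default rules do. The closing tags on the two sample strings are filled in here for illustration, and `default_text` and `text_skipping` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# "specifically" serialized as an attribute (closing tags assumed for illustration).
as_attribute = ET.fromstring(
    '<phrase><annotation type="time" specifically="12:30">'
    "half past noon</annotation></phrase>"
)

# "specifically" serialized as an element (closing tags assumed for illustration).
as_element = ET.fromstring(
    "<phrase><time><specifically><hour>12</hour><minute>30</minute></specifically>"
    "half past noon</time></phrase>"
)


def default_text(element: ET.Element) -> str:
    """Collect all text content; attribute values never appear here."""
    return "".join(element.itertext())


def text_skipping(element: ET.Element, skip: set) -> str:
    """Collect text content, explicitly ignoring elements whose tag is in skip."""
    parts = [element.text or ""]
    for child in element:
        if child.tag not in skip:
            parts.append(text_skipping(child, skip))
        # Tail text belongs to the parent, so keep it even for skipped children.
        parts.append(child.tail or "")
    return "".join(parts)


print(default_text(as_attribute))                    # half past noon
print(default_text(as_element))                      # 1230half past noon
print(text_skipping(as_element, {"specifically"}))   # half past noon
```

The attribute form yields the writer's text with no extra work. The element form leaks the clarification into the output until the programmer adds the explicit skip rule.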

mbakeranalecta commented 6 years ago

It seems to me on reflection that there is little point in supporting patterns for the specifically attribute. The role of patterns is to enforce a particular style of data entry, and if you want to enforce a style of data entry on the writer, you do it on the original text, not the clarification of that text. The specifically attribute exists to allow clarification of informal expressions of an idea, and if you are insisting on a particular type, you are already outlawing informal expressions. Ergo, applying patterns to the specifically attribute is moot.

mbakeranalecta commented 6 years ago

I'm not finding sufficient reason to make this change, so closing.