mbakeranalecta / sam

Semantic Authoring Markdown

Conditions on a record entry? #201

Open 0x8000-0000 opened 4 years ago

0x8000-0000 commented 4 years ago

This is similar to #38. For our use case, I'd like to be able to attach conditions to an entire "row" of a recordset.

recipe:: amount, ingredient
    (?vegan) 50ml, oatmilk
    (?non-vegan) 50ml, half-and-half

How should the record-level condition be separated from the first field of the record?

mbakeranalecta commented 4 years ago

Interesting proposal. The difficulty I see is that the attribute markup (? is not a markup start sequence in itself; it relies on some other markup indicator, which it then modifies. But, unlike the cases discussed in #38, there is no markup start sequence for a record; it is indicated purely by context. Parentheses are such a common feature of text that using them as a markup start sequence would be a major headache. Imagine a recordset in which parens were used to indicate negative values, for instance.

Another approach to this use case would be to build the condition into the semantics of the recordset itself. Thus either:

recipe:: vegan, amount, ingredient
    yes, 50ml, oatmilk
    no, 50ml, half-and-half

or

recipe:: condition, amount, ingredient
    vegan, 50ml, oatmilk
    non-vegan, 50ml, half-and-half
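
To make the second variant concrete, here is a minimal Python sketch of how a processing layer might select rows by an ordinary `condition` field. It is not SAM's parser or API: the function names `parse_recordset` and `select` are hypothetical, and the naive comma split assumes no escaped commas inside field values.

```python
def parse_recordset(header: str, rows: list[str]) -> list[dict[str, str]]:
    """Turn 'name:: f1, f2' plus comma-delimited rows into dicts.

    Hypothetical helper for illustration; real SAM parsing is richer.
    """
    _, _, field_part = header.partition("::")
    fields = [f.strip() for f in field_part.split(",")]
    records = []
    for row in rows:
        values = [v.strip() for v in row.split(",")]
        if len(values) != len(fields):
            raise ValueError(f"expected {len(fields)} fields: {row!r}")
        records.append(dict(zip(fields, values)))
    return records


def select(records: list[dict[str, str]], condition: str) -> list[dict[str, str]]:
    """Keep rows whose 'condition' field matches, dropping that field."""
    return [
        {k: v for k, v in rec.items() if k != "condition"}
        for rec in records
        if rec["condition"] == condition
    ]


recipe = parse_recordset(
    "recipe:: condition, amount, ingredient",
    ["vegan, 50ml, oatmilk", "non-vegan, 50ml, half-and-half"],
)
vegan_rows = select(recipe, "vegan")
```

Here the condition is just data: the markup gains no new syntax, and the meaning of the `condition` column lives entirely in the code that consumes the records.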

This is one of the fundamental compromises of markup design. A system like XML that is (almost) context free and where markup indicators are (almost) universal tends to verbosity and unnatural constructs. A system like SAM and other lightweight markup languages that rely heavily on context can be much lighter and more natural, but can't recognize markup in all locations, only those where the context permits.

0x8000-0000 commented 4 years ago

Your approach is what I'm prototyping, but it feels odd that some conditions are attributes, promoted to HTML attributes on rendering, while other conditions would be "plain text".

It could be made to work with XML and some amount of XSLT massaging, but I was hoping for something that works "out of the box" with HTML and pure CSS.

mbakeranalecta commented 4 years ago

After 25 years in the markup field, I have come to the conclusion that there are going to be things that feel odd about every system that is at all general. The whitespace rules in XML feel exceedingly odd, for instance, when you are writing documents. They are necessary to avoid ambiguity, but they feel odd for many use cases.

The converse of the oddity here is that it would feel odd to have to escape an opening parenthesis that happened to be the first character in the first field of a record.

Markup design is largely about deciding which oddities are easiest to live with.

Part of the problem with trying to create markup that is easy to write is that natural writing uses markup for some structures (bullets for lists, for instance) and mere whitespace for others (paragraphs, indents). So a markup based on natural writing conventions has markup characters on which to hang attributes in some circumstances and not in others. That creates the oddity you mention, but doing it any other way creates a different set of oddities.

Also, design wise, the direct HTML output from SAM is really an afterthought. The main design intent was to produce a fully semantic markup language that would go through post-processing to produce multiple outputs. So the HTML output is not intended as a general solution. There are lots of things for which the design expectation, even with the tool as it is now, is that you would write an XSLT script to transform the SAM structures to HTML or other output formats.

Direct HTML output, then, is the poor stepchild in the SAM design. Indirect output via an XSLT (or other language) transformation is the "normal" output method for all formats, including HTML. The design intent is very much to make the markup itself as natural as possible and to have complex formatting problems solved on the backend by processing the semantics captured in the markup.

0x8000-0000 commented 4 years ago

> Markup design is largely about deciding which oddities are easiest to live with.

Mark,

Thank you very much for the thoughtful and insightful reply!

I guess I'll bite the bullet and get going with some XSLT.

florin

0x8000-0000 commented 4 years ago

What if we add a field for condition in the record definition?

recipe:: (?), amount, ingredient
    (?vegan), 50ml, oatmilk
    (?non-vegan), 50ml, half-and-half

or

recipe:: amount, ingredient, (?)
    50ml, oatmilk, (?vegan)
    50ml, half-and-half, (?non-vegan)

This way, we can always check that we have as many fields as column definitions, and we allow the user to specify which column contains the condition.

(?) would indeed be "magic syntax", but just as magic as :: . At least it would be quite explicit what's going on.
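
A minimal Python sketch of the check described above, assuming the proposed `(?)` placeholder in the header marks the condition column. Nothing here is existing SAM behavior: `parse_with_condition_column` is a hypothetical name, the comma split is naive, and treating a bare `(?)` in a row as "no condition" is an assumption the proposal does not spell out.

```python
PLACEHOLDER = "(?)"  # proposed header marker for the condition column


def parse_with_condition_column(header: str, rows: list[str]) -> list[dict]:
    """Parse rows, pulling the condition out of the declared column."""
    _, _, field_part = header.partition("::")
    fields = [f.strip() for f in field_part.split(",")]
    if fields.count(PLACEHOLDER) > 1:
        raise ValueError("at most one condition column may be declared")
    cond_index = fields.index(PLACEHOLDER) if PLACEHOLDER in fields else None
    data_fields = [f for f in fields if f != PLACEHOLDER]

    records = []
    for row in rows:
        values = [v.strip() for v in row.split(",")]
        # Every row must have exactly as many fields as the header declares.
        if len(values) != len(fields):
            raise ValueError(f"expected {len(fields)} fields: {row!r}")
        condition = None
        if cond_index is not None:
            marker = values.pop(cond_index)
            if not (marker.startswith("(?") and marker.endswith(")")):
                raise ValueError(f"not a condition: {marker!r}")
            # Assumption: an empty '(?)' means the row is unconditional.
            condition = marker[2:-1] or None
        records.append({"condition": condition,
                        "fields": dict(zip(data_fields, values))})
    return records


trailing = parse_with_condition_column(
    "recipe:: amount, ingredient, (?)",
    ["50ml, oatmilk, (?vegan)", "50ml, half-and-half, (?non-vegan)"],
)
```

Because the header fixes the position of the condition, the same function handles the leading-column variant without change.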

mbakeranalecta commented 4 years ago

That might solve the mechanical issue of markup recognition, but, again, the design intent is to keep SAM simple and to capture semantics in the data to meet a variety of use cases. So many markup systems have foundered because they added more and more features to the markup, to the point where it becomes harder and harder to parse with the eye. This proposal is a step down that road. A small step, perhaps, but all the individual steps are small. It is the sum of them that creates the problems. To prevent going too far down that road, the design principle has to be to solve most use cases through the interpretation of semantics in the processing layer, not the addition of syntax to the markup layer. (There is a reason Markdown is popular: its stark simplicity. SAM is about being as simple as possible while still being able to capture semantics to enable a much wider set of use cases than Markdown.)

A record in SAM is just that, a database record, a piece of comma-delimited tabular data embedded in the markup. The semantics of that record belong in the data dictionary for that record, not in the markup. That is the approach that preserves both the simplicity of SAM markup and the flexibility to define semantics for multiple use cases.

0x8000-0000 commented 4 years ago

Perhaps in time I'll come to understand the wisdom of your position, but for now I am too bothered by the inconsistency between block conditions and record-level conditions, so I'll take a stab at implementing this proposal to get a feel for what it looks like and how it affects adoption in our team.

mbakeranalecta commented 4 years ago

Sure. Sometimes the best design is not the purest design. (And sometimes one person's idea of what constitutes purity in design is different from another's.) Trying different variants in the field is a good way to test things. (True, fragmentation could be a problem down the road, but first you have to get people moving down the road before any of that is going to matter.)

If you do this, though, please think about what the XML output is going to look like, and what the internal SAM content model is going to look like, not just the HTML output.

0x8000-0000 commented 4 years ago

> If you do this, though, please think about what the XML output is going to look like, and what the internal SAM content model is going to look like, not just the HTML output.

Yes, very much so.

The intent is to use SAM documents as the source of truth and generate XML, which will be transformed into DocBook and merged with other documentation into final PDFs. The HTML representation is just for quick visualization. I am also interested in a solid and coherent data model because we'll be generating code (data structures, enumerations, and validation conditions) from the elements stored in the tables.

mbakeranalecta commented 4 years ago

Let me suggest a different syntax for this. If you can implement it successfully, I would be willing to accept the following as canonical.

The issue is that there is no markup to attach the attribute to at the beginning of a record. One solution to that is to add some markup for that purpose. Fortunately, the recordset structure actually declares the markup for a record in the header (and, at one point, I contemplated allowing the writer to specify the separator character in the header). Building on this idea, we could allow the header to declare an opening markup character for each row:

recipe:: , amount, ingredient
    ,(?vegan) 50ml, oatmilk
    ,(?non-vegan) 50ml, half-and-half
    , 1 cup, flour
    , (5 ml), arsenic

With this syntax, the attribute is unambiguously an attribute. No case can arise in which a value with a parenthesis would have to be escaped. It is also backward compatible with all existing documents.

If the leading comma is specified in the header it is required on all rows. If it is not specified in the header, it must not occur in the record. There must be a space between the leading comma and the beginning of the first field value.
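
The rules above can be sketched in Python as follows. This is one reading of the proposal, not an implementation from the SAM code base: `parse_leading_comma_recordset` is a hypothetical name, the comma split is naive, and the sketch assumes an attribute is recognized only when `(?` immediately follows the leading comma.

```python
import re

# Assumption: an attribute has the form '(?name)' with no nested parens.
ATTRIBUTE = re.compile(r"\(\?([^)]*)\)")


def parse_leading_comma_recordset(header: str, rows: list[str]) -> list[tuple]:
    """Return (condition, fields) pairs for each row of the recordset."""
    _, _, field_part = header.partition("::")
    field_part = field_part.strip()
    leading = field_part.startswith(",")  # header declares the row marker
    if leading:
        field_part = field_part[1:]
    fields = [f.strip() for f in field_part.split(",")]

    records = []
    for raw in rows:
        row = raw.strip()
        condition = None
        if leading:
            if not row.startswith(","):
                raise ValueError(f"leading comma required: {raw!r}")
            rest = row[1:]
            match = ATTRIBUTE.match(rest)  # only directly after the comma
            if match:
                condition = match.group(1)
                rest = rest[match.end():]
            if not rest.startswith(" "):
                raise ValueError(f"space required before first value: {raw!r}")
            row = rest
        elif row.startswith(","):
            raise ValueError(f"leading comma not declared in header: {raw!r}")
        values = [v.strip() for v in row.split(",")]
        if len(values) != len(fields):
            raise ValueError(f"expected {len(fields)} fields: {raw!r}")
        records.append((condition, dict(zip(fields, values))))
    return records


recipe = parse_leading_comma_recordset(
    "recipe:: , amount, ingredient",
    [
        "    ,(?vegan) 50ml, oatmilk",
        "    ,(?non-vegan) 50ml, half-and-half",
        "    , 1 cup, flour",
        "    , (5 ml), arsenic",
    ],
)
```

Under this reading, `(5 ml)` in the last row stays an ordinary field value because a space separates it from the leading comma, so no escaping is needed.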

I'd still encourage people to build conditional rows into the semantics rather than using syntax like this, but this approach gives the option to use syntax without adding any complexity or gotchas for users who choose not to use it.

0x8000-0000 commented 4 years ago

> Let me suggest a different syntax for this. If you can implement it successfully, I would be willing to accept the following as canonical.

Deal!