Proposal on traceability

konnov commented 4 years ago

This is the first version of the proposal. It badly needs comments and thought :-)

shonfeder commented 4 years ago

Noting the relation to #4

josef-widder commented 4 years ago

I like the idea a lot and I agree that it can be done by hand, but that would be quite tedious. I think amazing things could be done with a tagging tool: In the artifact we could just mark places where a tag should go with "[]", then we run the tool, and it asks us for each mark, whether that references to something else, and you can just choose from a menu. As I noted in a suggestions, the tool could also alert changes, and asks us what to do.

konnov commented 4 years ago

We should distinguish between tag declaration and tag reference. That is why I reserved the square brackets for a tag declaration and simply a tag name for a tag reference. That was probably a mistake. If we replace FOO.1 with [FOO.1](...), then [FOO.1] looks like a tag declaration. Maybe, we should use [!FOO.1] for a tag declaration and [FOO.1] for a tag reference.

josef-widder commented 4 years ago

We should distinguish between tag declaration and tag reference. That is why I reserved the square brackets for a tag declaration and simply a tag name for a tag reference. That was probably a mistake. If we replace FOO.1 with [FOO.1](...), then [FOO.1] looks like a tag declaration. Maybe, we should use [!FOO.1] for a tag declaration and [FOO.1] for a tag reference.

My writeup was not clear: In my viewpoint, a declaration [X.1::Y.2] is both, (i) a declaration of X.1::Y.2 and (ii) a reference to X.1. So this dependency to another tag the tool could query.

konnov commented 4 years ago

My writeup was not clear: In my viewpoint, a declaration [X.1::Y.2] is both, (i) a declaration of X.1::Y.2 and (ii) a reference to X.1. So this dependency to another tag the tool could query.

Ah, true. The syntax immediately gives us the parent. This make the tool's job easier.

thanethomson commented 4 years ago

This is excellent! :tada:

It sounds like we're going to need find some existing tooling, or perhaps build some new tooling, to facilitate this. Mind if I start a proposal along these lines and submit it to this repo as a separate PR to start capturing those practical considerations? My sense is that we'll need to have some back and forth between this proposal and the tooling considerations.

shonfeder commented 4 years ago

We should distinguish between tag declaration and tag reference. That is why I reserved the square brackets for a tag declaration and simply a tag name for a tag reference. That was probably a mistake. If we replace FOO.1 with FOO.1, then [FOO.1] looks like a tag declaration. Maybe, we should use [!FOO.1] for a tag declaration and [FOO.1] for a tag reference.

Thinking about alternative tagging notation, I thought I'd add some notes on existing markdown conventions. We shouldn't confine ourselves to those conventions, of course, but they might be a source of inspiration and, to the extent that we do use those conventions, we'll get a fair bit of tooling for free.

In markdown (as parsed by pandoc) [FOO.1][] is an implicit reference link, for which the anchor can be supplied by

[FOO.1]: url/here

somewhere else in the file. And [FOO.1] is an implicit reference to an anchor #foo.1 on the same page.

Anchors are automatically generated for any headings, and can be explicitly provided for inline text. So, for example,

# Spec Title

## FOO.1

Requirements for foo.

## BAR.1

Requirements for bar. With a reference to [FOO.1].

We use pandoc test.md --standalone --section-divs --toc -o test.html to get

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
    <head>
        <meta charset="utf-8" />
        <meta name="generator" content="pandoc" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
        <title>test</title>
        <style>
         code{white-space: pre-wrap;}
         span.smallcaps{font-variant: small-caps;}
         span.underline{text-decoration: underline;}
         div.column{display: inline-block; vertical-align: top; width: 50%;}
         div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
         ul.task-list{list-style: none;}
        </style>
        <!--[if lt IE 9]>
            <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
        <![endif]-->
    </head>
    <body>
        <nav id="TOC" role="doc-toc">
            <ul>
                <li><a href="#spec-title">Spec Title</a>
                    <ul>
                        <li><a href="#foo.1">FOO.1</a></li>
                        <li><a href="#bar.1">BAR.1</a></li>
                    </ul></li>
            </ul>
        </nav>
        <section id="spec-title" class="level1">
            <h1>Spec Title</h1>
            <section id="foo.1" class="level2">
                <h2>FOO.1</h2>
                <p>Requirements for foo.</p>
            </section>
            <section id="bar.1" class="level2">
                <h2>BAR.1</h2>
                <p>Requirements for bar. With a reference to <a href="#foo.1">FOO.1</a>.</p>
            </section>
        </section>
    </body>
</html>

Which renders (without any styling) as

2020-03-26-095101_891x720_scrot

The upshot: if we put tags in # denoted headings preceding the logical unit and use [TAG] for local reference and [TAG][] for reference to tags in other files, then we could get internal cross-referencing and tables of contents for free using existing markdown conventions. Cross-document references would only require appending [TAG]: file/name.md#tag to the file, which is easy enough to do manually, and easy and non-invasive to add via a simple automation.

shonfeder commented 4 years ago

re: markdown conventions, it might also be worth considering definition lists for associating tags with logical units.

josef-widder commented 4 years ago

Thinking about alternative tagging notation, I though I'd add some notes on existing markdown conventions. We shouldn't confine ourselves to those conventions, of course, but they might be a source of inspiration and, to the extent that we do use those conventions, we'll get a fair bit of tooling for free.

I totally agree. With Ilina, we added such hyperlinks to specs, e.g., https://github.com/informalsystems/VDD/blob/master/lightclient/verification.md although we did not adhere to the syntax Igor suggested. But already while editing and fixing these specifications, these links were very helpful for me. What is not clear to me, how we can maintain such links over branches etc. I had the problem that I wanted reference something in a different branch. But perhaps this is not good practice anyhow...

konnov commented 4 years ago

In markdown (as parsed by pandoc) [FOO.1][] is an implicit reference link, for which the anchor can be supplied by...

I like that! For markdown, the tool could just fix the references in a designated area at the bottom of the page. For the source code and TLA+, we would probably still inline links in the comments. Github markdown cheat sheet mentions another syntax: one either uses [FOO.1][1] or just [FOO.1]. How compatible are different markdown dialects?

konnov commented 4 years ago

I had the problem that I wanted reference something in a different branch. But perhaps this is not good practice anyhow...

If you have two different versions of a requirement in different branches, these versions should have different revision numbers 😎

konnov commented 4 years ago

I have updated the proposal with the new syntax: tags are declared with !TAG! and referenced with [TAG]. By doing so we stay compatible with markdown and do not mix tag declarations with references. If you are okay with this, i will merge the PR.

thanethomson commented 4 years ago

I have updated the proposal with the new syntax: tags are declared with !TAG! and referenced with [TAG]. By doing so we stay compatible with markdown and do not mix tag declarations with references. If you are okay with this, i will merge the PR.

I have some questions and comments about the proposed syntax:

So if I understand it correctly, defining a tag at the beginning of a paragraph effectively attaches that tag to all paragraphs under the same heading (or until another tag is defined)?
Can we allow tag definitions in the headings/sub-headings, and let the hierarchy naturally emerge? See my example below.
Why exclamation marks specifically? Would you consider something less visually complex, like forward-slashes / or pipe symbols | for defining tags? e.g. /TAG/ or |TAG|
I really like the idea of being able to tag things pretty much anywhere throughout the document. This is a far more natural way of tagging requirements (imho).

# Traceability |TRC|

...

## 3. Traceability properties

1. |TAG.1| Tagging a logical unit should be easy.

...

## 4. Tag syntax |TAG.1|

|SYNTAX.1| We propose a simple naming scheme for tags. We start with
the tags for top-level requirements and then proceed with the tags of the
logical units that implement higher-level requirements.
...

Perhaps we can refine the reuse of the "definition" in heading 4 above to use a different syntax. What it's indicating is that any tags defined underneath this heading should be prefixed with the tag specified in the heading (unless the tag definitions in the paragraph use their fully qualified path).

P.S. I'm not 100% sure if the pipe symbol will work in all Markdown flavours.

shonfeder commented 4 years ago

In markdown (as parsed by pandoc) [FOO.1][] is an implicit reference link, for which the anchor can be supplied by...

I like that! For markdown, the tool could just fix the references in a designated area at the bottom of the page. For the source code and TLA+, we would probably still inline links in the comments. Github markdown cheat sheet mentions another syntax: one either uses [FOO.1][1] or just [FOO.1]. How compatible are different markdown dialects?

Sorry I missed this last week. I believe the compatibility of different markdown dialects varies widely depending on the feature being used. I wasn't aware of GitHub markdown'ss support for [foo] as an implicit reference, but that's even nicer :)

konnov commented 4 years ago

I like the pipe syntax |TAG| much better than exclamation marks!

konnov commented 4 years ago

Perhaps we can refine the reuse of the "definition" in heading 4 above to use a different syntax. What it's indicating is that any tags defined underneath this heading should be prefixed with the tag specified in the heading (unless the tag definitions in the paragraph use their fully qualified path).

That would make sense in markdown, but it will be hard to enforce in source-code comments. It is still better to keep tags fully qualified.

thanethomson commented 4 years ago

That would make sense in markdown, but it will be hard to enforce in source-code comments. It is still better to keep tags fully qualified.

Sorry for not clarifying: I was only talking about the Markdown 😁 I was just thinking it'd be more ergonomic and make writing and updating human-language specs less tedious, but references from the code should probably always be fully qualified.

thanethomson commented 4 years ago

re: markdown conventions, it might also be worth considering definition lists for associating tags with logical units.

@shonfeder, can you reference these tags in definition lists? I can't find anything on that in the pandoc documentation.

shonfeder commented 4 years ago

can you reference these tags in definition lists? I can't find anything on that in the pandoc documentation.

Sorry, I think the wording was unclear. What I meant was that, if we are talking about defining tags, then a natural markup (perhaps the most natural markup, semantically) would be definition lists. That would look like

TAG.1

: Tagging a logical unit should be easy.

SYNTAX.1

: We propose a simple naming scheme for tags. We start with
the tags for top-level requirements and then proceed with the tags of the
logical units that implement higher-level requirements.

Part of the rational here is we don't just want to flag the tag as being defined, we also want to semantically encode which part of the following text makes up the definition itself!

Pandoc, with vanilla configs, renders the above as

<dl>
<dt>TAG.1</dt>
<dd><p>Tagging a logical unit should be easy.</p>
</dd>
<dt>SYNTAX.1</dt>
<dd><p>We propose a simple naming scheme for tags. We start with the tags for top-level requirements and then proceed with the tags of the logical units that implement higher-level requirements.</p>
</dd>
</dl>

so we'd have to do some post processing to add anchors to the dt elements. But the nice things is that its crystal clear from the semantics of the document which part is the definiendum and which part the definiens.

thanethomson commented 4 years ago

What I meant was that, if we are talking about defining tags, then a natural markup (perhaps the most natural markup, semantically) would be definition lists.

Yes, I think you're right here, as per this comment, for the general case of defining requirements.

Could we solve the list-based requirement definition problem by way of this kind of syntax?

## 3. Traceability properties

We need a solution that satisfies the following properties:

 1. |TRC-TAG.1| Tagging a logical unit should be easy.
 2. |TRC-REF.1| Referencing a logical unit should be easy.
 3. |TRC-IMPL.1| Marking that one logical unit implements another one should be easy.
...

P.S. This example hurts my mind 😄

SYNTAX.1
< (open line here)
: We propose a simple naming scheme for tags. We start with
the tags for top-level requirements and then proceed with the tags of the
logical units that implement higher-level requirements.
^^ (lack of indentation)

thanethomson commented 4 years ago

One alternative for list-based requirements would be to have the tag specified at the end of the list item. This might make for better readability. It emphasizes the requirement over the tag.

## 3. Traceability properties

We need a solution that satisfies the following properties:

 1. Tagging a logical unit should be easy. |TRC-TAG.1|
 2. Referencing a logical unit should be easy. |TRC-REF.1|
 3. Marking that one logical unit implements another one should be easy. |TRC-IMPL.1|
...

informalsystems / vdd

Proposal on traceability #11