RFC: "One document - one markdown file" (instead of multiple YAML files)

stanislaw commented 5 years ago

This issue is related to the previous issue #400. The background for the following proposal can be found there.

1.1) Weak subpoint here: have you ever considered keeping all of the items in ONE markdown file? The idea would be to attach meta-information to markdown headers, as static website engines do when they generate HTML from markdown files based on meta-information stored in those same files.

This way the levels are ensured by the markdown level headers, no need for level: anymore.

Creating a new Item

To create a new item you simply create a new #...# header and write some statement. Then you call a command doorstop resolve which automatically creates a UID and adds the meta-information. Also, it does the validation of the whole document's meta info for any issues.
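A minimal sketch of what such a resolve step could do, in Python. Nothing here is existing Doorstop behaviour: the function name resolve, the HTML-comment metadata format `<!-- uid: ... -->` and the REQ-NNN numbering are assumptions made only for illustration. The sketch adds a UID comment under every header that does not have one yet and leaves existing UIDs untouched.

```python
import re

HEADER = re.compile(r"^(#+)\s+(.*)$")
# Hypothetical metadata format: an HTML comment directly under the header.
META = re.compile(r"^<!--\s*uid:\s*(\S+)\s*-->$")


def resolve(markdown: str, prefix: str = "REQ", digits: int = 3) -> str:
    """Add a UID comment under every header that does not have one yet."""
    lines = markdown.splitlines()

    # Collect the numbers already in use so new UIDs do not collide.
    used = set()
    for line in lines:
        match = META.match(line.strip())
        if match and match.group(1).startswith(prefix + "-"):
            tail = match.group(1)[len(prefix) + 1:]
            if tail.isdigit():
                used.add(int(tail))
    next_number = max(used, default=0) + 1

    out = []
    for i, line in enumerate(lines):
        out.append(line)
        if HEADER.match(line):
            following = lines[i + 1].strip() if i + 1 < len(lines) else ""
            if not META.match(following):
                out.append(f"<!-- uid: {prefix}-{next_number:0{digits}d} -->")
                next_number += 1
    return "\n".join(out)
```

Running it twice on the same text changes nothing the second time, which is the property a resolve command would need. A real implementation would also have to skip `#` lines inside fenced code blocks, which this sketch does not do.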

No GUI work needed

Having this approach implemented, we don't have to implement any new GUI interfaces because our programmer's IDEs already support markdown and the outlines nicely.

The only problem I see here is the visual redundancy of the meta-information in the markdown file. One solution is to simply auto-collapse all the meta-info by default. I see that Atom has a plugin for collapsing all of the comments in a file when it is opened. Maybe other IDEs have this too.

Pros and Cons

Pros

One document with both content and information, in one file.
Enables code reviews. Code review happens in one single file. Markdown for layout, YAML for meta, the power of both worlds known to developers.
It is easier to see the meta-information and understand its meaning within the context of the full document, not just its item.
No GUI is needed. The new doorstop resolve and existing doorstop * commands are your friend.
Accessible from any IDE that has good markdown support. CLion, Atom, Sublime, PyCharm...

Cons

Change to the Doorstop's original concept. Is it a big change? Is it incompatible with multiple YAML files?
The "noise" of meta-information attached to every markdown header.

Please let me know what you think. I would be happy to implement a POC based on this concept.

stanislaw commented 5 years ago

Here is a screenshot.

JustinW80 commented 5 years ago

See also #295 for concepts of using Markdown files with YAML blocks instead of YAML files. (Though that issue was still considering one .md file per item.)

ghost commented 5 years ago

I am also currently evaluating doorstop and have been thinking along similar lines. In our case, an improved GUI would probably not help. A single file instead of multiple files (yaml or markdown) would probably help in getting people to consider using doorstop.

However, I was approaching this slightly differently. The original rationale for what I am going to outline has to do with how external references (for example to test cases) are handled. Ideally, I would want to be able to treat linkage to test cases and tests in exactly the way how requirements are treated. However, external references don't give me that.

If I were able to embed the equivalent of the yaml file into the test code and somehow make doorstop understand that doorstop information is embedded in another file, I could link back to a test case, which in turn can then be linked to requirements.

From a user's perspective, this could then look like

$ doorstop add REQ --to mysourcefile.c

The command implementation would probably have to create a REQ to file mapping somewhere (e.g. in the respective .doorstop.yml file) and possibly require a unique ID/identifier for the case where multiple documents are stored in the same file. For usability doorstop could maybe copy the yaml block into the clipboard such that it can be added easily to mysourcefile.c

This would then allow for the original use-case raised in this ticket of storing the requirements information in either a single or multiple markdown files assuming #295 is implemented.

Besides this, it would also allow for embedding requirements within code and/or test code as needed and would allow tracing requirements back to code, if needed.

There would probably be some wrinkles around the comment format (which is language dependent) and may need some pre-processing of the content to get the doc from the source file into doorstop. There could also be potential problems with doorstop remove (if you wanted the capability to purge the requirements text in the source file completely).

I don't understand the doorstop code enough to figure out how disruptive this would be, but conceptually this does not sound too bad.

Any thoughts?

stanislaw commented 5 years ago

$ doorstop add REQ --to mysourcefile.c

To complement your description, could you please provide an example of what mysourcefile.c looks like after this command is applied?

ghost commented 5 years ago

To complement your description, could you please provide an example of what mysourcefile.c looks like after this command is applied?

This is where things obviously become language dependent and, to some degree, coding-standard dependent. As a first step (i.e. without allowing Markdown) it would look like the following, assuming standard C comment practice:

/* 
 * doorstop: REQ002
 * active: true
 * derived: false
 * header: ''
 * level: 1.1
 * links:
 * - REQ001: a6048621617c0e253c37c7d158263379
 * normative: true
 * ref: ''
 * reviewed: dd1d3e174eb51a5046662b86e8121b90
 * text: |
 *   My requirements text
 */

Note that you need a REQ identifier (the first line of the comment in the block, as you could otherwise not find the correct requirement in the source file)

Other languages and file formats may have different commenting conventions, such as # or //, which would have to be supported. Most of these can probably be handled in the following way:

Find the opening identifier, e.g. doorstop: REQ002 - normally the yaml filename gives you that information
Find the end of the comment block (either a whitespace or the */ in the case of C).
Copy everything in between
Strip the first two columns from it - this assumes that generally by stripping columns you can

At that point you have what is now in the yaml file
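The steps above can be sketched in Python roughly as follows. The function name extract_item_block is hypothetical, and the sketch only handles the C layout from the example, where every line of the block starts with the three-character prefix " * "; other comment styles would need their own prefix handling.

```python
def extract_item_block(source: str, uid: str, marker: str = "doorstop:") -> str:
    """Return the YAML-like text of the comment block that opens with `marker uid`."""
    collected = []
    inside = False
    for raw in source.splitlines():
        stripped = raw.strip()
        if not inside:
            # Step 1: find the opening identifier, e.g. " * doorstop: REQ002".
            if stripped.lstrip("*").strip() == f"{marker} {uid}":
                inside = True
                collected.append(f"{marker} {uid}")
            continue
        # Step 2: stop at the end of the C comment.
        if stripped.startswith("*/"):
            break
        # Steps 3 and 4: copy the line and strip the " * " comment prefix,
        # keeping any indentation that follows it (YAML block scalars need it).
        if raw.startswith(" * "):
            collected.append(raw[3:])
        else:
            collected.append(stripped.lstrip("*"))
    if not inside:
        raise LookupError(f"no block found for {uid}")
    return "\n".join(collected)
```

The returned string starts with the `doorstop: REQ002` line followed by the attributes, so it could be handed to a YAML loader in the same way as the content of an item file.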

This area probably needs a bit more thought, as this may imply a set of config settings that trigger different rules based on file extension.

With the markdown idea as laid out in #295 this may look like

/* 
 * doorstop: REQ002
 * active: true
 * derived: false
 * header: ''
 * level: 1.1
 * links:
 * - REQ001: a6048621617c0e253c37c7d158263379
 * normative: true
 * ref: ''
 * reviewed: dd1d3e174eb51a5046662b86e8121b90
 * ...
 * ### My requirements text
 * With a more *detailed* description using [Markdown syntax]
 * (https://www.markdownguide.org/basic-syntax/).
 */

It may also make sense to omit attributes with default values such as header, level, normative which may not be used to minimize what's in the source file. But in this case, the corresponding defaults probably need to be saved in the respective .doorstop.yml file

Taking the same example as before this time in a markdown file with default/unneeded settings removed it would look like

---
doorstop: REQ002
active: true
derived: false
links:
- REQ001: a6048621617c0e253c37c7d158263379
reviewed: dd1d3e174eb51a5046662b86e8121b90
...
### My requirements text
With a more *detailed* description using [Markdown syntax]
(https://www.markdownguide.org/basic-syntax/).

Note that in this case we can't easily identify the end of the requirement's description, because in the source-file variant it was the end of the comment block that provided that boundary.

A more radical approach would be to implement a more abstract interface that allows you to put custom parsers in place, which parse what's in the text block and just copy it into the correct data structure in doorstop. That would allow implementing the actual format in, say, doxygen or kernel-doc style, which is probably more natural in code than the above.

Given that the above functionality is not in place, this may altogether be more flexible and not that much extra work. Dealing with stripping the comments could also be done in a similar way, possibly avoiding the need for config settings.
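One way to picture such a pluggable-parser interface is a small registry keyed by file extension. This is only a sketch: register_parser, load_item and the toy parse_c_comment are invented names, not part of Doorstop, and the parser reads flat key: value pairs only rather than real YAML.

```python
from pathlib import Path
from typing import Callable, Dict

# A parser takes raw file content and returns the item attributes as a dict.
Parser = Callable[[str], Dict[str, str]]

_PARSERS: Dict[str, Parser] = {}


def register_parser(extension: str) -> Callable[[Parser], Parser]:
    """Decorator that registers a parser for one file extension."""
    def decorator(func: Parser) -> Parser:
        _PARSERS[extension.lower()] = func
        return func
    return decorator


@register_parser(".c")
def parse_c_comment(content: str) -> Dict[str, str]:
    """Toy parser: read flat 'key: value' pairs from C comment lines."""
    attrs = {}
    for line in content.splitlines():
        body = line.strip().lstrip("/*").strip()
        key, sep, value = body.partition(":")
        if sep and key:
            attrs[key.strip()] = value.strip()
    return attrs


def load_item(path: str, content: str) -> Dict[str, str]:
    """Pick the parser by file extension and run it on the content."""
    suffix = Path(path).suffix.lower()
    try:
        parser = _PARSERS[suffix]
    except KeyError:
        raise ValueError(f"no parser registered for {suffix!r}") from None
    return parser(content)
```

With this shape, supporting doxygen-style blocks or Markdown front matter would mean registering another function rather than adding config settings per extension.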

AFAICT from the code this functionality is fairly isolated to _load() and _dump() in base.py

sevendays commented 5 years ago

This approach is not suitable for our use case (automotive development): we would use doorstop for all requirements specifications, verification specifications and validation specifications, at vehicle, system, hardware and software levels. The code itself is mostly autogenerated from Simulink.

ghost commented 5 years ago

This approach is not suitable for our use case (automotive development): we would use doorstop for all requirements specifications, verification specifications and validation specifications, at vehicle, system, hardware and software levels. The code itself is mostly autogenerated from Simulink.

My proposal would be that $ doorstop add REQ would work as now and $ doorstop add REQ --to specific-markup-file.md or $ doorstop add REQ --to specific-markup-file.yaml would allow for storing multiple requirements in a single markdown or yaml file.

It would in addition allow for storing requirements in source code (with extra implementation work). This may not be necessary though, if the reference functionality is improved as suggested by https://github.com/doorstop-dev/doorstop/pull/395

With this in mind: would this approach be suitable?

kayoub5 commented 4 years ago

@larskurth You could simplify your logic even further for $ doorstop add REQ:

if REQ is a folder:
  # Use existing folder logic
else if REQ.md exists:
  # Add to existing REQ.md file
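
As a rough Python rendering of that dispatch (choose_storage is a hypothetical helper, not an existing Doorstop function; it only decides which storage variant applies and does not add anything):

```python
from pathlib import Path


def choose_storage(root: str, prefix: str) -> str:
    """Decide where `doorstop add PREFIX` would store a new item."""
    folder = Path(root) / prefix
    markdown = Path(root) / f"{prefix}.md"
    if folder.is_dir():
        return "folder"    # existing one-file-per-item logic
    if markdown.is_file():
        return "markdown"  # add to the existing PREFIX.md file
    raise FileNotFoundError(f"neither {folder} nor {markdown} exists")
```

The folder check comes first so that existing repositories keep their current behaviour even if a published PREFIX.md happens to sit next to the folder.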

JustinW80 commented 4 years ago

I was considering how to migrate a specification from Word to Doorstop when I had an idea for this RFC:

One Markdown file with header, level, and text data, tagged with a UID
Each item's metadata goes in a sidecar YAML file, one file per item, with the corresponding UID.

Basically, look at the current output from doorstop publish --markdown PREFIX ./PREFIX.md (excluding child and parent links) and consider that as the data file to hold a document's text. UIDs are tagged in curly braces. Each UID tag (say {#REQ-001}) in the REQ.md file would have its own YAML file (in this case REQ-001.yml).

The YAML sidecar file would have the remaining metadata such as links, normative, reviewed, etc.

I'm not sure where to draw the boundary between Doorstop items in this idea. At paragraph breaks would be straightforward, or until encountering the next UID tag. But if a user wants several paragraphs in an item, it may need some more markup, maybe a horizontal rule.
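
To make the "until the next UID tag" boundary concrete, here is a small Python sketch. It assumes the tag sits on the heading line, as in `## 1.1 Detail {#REQ-002}`; the function name split_items and the exact tag pattern are assumptions for illustration.

```python
import re

# Matches a UID tag in curly braces, e.g. {#REQ-001}.
UID_TAG = re.compile(r"\{#([A-Za-z0-9_-]+)\}")


def split_items(markdown: str) -> dict:
    """Map each UID to the text from its tagged line up to the next tagged line."""
    items = {}
    current = None
    for line in markdown.splitlines():
        match = UID_TAG.search(line)
        if match:
            # A tagged line starts a new item; drop the tag from its text.
            current = match.group(1)
            items[current] = [UID_TAG.sub("", line).rstrip()]
        elif current is not None:
            items[current].append(line)
    return {uid: "\n".join(lines).strip() for uid, lines in items.items()}
```

With this rule an item can span several paragraphs without extra markup; each returned UID would then be paired with its sidecar file (REQ-001.yml and so on) for the remaining metadata.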

This is sort of how requirements management works in the Word files I've encountered: each "shall" gets a unique number in square brackets, and if you want any other metadata, it's in a requirements traceability table appended to the end of the .docx file.

tangoalx commented 4 years ago

I agree that sometimes it is annoying to have all requirements file-based, especially when building up a new specification or when reviewing.

However, I think that this should remain the foundation of doorstop, because it also keeps the core of doorstop simple and less vulnerable to errors. For me it is extremely important to have a core application which works well and is not error-prone.

In principle, the core application shall not do more than versioning, tracing and validation. Maybe it already has too much functionality. I really like the concept that each requirement is a stand-alone file, because this integrates perfectly with file-based version control systems. You directly get the history of a requirement out-of-the-box.

Whatever we need further, build it on top. That way, everyone can decide for themselves which level their project is based on.

So, at the moment there is the GUI and the export / import solutions built on top. This already works well I think. You can view / edit all requirements of a document at once and re-import it again.

For me it is rather the question of structuring a document, which is not properly solved. I need an extra file only for each headline? This is the thing which really annoys me. I think that this is the root of why this issue has been filed.

I would like to combine these worlds by keeping the requirements file-based, but allowing the user to define a "structure-of-a-document" or "outline-of-a-document", which allows defining the headlines and how the requirements shall be embedded into the document. I think that the .doorstop.yml files are predestined for this purpose. This is no "on-top" solution, but I think that we need to get rid of the additional YML files, which are only for structuring and giving titles, even if the functionality has been added recently (and it has added a lot of value for very little effort, but it is not comfortable for the user).

I like the idea that the structure of the document is independent of its content. Documents often are not restructured because it takes a lot of effort and is error-prone; nobody takes on the effort and risk, which often results in "difficult-to-understand" documents. And with the version control system you can easily see how the structure of a document has changed (without the "noise" of content changes).

tangoalx commented 4 years ago

I imagined something like this. Let's say we have a .doorstop.yml, which additionally contains the outline of the document in its text attribute. The references to the requirements are simply markdown links. Besides the outline, you can insert whatever you want, like a use case diagram or further explanations.

settings:
  digits: 3
  parent: ''
  prefix: REQ
  sep: '-'
header: |
  Software Requirement Specification
reviewed: ''
text: |
  # Purpose

  Why do I need this document, what is it for?

  # Scope

  What is the border of consideration?

  # Use Cases

  ` ` `plantuml
  @startuml
  left to right direction
  actor User
  actor Admin
  rectangle ATM {
  User --> (Withdraw Money)
  User --> (Deposit Money)
  User --> (Check Balance)
  (Withdraw Money) .> (Reduce Balance) : include
  (Deposit Money) .> (Raise Balance) : include
  (Withdraw Money) ..> (Select Account) : include
  (Deposit Money) ..> (Select Account) : include
  (Check Balance) ..> (Select Account) : include
  Admin --> (Refill Money)
  Admin --> (Receive Low Fill Level Notification)
  }
  @enduml
  ` ` `

  # Requirements

  ## Select Account
    - [REQ-000.yml](REQ-000.yml)

  ## Withdraw Money
    - [REQ-001.yml](REQ-001.yml)
    - [REQ-002.yml](REQ-002.yml)

  ## Deposit Money
    - [REQ-003.yml](REQ-003.yml)
    - [REQ-004.yml](REQ-004.yml)

  ## Balance
    - [REQ-005.yml](REQ-005.yml)
    - [REQ-006.yml](REQ-006.yml)

  ## Refill Money
    - [REQ-007.yml](REQ-007.yml)
    - [REQ-008.yml](REQ-008.yml)

  ## Low Fill Level Notification
    - [REQ-009.yml](REQ-009.yml)

And for example a file REQ-000.yml:

active: true
derived: false
links: []
ref: ''
reviewed: 3bc2aa8c304c71e390e8fabb660995f0
text: |
  The ATM shall demand the user to select a bank account before
  the user can progress with an operation on that bank account.

  The user shall insert a valid bank card and input the pin number.

  The ATM shall disallow the selection, if the user provided an
  invalid bank card or an invalid pin number.

Then it may look like this:

[Image: screenshot from 2020-01-26 21-56-53]
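
To give an idea of how a tool could read such an outline, here is a short Python sketch that walks the text attribute and collects the item files under each heading. The function name outline_items is made up, and the sketch assumes item references are markdown links whose target ends in .yml, as in the example above.

```python
import re

HEADING = re.compile(r"^(#+)\s+(.*)$")
# A markdown link whose target is an item file, e.g. [REQ-000.yml](REQ-000.yml).
ITEM_LINK = re.compile(r"\[[^\]]*\]\(([^)]+\.yml)\)")


def outline_items(text: str) -> list:
    """Return (heading, item_file) pairs in document order."""
    pairs = []
    heading = None
    for line in text.splitlines():
        stripped = line.strip()
        match = HEADING.match(stripped)
        if match:
            heading = match.group(2)
            continue
        for target in ITEM_LINK.findall(stripped):
            pairs.append((heading, target))
    return pairs
```

Everything that is not a heading or an item link (free text, the PlantUML block) is simply passed over, so the outline can carry extra explanations without affecting which items are pulled in.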

neerdoc commented 2 years ago

I agree mostly with @tangoalx but I would simply have a list of items to create the outline and levels. Leave all content in each individual file to keep traceability.

settings:
  digits: 3
  parent: ''
  prefix: REQ
  sep: '-'
header: |
  Software Requirement Specification
reviewed: ''
text: |
  # REQ-002
  REQ-004
  REQ-001
  ## REQ-070

Content of each paragraph/heading can be controlled and traced by each individual file. But the re-ordering of the document is done easily in the .doorstop.yml file. Also, I think that the level attribute will be unneeded if this is implemented.
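
A small Python sketch of how levels might be derived from such a list. The interpretation is an assumption: a line starting with # is treated as a heading item at that depth, and a bare UID is treated as a normal item one level below the most recent heading. The function name parse_outline is hypothetical.

```python
def parse_outline(text: str) -> list:
    """Return (uid, depth, is_heading) tuples in document order."""
    entries = []
    depth = 0
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("#"):
            # The number of leading '#' characters gives the heading depth.
            hashes = len(stripped) - len(stripped.lstrip("#"))
            depth = hashes
            entries.append((stripped[hashes:].strip(), depth, True))
        else:
            # Assumption: a bare UID belongs under the most recent heading.
            entries.append((stripped, depth + 1, False))
    return entries
```

Reordering the document then means moving lines in this list, while each item file stays untouched and keeps its own history.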
