Improve MOOSE documentation system(s)

idaholab / moose

Multiphysics Object Oriented Simulation Environment

https://www.mooseframework.org

GNU Lesser General Public License v2.1

Improve MOOSE documentation system(s) #6699

Closed aeslaughter closed 7 years ago

aeslaughter commented 8 years ago

Description

We need an improved and more integrated method for developing and distributing documentation. There are a few key features needed:

The pages (i.e., wiki and current website) should be in a common format and be stored in a git repository so we can edit them locally and use PRs for reviewing the changes.
The in-code documentation needs to build links to the website/wiki, so that Peacock and Atom can link to more information. For example, if you are adding a Kernel, Peacock will show a link to where more documentation exists for the class and its parameters (see #6680).

Rationale

Application developers, SQA requirements, struggles with our current system, and complaints about the lack of documentation indicate that we need to do better.

Identified impact

We will probably need to re-format much of our content and improve in-code docs.

aeslaughter commented 8 years ago

@idaholab/moose-developers Please start dumping ideas for documentation solutions and requirements for what that system should look like. I will move the stuff from the email over.

aeslaughter commented 8 years ago

Here are some ideas.

Use a pure Doxygen solution that improves the look of the documentation and utilizes the "related pages" feature (e.g., https://www.biogearsengine.com/documentation/index.html)
Use a website system with github pages and modify the system to meet our needs: http://dynalon.github.io/mdwiki/#!index.md
Use ReadTheDocs, which can be self-hosted. An advantage is that it supports reStructuredText, which is a richer language than Markdown. There are also tools built to work with it for presentations, such as Hovercraft.

friedmud commented 8 years ago

My issue with all of this is that changing our documentation methods does not mean we're going to get more, or higher quality, documentation than we have now.

If we would just actually USE our current stuff and actually write documentation... We would be fine.

This feels like a new "Python side project" that will generate more stuff that needs to be maintained while just shuffling around the pieces and not actually generating any new documentation. The time and effort we could put into designing and developing this new system could be put into actually documenting the code...

dschwen commented 8 years ago

Yeah, what the guy above me said. We have a wiki, that has a really low threshold for participation. All the proposals above sound like a lot more hassle to contribute documentation. :-/

permcody commented 8 years ago

Well, if you don't like having Python side projects, let's try making a Perl side project!

I'm afraid that we're after more than just documenting the code. No, we won't be "fine" if that's what we are trying to do. Let's switch gears and actually talk about "requirements". I think we've already lost sight of why this even came up in the first place.

Each project is required to maintain the following documents:

User Manual
(optional) Theory Manual
Software Requirements Specification (SRS)
System Design Document (SDD)
Software Test Plan (STP)
(optional) Requirements Traceability Matrix (RTM)

These are per project documents! The umbrella planning documents that I maintain do not remove the requirement for maintaining these documents for each NQA-1 application.

Now, in typical MOOSE fashion, we are trying to design a system that will allow users to both document their code and capture the purpose of their code in a way that is easy to do and easy to maintain, so that we can mostly generate the required documentation. I've already started this process and have a small script that partially generates the RTM from existing "markup" in the. Now that the other applications are starting to do this, we are trying to figure out how to do this consistently. This is what MOOSE is supposed to be about. I don't want our scientists and engineers to worry about being Software Designers now too. I just want them to follow a well-defined process of tagging or marking up the documentation they do create so that we can build most of these documents automatically.

Now let's talk about ideas...

friedmud commented 8 years ago

To me, those are all a different kind of documentation. Separate from "normal" documentation. Users (whether of MOOSE or of a particular application) aren't going to want to read those documents. They can be in a completely separate form from our actual documentation.

I mean, are you saying you want to try to capture stuff like this: http://mooseframework.org/wiki/PhysicsModules/TensorMechanics/PlugAndPlayMechanicsApproach/

In the same system you do a "Requirements traceability matrix" or a "software requirements specification"?

There is nothing easier than editing our current Wiki... and yet we very rarely do (especially the MOOSE team - with myself being one of the biggest problems: there is STILL no proper documentation of VectorPostprocessors!). If there is a whole complicated system that includes needing to do pull requests or run custom software or whatever... then we have no hope.

permcody commented 8 years ago

Whether we like it or not, we are contractually obligated to follow NQA-1. We are also the poster children for doing this right, so we will get plenty of accolades for driving this process. We have at least three applications, and likely more, that really are being targeted at nuclear-related application deployments, potentially with an impact on safety, so we can't even grade ourselves out of these requirements if we wanted to. Some of these applications are moving beyond research, and this does increase the burden on the application developers but also places more burden on the end users. If we build the right system, the impact will be less than if we just tell the application owners to build all of these documents statically and load them into a records management system. What's worse is that if we really do build separate, mostly-static documents, we all know that they won't be kept up to date and we will likely fail an audit due to our documents being behind our M&O on the software. We've already failed one for this very reason and we aren't even deployed!

Guess what? Changing documentation in a deployed system needs a review just like changing source so it's natural to use our normal doxygen so that there's no difference to the developer between building source and building docs. I hate to admit it but I've actually read large parts of the software specific NQA-1 document.

OK so that's all the doom and gloom. Let's look at the brighter side. I don't think that this has to be painful, we can do whatever we'd like and defend our practices. Seriously, we can innovate new ways of documenting and creating even better agile practices like we've been doing all along. I'd much rather build a system that can generate these documents from a combination of boilerplate templates, scripts and markup than sitting down and writing a document that's completely separate from everything else we do in our daily work. What this means to the developer is that they don't have to look at these generated SQA documents ever for all I care. The team lead most likely will from time to time, but not every single developer who creates code.

I really do want to use the current system as much as we can and leave a lot of the documentation where it is. I just want to come up with a way of consistently marking it so that it can be pulled together by some set of scripts. I'm not reinventing the process.

permcody commented 8 years ago

Note: I appended two more documents to the required list. I believe that a "user manual" and possibly a "theory manual" are required as well. Technically we only need one but I believe most of the applications are going to build two. For the framework we can probably just get away with one and I don't even know what that looks like yet.

Also, I am working very closely with an active NQA-1 committee member in the rewrite of the umbrella documents. She likes our current processes, she likes working with us. I have full confidence that we can drive this process if we work together.

dschwen commented 8 years ago

Each project is required to maintain the following documents:

User Manual

(optional) Theory Manual

Software Requirements Specification (SRS)

System Design Document (SDD)

Software Test Plan (STP)

(optional) Requirements Traceability Matrix (RTM)

I can definitely see the value of assisting the apps with points 3 through 6! However, forcing a scheme for 1 and 2 on apps seems like a bad idea. A user manual benefits hugely from easy editing access. I see a typo I fix it on the wiki. I have some down time I rework a doc page. A user cannot find the docs, I immediately add a link. etc. Those are all actual things that happened for me in the last week alone. I absolutely do not want to open an issue and file a PR to do these things.

andrsd commented 8 years ago

This is related to NQA-1 only and what I did in RELAP-7:

For SRS: I have an XML document with the "database" of requirements. I use XSLT to turn this into a .tex file, which is then included in the SRS TeX doc. I assume this will be more or less the same for any app. However, they can supply their own template if they need more/less.
The XML db is also used for the RTM, where I cross-link it to the tests. I modified Cody's script, which extracted the info from a .tex file. It is a bit easier with XML. This keeps the same method of finding the @requirement tag in our .i files. I do not generate a PDF from the script like Cody does; instead I generate .tex again and include it in the RTM TeX document.
The piece that is missing is to link requirements to design. I was thinking of a doxygen-like tag that would correspond to the requirement id from the XML doc. That way we would know which classes participate in the specified requirement and we can generate something for the SDD. That something would be formatted by some template where we fill in its pieces. But I still do not know what exactly would need to go in this part.

I'd prefer to keep most of these NQA-1-related pieces as doxygen-like comments in the code. They might be helpful when people read the code, without having to chase them in the SQA docs. But I am not sure how viable this is.
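The cross-linking of requirements to tests described above could look roughly like the following Python sketch. It is only an illustration: the XML layout, the requirement ids, the file names, and the exact `# @requirement <id>` comment form are all assumptions, since the real RELAP-7 schema and script are not shown in this thread.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical requirements "database"; the real schema is not shown in the thread.
REQUIREMENTS_XML = """
<requirements>
  <requirement id="R1">Solve steady-state diffusion.</requirement>
  <requirement id="R2">Support transient problems.</requirement>
</requirements>
"""

# Assumed tag convention inside a test input (.i) file: "# @requirement <id>"
TAG = re.compile(r"@requirement\s+(\S+)")


def load_requirements(xml_text):
    """Return {requirement id: description} parsed from the XML text."""
    root = ET.fromstring(xml_text)
    return {r.get("id"): (r.text or "").strip() for r in root.iter("requirement")}


def build_rtm(requirements, inputs):
    """Map each requirement id to the input files that tag it.

    `inputs` maps a file name to its text content. Tags that name an
    id missing from the database are returned separately so they can
    be reported instead of silently dropped.
    """
    rtm = {req_id: [] for req_id in requirements}
    unknown = []
    for name in sorted(inputs):
        for req_id in TAG.findall(inputs[name]):
            if req_id in rtm:
                rtm[req_id].append(name)
            else:
                unknown.append((name, req_id))
    return rtm, unknown


# Made-up input files used only to exercise the sketch.
inputs = {
    "diffusion.i": "# @requirement R1\n[Kernels]\n[]\n",
    "transient.i": "# @requirement R1\n# @requirement R2\n",
    "stale.i": "# @requirement R9\n",
}
rtm, unknown = build_rtm(load_requirements(REQUIREMENTS_XML), inputs)
```

A requirement with an empty list in `rtm` has no test coverage, and entries in `unknown` point at tags that went stale. The same lookup could in principle be run over doxygen-like tags in source files to collect the classes that belong in the SDD.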

aeslaughter commented 8 years ago

As a person who writes a good portion of the documentation, I think one of the reasons nobody creates documentation is that it is a somewhat painful process. Creating large changes or adding large amounts of information to the wiki makes me angry, and the presentation blaster is functional but not very robust.

Also, navigating the various sources of documentation (Doxygen, website, wiki, input file, pdf slides) is daunting, and people do not do it because it is not natural. Thus, they believe our documentation is terrible. I actually think we have quite a bit of documentation, but a way of knowing where it is and how it connects to the code I am using and writing does not exist.

I would like us to have some sort of unified approach for making documentation for users, developers, and NQA. I believe the wiki/blaster approach is on the right track, I just believe we can do better and make it part of MOOSE that sets us apart. It should be something that attracts people to MOOSE and they want to use for their own projects, not something that keeps them from MOOSE which is where we are at now, in my opinion.

rcm59 commented 8 years ago

There is no discussion on whether we are going to incorporate SQA into our software development practices. Furthermore, we will follow NQA-1 practices. NQA-1 software development requires:

User Manual
(optional) Theory Manual
Software Requirements Specification (SRS)
System Design Document (SDD)
Software Test Plan (STP)
(optional) Requirements Traceability Matrix (RTM)

Your only issue is how you would like to meet these NQA-1 documentation requirements.

bwspenc commented 8 years ago

Sorry I'm late to the discussion -- I just realized that it was going on here since I'm behind on my email. Others on our teams may be interested @jasondhales @acasagran @backmari @novasr @sapitts @cpritam

We already have two code projects (BISON and Grizzly) that share a lot of common material. It is crazy to not have that common material live in the same place, which should be as close to the source code as possible. This is no time-wasting side project -- this will save us a lot of time maintaining our manuals.

The single source of the documentation could in theory be the wiki, but these are my main issues with having it be on the wiki:

No peer review of changes. Sure it's easy for the developer to edit it, but it's also just as easy for some random person on the Internet to edit it (which is probably a bad idea for an NQA-1 code). We could implement a peer-review process on the wiki, but why go to the trouble when we already have a peer-review process for our code repository?
A lot of the documentation actually already lives in the source code (the doc strings for line commands). If those are documented in the wiki, they will go out of date.
It's far-removed from the source code, making it harder to keep up to date with code changes. It's very difficult to hold developers accountable to change documentation to match the code.

If the documentation lives with the source code, peer review automatically happens. We may need to expedite the process of reviewing documentation-only pull requests if we're working on a deadline to get a manual out. We can still take in contributions from anyone as we can on the wiki, but through the same process we already have in place for code.

We recently released our first version of the Grizzly manual. In an effort toward having a single source for the documentation that we could use for displaying on the web or in a pdf manual, we wrote everything in Markdown, and used Pandoc in our Makefile to convert it to LaTeX. The intent was that we would then move all of that documentation to the wiki pages (which hasn't happened yet). The system works pretty well, but it represents significant duplication because we are still repeating everything that is in the doc strings for the parameters in the code.
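As a rough illustration of the Markdown-to-LaTeX step mentioned above, the Pandoc call that such a Makefile rule would make can be sketched in Python. The file names are placeholders and the actual Grizzly Makefile is not shown here, so this is an assumption about the general shape rather than the real build rule.

```python
import shutil
import subprocess


def pandoc_command(source, target):
    """Build the Pandoc command line that converts Markdown to LaTeX."""
    return [
        "pandoc", str(source),
        "--from", "markdown",
        "--to", "latex",
        "--output", str(target),
    ]


def convert(source, target):
    """Run Pandoc, failing loudly if it is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc is not installed or not on PATH")
    subprocess.run(pandoc_command(source, target), check=True)
```

Calling `convert("manual.md", "manual.tex")` with hypothetical file names would produce a LaTeX fragment that a manual's main .tex file could then include.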

Our manuals consist of two types of documentation:

Higher-level descriptions of theory and how to use a set of objects together
Descriptions of what the individual MOOSE classes that we use do and what their parameters are.

Both of these can easily live in our repository. The higher level descriptions can be in separate Markdown files that live somewhere in the repository. I think the best place for the documentation specific to the classes is in the classes themselves. We already have documentation strings. All we need is a place to put a few paragraphs describing what the class does and any theory behind it. The added benefit is that the documentation is right there for anyone who's looking through the code to explain the class they're looking at.

The only thing we're missing to generate the manuals the way I would like to is to have a way to extract the documentation in the code for a specific class in a format suitable for inclusion in a manual. The --dump option isn't far off from what we need, so I can't imagine it would be too hard to do. I'd preferably like it dumped out in Markdown so that it would be easy to include in a web page or pdf document. It should include a summary of usage and theory of the class, as well as the doc strings for the available commands, and there should be a way of filtering the commands to be shown.

If I had that, for my app's manual, I would just need to define which classes to include and how to arrange that information. I would include these automatically generated Markdown files documenting specific classes as well as the higher-level Markdown files that live in the repository. Something similar could be done to provide web version of the manual that would look a lot like the current wiki pages (except that you wouldn't be able to edit them). We already have much of the material that we use in the modules used by BISON and Grizzly in Markdown format ready to move into the code base.
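To make the idea concrete, here is a minimal sketch of rendering one class's documentation as Markdown with a parameter filter. The dictionary layout, the class name `ExampleDiffusion`, and its parameters are hypothetical stand-ins for whatever a --dump-like option would actually emit; none of this reflects an existing MOOSE output format.

```python
def render_class_markdown(doc, include=None):
    """Render one class's documentation as a Markdown page.

    `doc` is a hypothetical dictionary with "name", "description", and
    "parameters" (a list of dicts with "name", "type", "default", "doc").
    `include`, if given, is the set of parameter names to show.
    """
    lines = [f"# {doc['name']}", "", doc["description"], "", "## Parameters", ""]
    lines.append("| Name | Type | Default | Description |")
    lines.append("| --- | --- | --- | --- |")
    for param in doc["parameters"]:
        if include is not None and param["name"] not in include:
            continue  # filtered out by the caller
        default = param.get("default") or "(required)"
        lines.append(
            f"| `{param['name']}` | {param['type']} | {default} | {param['doc']} |"
        )
    return "\n".join(lines) + "\n"


# Made-up class documentation used only to exercise the sketch.
EXAMPLE_DOC = {
    "name": "ExampleDiffusion",
    "description": "Summary of usage and theory would go here.",
    "parameters": [
        {"name": "variable", "type": "VariableName", "default": None,
         "doc": "The variable this object acts on."},
        {"name": "coef", "type": "Real", "default": "1.0",
         "doc": "A scaling coefficient."},
    ],
}

markdown = render_class_markdown(EXAMPLE_DOC, include={"coef"})
```

An application manual would then only need a list of class names and filters. The generated pages could be concatenated with the hand-written Markdown files before conversion to a web page or pdf.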

friedmud commented 8 years ago

@bwspenc thank you for the thoughtful and explanatory post. That all makes sense.

I really like the idea of making Markdown our main "language" for documentation. It makes it easy to read on the command-line and easy to parse using Python/Pandoc/etc. for inclusion into larger documents or transforming to a website.

In my mind there are still two separate pieces of documentation: user documentation and developer documentation. What I mean by that is this:

User documentation: theory manual, input file syntax and explanation about what objects do.
Developer documentation: design documentation about how an object works or why it was constructed the way it is constructed. How to extend it or override pieces of it to do something different. Anything out of the ordinary (code-wise) that should be pointed out about an algorithm.

In my mind these two things should stay separate... both in terms of the documents we produce... but also in the way we store the documentation.

I would like to see User Documentation continue to grow using the InputParameters system. I feel like it already does a good job of capturing the documentation for parameters and it is the perfect place to put higher level descriptions of what an object does. Those descriptions will come out with a --dump like argument including all of the filtering options @bwspenc mentioned (that's a great idea). We can grow a (small!) set of scripts around that to help transform it into whatever form we want (PDF, webpage, Peacock, etc.).

In addition, I like the idea of a theory manual (and/or user manual) written in Markdown that is hosted inside the repository. I know I was voting for using the Wiki earlier but I think @bwspenc made some really great arguments about why documentation should live with the code (in particular, the part about review). We can have an automated system that turns those markdown pages (along with output from --dump) into manuals.

For developer documentation I think we should focus on Doxygen. In Doxygen I would like to describe APIs and code design things... NOT mathematics. Sure, a simple "This class implements a Laplacian operator" is not a bad idea. But save the full explanation for what the object does for the User documentation. Doxygen should be about how it does what it does and what the design intention is behind the object. How to interact with it and any other pertinent information a developer would need in order to maintain or extend the object.

In short: I think we are closer than we thought. We only need small code enhancements and some policy changes in order to implement most of this. The main thing I didn't want is an enormous upheaval in the way we work...

YaqiWang commented 8 years ago

The key functions I will need for enhancing InputParameters are:

[ ] to dump each parameter's type, short description, input syntax, default value, coloring based on the group it belongs to, and the place where the class is defined (framework, modules, app, etc.).
[ ] to provide extra usage notes on parameters. This should allow including and referring to equations.
[ ] to have a class description. It should also allow equations.
[ ] to provide the capability of cross-referencing parameters. As an example, several classes derived from the same base class share the parameters declared in the base class, but the base class by itself cannot be a class in the user manual. Then one of the classes will be the place displaying those base parameters and the others will just refer to those parameters. The Steady and Transient executioners are an example of this. The class description may refer to class parameters.

If all of these are in place (I think we are not far), about half of the Rattlesnake user manual can be automatically generated. It can save our team tremendous development effort and make it easy to keep the manual and code synchronized!
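The cross-referencing item in the list above can be sketched as a small planning step: the first derived class that appears in the manual shows the shared base parameters, and every later sibling refers back to it. The function name and the parameter names below are hypothetical, and the base name `Executioner` is assumed only for illustration.

```python
def plan_parameter_sections(classes, base_params):
    """Decide where each shared base-class parameter list is shown.

    `classes` is a list of (class name, base name) pairs in manual order.
    `base_params` maps a base name to the parameter names it declares.
    Returns {class name: ("show", params) or ("refer", owner class)}.
    """
    owner = {}  # base name -> the class that displays its parameters
    plan = {}
    for name, base in classes:
        if base not in owner:
            owner[base] = name
            plan[name] = ("show", list(base_params[base]))
        else:
            plan[name] = ("refer", owner[base])
    return plan


# Placeholder parameter names; the real shared parameters are not listed here.
plan = plan_parameter_sections(
    [("Steady", "Executioner"), ("Transient", "Executioner")],
    {"Executioner": ["shared_param_a", "shared_param_b"]},
)
```

With this ordering, the `Steady` page would list the shared parameters and the `Transient` page would carry a reference to `Steady`, so the base class itself never needs a page in the user manual.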

Two thirds of the rest of the Rattlesnake user manual is actually tutorial inputs, which set up some well-known benchmark problems with the input files. Each tutorial includes the problem description, what is new, an explanation of all of the input blocks, and representative results. I think it would be good to include all of this information in the input files and have a way to automatically extract it for generating tex files. We also treat these tutorials as heavy tests scheduled nightly.

I have not thought about the theory manual carefully, but I am afraid that the theory manual, especially its equations, does not have to live alongside the code. It could be chunky and written separately. The theory manual can refer to classes for kernels, materials, etc., which seem to me to be part of the documentation generated by Doxygen. We could possibly detect broken links from the theory manual to the Doxygen docs and inform the theory manual author so that corrections can be made after code changes.
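A broken-link check of this kind could be as simple as the sketch below. It assumes the theory manual marks class references with a hypothetical `@ref ClassName` marker and that the set of documented class names has already been collected from the Doxygen output by some other step. Both are assumptions, not an existing convention.

```python
import re

# Assumed reference marker in the theory manual text: "@ref ClassName"
REF = re.compile(r"@ref\s+([A-Za-z_]\w*)")


def find_broken_refs(manual_text, known_classes):
    """Return (line number, class name) for references to unknown classes."""
    broken = []
    for lineno, line in enumerate(manual_text.splitlines(), start=1):
        for name in REF.findall(line):
            if name not in known_classes:
                broken.append((lineno, name))
    return broken


# Made-up manual text and class names used only to exercise the sketch.
manual = (
    "Diffusion is handled by @ref Diffusion.\n"
    "See also @ref OldKernel and @ref TimeDerivative.\n"
)
broken = find_broken_refs(manual, {"Diffusion", "TimeDerivative"})
```

Running such a check as part of the test suite would flag a renamed or removed class at review time, so the manual author hears about it before the manual is rebuilt.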

aeslaughter commented 8 years ago

On Fri, Apr 8, 2016 at 5:54 PM, Derek Gaston wrote:

In my mind there are still two separate pieces of documentation: user documentation and developer documentation.

I think there are three groups:

Users: who just run MOOSE based applications and never write code.

Application Developers: people making apps. They need access to some info about the code, but not really all of it (this is why I want a way to tag code for app developers so all the other stuff is hidden).

MOOSE Developers: us and others trying to add to MOOSE.

I plan on mocking up an HTML/--dump output for next week showing what I envision, but I think that @friedmud is on the right track with how the docs should be developed: a Markdown/Doxygen hybrid.

aeslaughter commented 8 years ago

I am moving our meeting to Thur.

aeslaughter commented 8 years ago

@idaholab/moose-team I was thinking about the training vs. documentation a bit and thought of a possible solution.

How about each page for which slides are needed has a summary section? This section would be translated to slides. I understand the need for two forms of content, slides and prose, but it sure would be nicer and easier to maintain if they were in the same location.

It may even be that the summary is hidden on the website and only shows up on the slides, which could also be generated and posted on the site somewhere else.

Just starting the discussion to see where we want to take this.

# Kernels
## Summary 

### Slide 1
* Represent physics
* Are amazing

### Slide 2
* You should make some

## Description
The book goes here...
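One possible way to turn such a page into slides is sketched below. It assumes the layout shown in the example page (a `## Summary` section containing `### ` slide headings and `* ` bullets) and is not an existing tool.

```python
def extract_slides(markdown_text):
    """Return [(slide title, [bullets])] from the "## Summary" section."""
    slides = []
    in_summary = False
    for line in markdown_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("## "):
            # A new second-level section starts; only "Summary" feeds the slides.
            in_summary = stripped[3:].strip().lower() == "summary"
        elif in_summary and stripped.startswith("### "):
            slides.append((stripped[4:].strip(), []))
        elif in_summary and stripped.startswith("* ") and slides:
            slides[-1][1].append(stripped[2:].strip())
    return slides


PAGE = """# Kernels
## Summary

### Slide 1
* Represent physics
* Are amazing

### Slide 2
* You should make some

## Description
The book goes here...
"""

slides = extract_slides(PAGE)
```

Everything outside the summary section is ignored, so the prose under `## Description` stays on the website while the slide deck is generated from the same file.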

friedmud commented 8 years ago

Not a bad idea. I worry that many people won't make it past the summary... but we could try it!

aeslaughter commented 7 years ago

I am calling this done, see #8329 for future issues.
