MCP-0018: Change of the Specification Document Format #1730

Closed modelica-trac-importer closed 4 years ago

modelica-trac-importer commented 6 years ago

Modified by dietmarw on 2 Jul 2015 10:36 UTC This ticket discusses changing the specification document format from a binary format to some sort of markup language.

The proposal is to use Sphinx; docbook, asciidoc, and markdown have been considered.

The following documents (also attached) show what a simple conversion during the meeting could accomplish:
https://dev.openmodelica.org/~marsj/MLS-rst/html/
https://dev.openmodelica.org/~marsj/MLS-rst/singlehtml/
https://dev.openmodelica.org/~marsj/MLS-rst/latex/ModelicaLanguageSpecification.pdf

The sources (with verified working Windows installation instructions) are available at: https://github.com/sjoelund/MLS-rst-spec

Note that you do not need to install anything to modify the source code; installation is only needed to verify that the final output is what you expect it to be.

Document location

Modified by dietmarw on 12 Jun 2015 14:29 UTC This is a ticket discussion changing the documentation format from a binary documentation format to some sort of markup language.


Document location

https://svn.modelica.org/projects/MCP/public/MCP-0018_ChangeSpecificationDocumentFormat/

Reported by sjoelund.se on 10 Jun 2015 04:09 UTC This is a ticket discussion changing the documentation format from a binary documentation format to some sort of markup language.


Migrated-From: https://trac.modelica.org/Modelica/ticket/1730

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 10 Jun 2015 11:48 UTC otter wrote some text in an email; replying here without providing context.

Sphinx refers to https://sphinx-doc.org

Binary format vs. text-based format refers both to:

Being a human-readable (and editable) format in any text editor.
Being a textual asset, which makes it friendly to version control (you cannot automatically merge docx or OpenOffice files if two different users edited the files at the same time). This both saves bandwidth and makes it simple to see what a particular commit actually changed.
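As a small illustration of the second point, here is a minimal sketch of a line-based diff over two revisions of a text source. The reStructuredText snippet is made up for the example and is not taken from the specification; Python's standard difflib stands in for what a version control system would show.

```python
import difflib

def text_diff(before, after):
    """Return the unified diff between two revisions of a text file as a list of lines."""
    return list(
        difflib.unified_diff(
            before.splitlines(),
            after.splitlines(),
            fromfile="before",
            tofile="after",
            lineterm="",
        )
    )

# Two hypothetical revisions of a small reStructuredText section.
before = "Comments\n========\n\nA comment starts with //.\n"
after = "Comments\n========\n\nA comment starts with // or /*.\n"

diff = text_diff(before, after)
print("\n".join(diff))
```

Only the edited sentence shows up as a removed and an added line; the heading appears as unchanged context. A binary document offers no comparable line structure to diff or merge.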

As for installation instructions, we tried it with several users during the meeting and no one had any difficulty setting up the environment. However, since it is text-based it is not necessary to have the environment installed.

It is even possible to edit the source files directly inside the GitHub repository (requires a free account), where you get a limited preview of the result. GitHub does not handle all aspects of Sphinx, but it does know reStructuredText (the markup syntax), and it can display rich text diffs such as:
https://github.com/sjoelund/MLS-rst-spec/pull/3/files?short_path=05faa38#diff-05faa38eebafa3c1346abc1f6b0b5996
https://github.com/sjoelund/MLS-rst-spec/commit/d2f59241e1a2e2aaf08be1d10309339181ae3652?short_path=05faa38#diff-05faa38eebafa3c1346abc1f6b0b5996
https://github.com/sjoelund/MLS-rst-spec/commit/315b70004e5b326b1c9692131764873eb7b247f3?short_path=380ccbc#diff-380ccbcdcd7610b00399b04731ec1a8a

make.bat is part of the root of the repository, not the "source" directory (where the sources are; the instructions look correct to me, but it's easy to navigate to the incorrect place).

As for the errors with the tables, did you notice that the document lists 3 alternative ways to define that table? The other 2 look a lot nicer (using definition lists over tables in particular).

modelica-trac-importer commented 6 years ago

Comment by mtiller on 10 Jun 2015 14:34 UTC Via email, Dag made the following comments (which I am including here just for context):

On Wed, Jun 10, 2015 at 9:23 AM, BRÜCK Dag Dag.BRUeCK@3ds.com wrote: Hi,

I have looked at the Sphinx documentation system. I'm sure it's great if you're developing Python code.

As a system for writing general documents, I think it is absolutely God-awful if you plan for everybody to use the mark-up language ("reStructuredText"). It is a primitive, badly structured imitation of TeX or Scribe; just like the half-dozen of homemade text formatting tools I saw in the 1980s. From a usability point of view it is much worse than Troff with let's say MM or MS macros.

Let me add a few comments based on actual use. ;-)

So I would actually agree that among markup languages, reStructuredText is not as simple and easy to use as other markup languages. Like Martin S., I surveyed the available tools before writing "Modelica by Example". I would have loved to use something nice like "Markdown". But the problem is that such formats are not rich enough for documents that must support both PDF and HTML formats.

For that reason, I chose Sphinx and rST for "Modelica by Example". Sphinx is far from half-baked. The rST format is not as simple as Markdown, but it includes the ability to express essential things like format-neutral cross referencing, selection of appropriate media types for figures, inline inclusion of source code, etc. Furthermore, it is extensible which, in practice, is often necessary.

I would also point out that Sphinx is more than just rST. It includes support for various important targets (HTML, PDF and ePub being the most important for my work), handling of mathematical equations, syntax highlighting, etc. It is widely used as well which means that it is fairly easy to get answers to common questions.

The bottom line is that I've actually written a large amount of documentation using Sphinx and specifically on the topic of Modelica. While Sphinx is not perfect, I found it to be the best solution among the options I looked at and I (personally) have no complaints or reservations about using it.

This isn't my proposal but I do feel quite strongly that moving to a text based format would greatly improve the process of collaborating on the specification. My experience with "Modelica by Example" was a great demonstration of how easy it becomes for people to collaborate using such an approach. I've received 128 pull requests for my book and incorporating them has been very easy. I could easily imagine the MCP and language ticket issues being resolved in a very nice way with such a system.

modelica-trac-importer commented 6 years ago

Comment by otter on 10 Jun 2015 16:38 UTC I did not yet manage to run the example from Martin (S.) (https://github.com/sjoelund/MLS-rst-spec) on my Windows machine. This might be because I do not have administrator rights and it is not possible to modify the path variable. Here are more details:

Running "C:\Python27\Scripts\pip.exe install sphinxcontrib-inlinesyntaxhighlight" was not successful. From the error message it seems that this command tried to install a "Linux" version ("Downloading sphinxcontrib-inlinesyntaxhighlight-0.2.linux-x86_64.tar.gz"). I went to this web-page, and there was another "general" installer. I downloaded this manually and then the installation was successful.

When running "make html", the following error message appeared:

Running Sphinx v1.1.3
loading pickled environment... not yet created

Theme error:
no theme named 'alabaster' found (missing theme.conf?)

Searched for what "alabaster theme" means and went to this page "https://pypi.python.org/pypi/alabaster#downloads". Used the recommended command "C:\Python27\Scripts\pip.exe install alabaster", but the installation did not seem to succeed. Downloaded the installer manually and installed via setup.py. This time, the installation seemed to be successful.

Running "make html" still gives the error message from above.

I am giving up now. Recommendations to finalize this installation are welcome.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 12 Jun 2015 10:39 UTC We've now updated the batch files for the IT(department)-challenged people. If you clone/pull https://github.com/sjoelund/MLS-rst-spec there are now two files that you can simply click:

installSphinx.bat will automatically install locally python 2.7.10 (we included the install binary in the repo to make it fail proof for now) and all the necessary sphinx dependencies.
make-html.bat is simply a wrapper for make.bat html which lets you generate the HTML with a simple double-click.
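For readers who prefer to see what such a wrapper boils down to, here is a rough sketch in Python. It assumes that "make.bat html" follows the usual Sphinx pattern of calling the sphinx-build command with the html builder; the actual batch files in the repository may do more. The directory names "source" and "build/html" are assumptions for this example.

```python
import subprocess

def html_build_command(source_dir="source", build_dir="build/html"):
    """Build the argument list for an HTML build with the sphinx-build CLI.

    The directory names are illustrative defaults, not taken from the repository.
    """
    return ["sphinx-build", "-b", "html", source_dir, build_dir]

def build_html():
    """Run the build and return the exit code (requires Sphinx to be installed)."""
    return subprocess.call(html_build_command())

print(" ".join(html_build_command()))
```

Calling build_html() from the repository root would then play the same role as double-clicking make-html.bat, under the assumptions stated above.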

Please let us know if there are still problems. This has now been tested on Windows 7 and Windows XP.

modelica-trac-importer commented 6 years ago

Comment by dag on 12 Jun 2015 11:10 UTC But nobody has addressed this question: are there any reasonable editing tools? Those of you that use Sphinx, what editor do you use?

I'm not arguing the file format; that is a separate issue. I just wonder if your proposal means that we have to type mark-up commands and run a compiler to see the result, or if there are any modern WISIWYG tools.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 12 Jun 2015 11:39 UTC Replying to [comment:5 dag]:

But nobody has addressed this question: are there any reasonable editing tools? Those of you that use Sphinx, what editor do you use?

I use geany; same as for all documents I edit. Note that there is not a lot of markup that should be performed in the document. There are reStructuredText editors out there, but they will naturally not support things like creating Modelica code blocks (which is what does most of the formatting) in the text. So even with an editor, you would at some point need to edit the markup.

I'm not arguing the file format; that is a separate issue. I just wonder if your proposal means that we have to type mark-up commands and run a compiler to see the result, or if there are any modern WISIWYG tools.

You mean WYSIWYG, right? There cannot exist such a tool for restructured text because of a few reasons. The primary reason is that it is not a typesetting system. You can get a preview of what the text could look like. This is similar to Word, which is also not WYSIWYG (if you generate a pdf from it and someone else does the same thing, the results are sometimes slightly different).

modelica-trac-importer commented 6 years ago

Comment by otter on 12 Jun 2015 12:00 UTC Replying to [comment:4 dietmarw]:

We've now updated the batch files for the IT(department)-challenged people. If you clone/pull https://github.com/sjoelund/MLS-rst-spec there are now two files that you can simply click:

* installSphinx.bat will automatically install locally python 2.7.10 (we included the install binary in the repo to make it fail proof for now) and all the necessary sphinx dependencies.
* make-html.bat is simply a wrapper for make.bat html which lets you generate the HTML with a simple double-click.

Please let us know if there are still problems. This has now been tested on Windows 7 and Windows XP.

Thanks very much. I tried and achieved the following:

Python and needed packages, including sphinx installed without problems.
When running make-html.bat there was an error (related to amsmath.sty). By accident, I hit the wrong button and the window was gone. When I ran it again, the error no longer occurred. An HTML file of the (partial) Modelica specification was generated (which looks nice). Looking at it, there are some "orange" boxes that mark an error. As I understand it, LaTeX is used to generate some representation of a formula for HTML, and this failed. It means that LaTeX must be installed on your computer before this can succeed. On my computer MiKTeX is installed. However, there have been improvements to MiKTeX and the file amsmath.sty was removed. It seems that MiKTeX (via Sphinx) tried to install the file automatically from the net, but the address did not work, hence the error. I tried to install it manually via the MiKTeX package manager, but the file is not listed there. I found a web page (http://latex-community.org/home/news/46-news-latex-distributions/469-miktex-amsmath) where this is described. It seems amsmath.sty was removed from the MiKTeX distribution in favour of another (better) solution. That web page gives instructions on how to re-introduce it. I could follow these instructions and install the file (giving a different server on the net to download from; so the main issue was probably that a web service was not available). Afterwards make-html.bat worked and generated the HTML without errors.

Please add some instructions to the installation stating that a LaTeX distribution needs to be installed on the computer (on Windows preferably http://miktex.org/), and give the above web address in case an amsmath.sty error occurs. Maybe you can rewrite the Sphinx definition so that amsmath.sty is not used, to reduce the source of confusion.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 12 Jun 2015 12:07 UTC Replying to [comment:7 otter]:

Please, add some instructions to the installation, that a LaTeX distribution needs to be installed on the computer.

Actually, it is not strictly speaking needed. I intend to change the default configuration to mathjax, which avoids the need to have LaTeX installed (requires JavaScript and an internet connection instead). For the release builds, generation of static SVGs (the current approach) will be used.
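A sketch of how such a switch could look in a Sphinx conf.py, which is itself a Python file. This is not the repository's actual configuration: the helper math_settings and its release_build flag are hypothetical, and the extension names are those of current Sphinx releases (sphinx.ext.mathjax for browser-side rendering, sphinx.ext.imgmath for LaTeX-rendered images). The 2015 toolchain would most likely have used the older pngmath extension instead of imgmath.

```python
def math_settings(release_build):
    """Return conf.py values for math rendering.

    release_build is a hypothetical flag: True selects static images rendered
    through LaTeX, False selects MathJax (no LaTeX needed, but the reader's
    browser needs JavaScript).
    """
    if release_build:
        return {
            "extensions": ["sphinx.ext.imgmath"],
            "imgmath_image_format": "svg",
        }
    return {"extensions": ["sphinx.ext.mathjax"]}

# Day-to-day editing: no LaTeX installation required.
settings = math_settings(release_build=False)
print(settings["extensions"])
```

In a real conf.py the returned values would be assigned to the module-level names extensions and imgmath_image_format.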

modelica-trac-importer commented 6 years ago

Comment by otter on 12 Jun 2015 12:12 UTC Replying to [comment:6 sjoelund.se]:

Replying to [comment:5 dag]:

But nobody has addressed this question: are there any reasonable editing tools? Those of you that use Sphinx, what editor do you use?

I use geany; same as for all documents I edit. Note that there is not a lot of markup that should be performed in the document. There are reStructuredText editors out there, but they will naturally not support things like creating Modelica code blocks (which is what does most of the formatting) in the text. So even with an editor, you would at some point need to edit the markup.

I'm not arguing the file format; that is a separate issue. I just wonder if your proposal means that we have to type mark-up commands and run a compiler to see the result, or if there are any modern WISIWYG tools.

You mean WYSIWYG, right? There cannot exist such a tool for restructured text because of a few reasons. The primary reason is that it is not a typesetting system. You can get a preview of what the text could look like. This is similar to Word, which is also not WYSIWYG (if you generate a pdf from it and someone else does the same thing, the results are sometimes slightly different).

Your statement is misleading. At some point in time, the reStructuredText is rendered, at the latest in the web browser. What a user wants to see is exactly this rendered text when he/she makes a change. For HTML, there are several such WYSIWYG editors (like KompoZer, or Sigil in the public domain, and many more commercial ones). Since HTML is much richer than the Sphinx markup language, it must be much simpler to build such an editor for Sphinx. Of course every web browser displays the same source a little bit differently, and therefore there are also small differences between the renderers in these editors. Still, for a human these small differences do not matter at all, and the result is of course ORDERS OF MAGNITUDE better than looking at the raw markup text.

At the web-meeting yesterday, a WYSIWYG editor was mentioned and briefly shown. Is this the one that you mention above, or is there a different one?

modelica-trac-importer commented 6 years ago

Comment by dag on 12 Jun 2015 12:36 UTC Replying to [comment:6 sjoelund.se]:

Replying to [comment:5 dag]:

But nobody has addressed this question: are there any reasonable editing tools? Those of you that use Sphinx, what editor do you use?

I use geany; same as for all documents I edit.

I.e. a plain-text editor. I think it is pretty sad if we all have to go back to that, surely there must be something more high-tech?

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 12 Jun 2015 13:00 UTC Replying to [comment:10 dag]:

Replying to [comment:6 sjoelund.se]:

Replying to [comment:5 dag]:

But nobody has addressed this question: are there any reasonable editing tools? Those of you that use Sphinx, what editor do you use?

I use geany; same as for all documents I edit.

I.e. a plain-text editor. I think it is pretty sad if we all have to go back to that, surely there must be something more high-tech?

Oh, I can get an HTML preview inside geany if I want to, just like I can get the output of running make in the same window. I just prefer not to clutter my editing view with other messages.

What is it that people are looking for in an editor? Something similar to: https://www.notex.ch to get help editing the files? Or an HTML preview in the editor itself?

modelica-trac-importer commented 6 years ago

Comment by mtiller on 12 Jun 2015 13:13 UTC I have a couple of points to make.

First, the issues we identified this week as being the essential requirements were a) ease of installation and b) having GUIs that would help insert complex markup so the user wouldn't have to remember it. We mentioned WYSIWYG but I asked for specific clarification whether WYSIWYG was really the requirement or if it was b). Martin said it was b).

I think we've already addressed a). For b), it would be good if Martin could try out either this online editor: http://rst.ninjs.org/ or the desktop editor ReText (http://sourceforge.net/projects/retext/). I also found a discussion on SO about editors: http://stackoverflow.com/questions/2819832/is-there-an-intelligent-editor-for-rest-files.

We now seem to be talking about previewing the generated documents. I want to re-iterate, as I did in the meeting, that Sphinx is a document processing system while rST is just a markup format. This is important when it comes to previewing because a true preview requires running Sphinx, not just rendering rST. All the editors I have seen are really about previewing rST, not about previewing the document.

When I was writing my book, I didn't really worry about seeing how every little keystroke was rendered. I focused mainly on the writing part and then reviewed the content section by section. In fact, real-time previewing is a bit impractical. What will you preview? The PDF output? The HTML output? The ePub output? All at once? It makes a lot more sense (again, based on actual practice) to focus on the writing and not the rendering.

All that being said, I agree with Martin that remembering all the markup details is not so easy. I just think we should focus on that aspect and not get bogged down in a discussion about interactive WYSIWYG. I just don't think that is a big deal in practice. As far as remembering markup is concerned, I started off using a cheatsheet:

http://openalea.gforge.inria.fr/doc/openalea/doc/_build/html/source/sphinx/rest_syntax.html

But I pretty quickly had lots of examples to work from in my own text so I didn't even need the cheatsheet for most things. I suspect that will be the case with the specification as well.

But I think we should not continue with these discussions about "can we get something more high-tech?". The requirement should not be that we have high-tech tools. The requirement is that we have useful tools that actually address the "pain" in the process. In my experience, even when I work in an editor that has lots of "buttons" to press to help me out, I generally don't even use them (or want to use them) because if you have to take your fingers off the keyboard and screw around with the mouse, it just slows you down.

We've got an actual toolchain that can be easily installed. Now, I think people should actually try to use what we have and then identify the gaps. I'm not saying what we have is perfect. But let's try to focus on what is actually making this impractical instead of making a wish-list of things that seem like they'd be nice to have but wouldn't actually be used.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 12 Jun 2015 13:51 UTC @mtiller: Regarding a cheat sheet. I do plan to create one such editing guide that contains the markup used in the document so it is all there in one place easy to reference.

modelica-trac-importer commented 6 years ago

Modified by dietmarw on 12 Jun 2015 14:29 UTC

modelica-trac-importer commented 6 years ago

Modified by dietmarw on 12 Jun 2015 14:30 UTC

modelica-trac-importer commented 6 years ago

Comment by mtiller on 12 Jun 2015 15:30 UTC Replying to [comment:13 sjoelund.se]:

@mtiller: Regarding a cheat sheet. I do plan to create one such editing guide that contains the markup used in the document so it is all there in one place easy to reference.

Even better. My point is that, in my experience, this was sufficient for me. All this talk of "high-tech" tools isn't really addressing the fundamental requirement (i.e., not wasting time trying to figure out or learn markup syntax).

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 12 Jun 2015 16:53 UTC I've just updated the installSphinx.bat so it now downloads the Python version from python.org rather than having it in the repo (bad style). It uses PowerShell, which means it will not work on Windows XP out of the box. I've also removed the msi file from the git history (to keep the clone from being unnecessarily large). This means people who want to use it and had a clone of the git repo before should either clone a fresh copy or do

git fetch
git reset --hard origin/master

Whatever they prefer.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 16 Jun 2015 08:20 UTC I set up https://modelica.readthedocs.org/en/latest/ to compile the document (each new commit triggers a build). It's using mathjax instead of SVG images and it does not use makefiles to compile the document (which could make it tricky to convert image formats). But it required almost no changes or setup.

The download option for the html seems to be "singlehtml", but it has pdf and epub as well.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 18 Jun 2015 14:36 UTC It seems this discussion contains several misunderstandings, and to me several requirements were missing, since it was based on a simplified view of the current work-flow. Thus we need to restart this. I'm not saying that the current work-flow is ideal, just that we need a better proposal.

This is a common problem. The broad details of a new paradigm attract more attention than the nitty-gritty details – leading to uninformed decisions; this is also important for other development – and should be reflected in the MCP-workflow.

First a minor note: I actually didn’t realize that we use the old Word-format (since my view has primarily been about editing the document inside a tool (Word) and looking at the generated document). I assume everyone who has Word has access to a recent version that supports the new formats, and it has important implications (see below). However, I have not switched the document yet.

The current work-flow for updating the specification is (very simplified):

1. A new version is created
2. For each issue (all editing is normally done with ‘track changes’ on)
   2.1 A ticket/MCP is created for an issue by anyone
   2.2 Language group decides on a fairly complete text and to add it to the specification
   2.3 An editor adds the text – and any possible cross-references, and also adds it to the list of changes for the current revision – referencing the section (if multiple ones just the “primary” one) and ticket. And verifies that it looks good.
   2.4 The updated version is committed with reference to ticket number on svn. Sometimes this is done in batches, since waiting a few seconds between each change is cumbersome.
3. A Modelica meeting reviews the complete document, and accepts the changes one-by-one.

I tried to perform this for a number of tickets (except step 3 (of course) – and with considerable help in steps 2.1 and 2.2 :-)); just to ensure that I didn’t miss anything. However, due to web-server problems I couldn’t do as much as planned.

Thus the following are also essential for the work-flow (there are also a number of nice-to-have like easy-to-use):

* Simple way of making cross-references (using a graphical user-interface). If every section had a good stable reference for deep-linking it would be simpler; but finding good stable names for all existing sections will be quite time-consuming (numbers are clearly a bad idea, in case we add a new section, but section-heading as used currently in the proposal is not stable enough for me – and you still need an integrated tool for referencing them etc).
* Short turn-around time when updating the document (less than a second ideally; i.e. even faster than current process)
* Integrated checks for grammar and spelling
* Automatic handling of images (not only equations) – the simplest solution is to just have one file with integrated images as source. One file also makes it more convenient to move text-sections between different sections.
* Final review of changes:
  - There could be some way of reviewing multiple changes in a straightforward way. A problem with the current work-flow is that if we reject a change in the word-document the ticket is not automatically re-opened (note that this is very rare – but has happened). A problem with having them as requests that should be merged is that there are often multiple issues in the same paragraph – thus the merging may introduce errors.
  - Alternatively, we could more regularly do final updates of the document. Regular updates reduce the risk of merge-conflict, but both require a way of regularly polling for proposed changes; and that people regularly actually review the changes. This possible change of the work-flow would need to be evaluated and discussed more.

There are a number of improvements that are straightforward for the specification work:

* We shouldn’t create new directories for each version
* Don’t use binary files in a version control system – especially not zip-files (unless the system can handle them).

To clear up some misunderstandings etc:

* Contrary to some claims: LaTeX, Word, Html (and probably many other formats) primarily have a separation between design (style-sheet, word templates etc) and the actual document.
* Tools are not file-formats. It’s a many-to-many mapping: e.g. older Word-files are ".doc" and newer Word-files can either be ".docx" or ".xml" – the text is technically traceable in the second format (but I don’t know if the other 90% could be reduced – if so it might be a solution, otherwise not). The new formats are just two different ways of storing the same contents; there does not seem to be any conversion (this is different from Word editing rtf or html). There are also supposedly other tools handling those file-formats.
* There are a number of options for generating ePub from Word (until it becomes standard – similarly as pdf is now), e.g.: http://manual.calibre-ebook.com/faq.html#what-formats-does-app-support-conversion-to-from - taking a page-oriented document and making it into free-flowing one is not that hard; I remember doing it for man-pages 20 years ago. (And calibre can reluctantly do it for pdf; it seems to have solved the issues I found when I tried another tool from the generated pdf.)

Thinking more it seems that some xml-based storing of the document would be ideal:

* Text-based:
  - Allowing both a GUI for editing the “source” (with syntax highlighter, tab-completion etc) and for “wysiwyg”.
  - Traceability
  - Small diffs in version control system
* Clear separation of contents and meta-information: you don’t have to guess about meta-information; and there are existing tool-chains for extracting parts
* Allows arbitrary data (for images etc)

Now we just need to find a good one.

The default xml-format in Word seems too verbose (but lacking line-breaks!), and even if an "Open Standard" there are not that many tools supporting it (some claim that google-docs and some others do – I haven’t checked). It might be that I am mistaken – or that it would be possible to clean up the document to reduce the Word-overhead. However, it sort of satisfies the requirements – it is just "too much".

E.g., I tried doing some simple changes and a naïve text-diff doesn’t work (due to the lack of line-breaks). Replacing space by line-break using sed 's/ /\n/g' on the xml-files produces other xml-files that seem identical when opened in Word and actually have a diff substantially smaller than the document. Obviously that processing should be done properly (is there a setting inside Word??) – or at least verified to be correct according to the XML-spec. I don’t know exactly what svn can do regarding this (and even if it matters); I'm aware that svn can handle diffing etc for Word-files.
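The effect described above can be reproduced in a few lines. The following sketch mirrors the sed command with str.replace and measures the size of a unified diff before and after the line-break insertion. The XML fragment is a made-up, single-line stand-in for Word's output, and the same caveat applies: turning every space into a line break is not guaranteed to preserve the document's meaning.

```python
import difflib

def split_on_spaces(xml_text):
    # Same transformation as the sed command in the text: each space becomes a line break.
    return xml_text.replace(" ", "\n")

def diff_size(before, after):
    """Total number of characters in a unified diff without context lines."""
    diff = difflib.unified_diff(
        before.splitlines(), after.splitlines(), lineterm="", n=0
    )
    return sum(len(line) for line in diff)

# Hypothetical one-line XML before and after a one-word edit.
before = '<w:p><w:r><w:t xml:space="preserve">terminated by a single quote</w:t></w:r></w:p>'
after = '<w:p><w:r><w:t xml:space="preserve">terminated by a double quote</w:t></w:r></w:p>'

raw = diff_size(before, after)
split = diff_size(split_on_spaces(before), split_on_spaces(after))
print(raw, split)
```

Without line breaks the diff has to repeat the whole line twice; with them, only the changed token appears.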

On the other hand several of the requirements are not met by Sphinx and reStructuredText, and to me it simply is a step backward in terms of usability. I'm not saying that it cannot be used; just that it doesn't seem like progress.

Furthermore, the ambiguities of markdown seem to introduce errors: e.g. in the online editor (http://rst.ninjs.org/) the following looks good:

terminated by a single quote, e.g. '12H',

In the generated page it says:

terminated by a single quote, e.g. ‘12H’,

The latter (using matching single quotes) is clearly wrong, since Modelica only has one single-quote symbol. The equation handling also seems broken in the generated document – part of the image is missing.
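One way to catch this kind of substitution is to scan the generated text for typographic quotes. The sketch below is an assumption about how such a check could look, not part of the proposed toolchain; the function names are made up for the example. Newer Sphinx releases reportedly also offer a smartquotes configuration value to switch the substitution off, which would need to be verified for the version in use.

```python
# Typographic quote characters and the ASCII characters they replace.
TYPOGRAPHIC_QUOTES = {
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
}

def find_typographic_quotes(text):
    """Return (position, character) pairs for every typographic quote in text."""
    return [(i, ch) for i, ch in enumerate(text) if ch in TYPOGRAPHIC_QUOTES]

def restore_ascii_quotes(text):
    """Replace typographic quotes by their ASCII counterparts."""
    return text.translate(str.maketrans(TYPOGRAPHIC_QUOTES))

rendered = "terminated by a single quote, e.g. \u201812H\u2019,"
print(find_typographic_quotes(rendered))
print(restore_ascii_quotes(rendered))
```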

BTW: The previously proposed open document format ("odt") seems far from the solution as well.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 18 Jun 2015 16:03 UTC Replying to [comment:19 hansolsson]:

Thus the following are also essential for the work-flow (there are also a number of nice-to-have like easy-to-use):

Assuming you want 100% of the old work-flow. Which I thought we did not.

* Simple way of making cross-references (using a graphical user-interface). If every section had a good stable reference for deep-linking it would be simpler; but finding good stable names for all existing sections will be quite time-consuming (numbers are clearly a bad idea, in case we add a new section, but section-heading as used currently in the proposal is not stable enough for me – and you still need an integrated tool for referencing them etc).

I think this is more of a nice-to-have. If you want good permanent labels, you need to create the names yourself.
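In reStructuredText, a hand-written label is an explicit target of the form ".. _name:" placed before a heading, which Sphinx can then reference with the :ref: role regardless of later changes to the heading text or numbering. As a rough sketch of how the remaining work could be tracked, the following script lists headings that have no such label. It is a simplification that only handles underline-style headings, and the sample text and label name are invented for the example.

```python
import re

# An explicit target such as ".. _lexical-structure:".
LABEL = re.compile(r"^\.\. _(.+):\s*$")
# A heading underline: one punctuation character repeated at least three times.
UNDERLINE = re.compile(r"^([=\-~^#*+])\1{2,}\s*$")

def headings_without_label(rst_text):
    """Return the titles of underlined headings not directly preceded by a label."""
    lines = rst_text.splitlines()
    missing = []
    for i in range(1, len(lines)):
        title = lines[i - 1].strip()
        is_heading = (
            title
            and not UNDERLINE.match(title)
            and UNDERLINE.match(lines[i])
            and len(lines[i].rstrip()) >= len(title)
        )
        if not is_heading:
            continue
        # Walk back over blank lines to the nearest preceding content line.
        j = i - 2
        while j >= 0 and not lines[j].strip():
            j -= 1
        if j < 0 or not LABEL.match(lines[j]):
            missing.append(title)
    return missing

sample = (
    ".. _lexical-structure:\n\n"
    "Lexical Structure\n=================\n\n"
    "Text.\n\n"
    "Comments\n--------\n\n"
    "More text.\n"
)
print(headings_without_label(sample))
```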

* Short turn-around time when updating the document (less than a second ideally; i.e. even faster than current process)

I get 0.6s to re-compile after editing lexical.rst...

* Integrated checks for grammar and spelling

I know of no tool that does good grammar checking in Linux. All text editing tools I know of do spell checking though.

* Automatic handling of images (not only equations) – the simplest solution is to just have one file with integrated images as source. One file also makes it more convenient to move text-sections between different sections.

That's horrible on so many levels. There is a good reason why the first thing I did was split the document into multiple files.

And as soon as you have images (binary files) embedded in the textual source file? Version control software will scream. Not to mention you want multiple representations of the images (pdf/svg/png) for the different targets.

* Final review of changes: * There could be some way of reviewing multiple changes in a straightforward way. A problem with the current work-flow is that if we reject a change in the word-document the ticket is not automatically re-opened (note that this is very rare – but has happened). A problem with having them as request that should be merged is that there are often multiple issues in the same paragraph – thus the merging may introduce errors. * Alternative we could more regularly do final updates of the document. Regular updates reduce the risk of merge-conflict, but both require a way of regularly polling for proposed changes; and that people regularly actually review the changes. This possible change of the work-flow would need to be evaluated and discussed more.

There are a number of improvements that are straightforward for the specification work: * We shouldn’t create new directories for each version * Don’t use binary files in a version control system – especially not zip-files (unless the system can handle them).

To clear up some misunderstandings etc: * Contrary to some claims: LaTeX, Word, Html (and probably many other formats) primarily have a separation between design (style-sheet, word templates etc) and the actual document.

Word documents quite often seem to have paragraphs in different fonts than other paragraphs just because someone copy-pasted text and did not re-apply the default style...

* Tools are not file-formats. It’s a many-to-many mapping: e.g. older Word-files are ".doc" and newer Word-files can either be ".docx" or ".xml" – the text is technically traceable in the second format (but I don’t know if the other 90% could be reduced – if so it might be a solution, otherwise not). The new formats are just two different ways of storing the same contents; there does not seem to be any conversion (this is different from Word editing rtf or html). There are also supposedly other tools handling those file-formats. * There are a number of options for generating ePub from Word (until it becomes standard – similarly as pdf is now), e.g.: http://manual.calibre-ebook.com/faq.html#what-formats-does-app-support-conversion-to-from - taking a page-oriented document and making it into free-flowing one is not that hard; I remember doing it for man-pages 20 years ago. (And calibre can reluctantly do it for pdf; it seems to have solved the issues I found when I tried another tool from the generated pdf.)

A better question is does it create nice HTML documents that people actually want to read?

Thinking more it seems that some xml-based storing of the document would be ideal: * Text-based: * Allowing both a GUI for editing the “source” (with syntax highlighter, tab-completion etc) and for “wysiwyg”. * Traceability

For a subset of XML. If there is no canonical representation that you commit, it all goes to hell.

* Small diffs in version control system

See above. Note that manually editing XML-files is painful (which is why the ASCIIdoc representation exists; but I did not find any satisfactory toolchains for either docbook or asciidoc).

* Clear separation of contents and meta-information: you don’t have to guess about meta-information; and there are existing tool-chains for extracting parts * Allows arbitrary data (for images etc) Now we just need to find a good one.

I thought we did.

The default xml-format in Word seems too verbose (but lacking line-breaks!), and even if an "Open Standard" there are not that many tools supporting it (some claim that google-docs and some others do – I haven’t checked). It might be that I am mistaken – or that it would be possible to clean up the document to reduce the Word-overhead. However, it sort of satisfies the requirements – it is just "too much".

E.g., I tried doing some simple changes and a naïve text-diff doesn’t work (due to the lack of line-breaks). Replacing space by line-break using sed 's/ /\n/g' on the xml-files produces other xml-files that seem identical when opened in Word and actually have a diff substantially smaller than the document. Obviously that processing should be done properly (is there a setting inside Word??) – or at least verified to be correct according to the XML-spec. I don’t know exactly what svn can do regarding this (and even if it matters); I'm aware that svn can handle diffing etc for Word-files.

svn cannot diff Word-files. Tortoise-svn calls Word to do diffs to account for svn's complete disregard for diffing such files.
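To make the line-break experiment quoted above concrete, here is a minimal Python sketch. It assumes that whitespace between adjacent tags is not significant in the files being compared, which would have to be verified for Word's XML. The two XML fragments are hypothetical and are not real Word output.

```python
import difflib
import re


def break_between_tags(xml_text: str) -> str:
    """Insert a newline between directly adjacent tags."""
    return re.sub(r"><", ">\n<", xml_text)


def changed_lines(old: str, new: str) -> list:
    """Return only the added and removed lines of a line-oriented diff."""
    diff = list(difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     lineterm="", n=0))
    # The first two entries are the file headers; "@@" lines mark hunks.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Hypothetical single-line XML fragments (illustration only).
old_xml = ("<doc><p><t>terminated by a quote</t></p>"
           "<p><t>second paragraph</t></p></doc>")
new_xml = ("<doc><p><t>terminated by a single quote</t></p>"
           "<p><t>second paragraph</t></p></doc>")

# Without line breaks the whole document shows up as changed.
print(changed_lines(old_xml, new_xml))
# With line breaks only the edited element is reported.
print(changed_lines(break_between_tags(old_xml), break_between_tags(new_xml)))
```

A line-oriented diff can only be as fine-grained as the line structure of the file, which is why a single-line XML file always diffs as a whole.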

On the other hand several of the requirements are not met by Sphinx and restructuredText, and to me it simply is a step backward in terms of usability. I'm not saying that it cannot be used; just that it doesn't seem like progress.

I'm pretty sure it handles the requirements. It's also very easy to extend the toolchain to do whatever you need.

Furthermore, the ambiguities of markdown seem to introduce errors: e.g. in the online editor (http://rst.ninjs.org/) the following looks good: terminated by a single quote, e.g. '12H', In the generated page it says: terminated by a single quote, e.g. ‘12H’, The latter (using matching single quotes) is clearly wrong, since Modelica only has one single-quote symbol. The equation handling also seems broken in the generated document – part of the image is missing.

You can disable this for sphinx if you like. Or render it as code ('12H') to make it more clear.
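As a sketch of what disabling could look like, the fragment below would go into the project's conf.py. Which option applies depends on the installed Sphinx version, so both are shown; treat this as an assumption about the setup rather than the configuration actually used for the demo.

```python
# conf.py (fragment): keep straight quotes such as '12H' in the output.

# Sphinx 1.6.6 and later: switch off the conversion to typographic quotes.
smartquotes = False

# Older Sphinx releases used this HTML-only option instead.
html_use_smartypants = False
```

The alternative mentioned above, rendering the text as code, needs no configuration: writing the token as an inline literal (double backquotes around '12H') leaves the quotes untouched.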

BTW: The previously proposed open document format ("odt") seems far from the solution as well.

It's the same thing as Word.

modelica-trac-importer commented 6 years ago

Comment by choeger on 18 Jun 2015 16:05 UTC I disagree with the requirements in the following points:

Cross references do not require a graphical user interface
Turn-around time hardly matters as long as it is not hours, since hardware gets faster every year
Checks for grammar and spelling are necessary, but they need not be integrated; they can run as separate, automatic processes
(Bitmap or Vector) images should exist as separate files so they can be version controlled and re-used
The selected format should be mergeable by default
Version control should be done with state-of-the-art tools and not some half-baked, proprietary solution of one particular editor.
Merging, discussions and forks should be dealt with as close as possible to the version control system and as far as possible from the editor

Regarding the XML proposal: It is folklore that XML did not meet all its goals, since it is hardly human-readable/writable. This is precisely the reason why things like markdown exist. Second, there should be no style information at all in the specification, only semantic information that can be given a certain look depending on the backend. Word-XML almost certainly does not meet this requirement.

Conclusion: We are talking about collaboratively maintaining a large piece of structured text. We are not in the business of typesetting. All this talk about graphical user interfaces just obscures the fact that Word is not a collaborative tool and is also not intended for structured text but for fast typesetting. Writing Word documents instead of clearly structured text is a decision against collaboration.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 18 Jun 2015 17:45 UTC Hans, just a small demonstration of how little use the traceability with the Word document is. You've just committed three fixes with rather non-descriptive commit messages. OK, that can happen in a rush, but now the problem is that one has to jump through hoops in order to find out what changes were made.

The log only says:

 ------------------------------------------------------------------------
r8302 | hansolsson | 2015-06-18 17:01:53 +0200 (to., 18 juni 2015) | 2 lines
'Resolve

----

r8301 | hansolsson | 2015-06-18 16:43:41 +0200 (to., 18 juni 2015) | 2 lines
'Resolve

----

r8300 | hansolsson | 2015-06-18 16:43:05 +0200 (to., 18 juni 2015) | 2 lines
'New

If the spec had been in rst format it would have been trivial to see with a click of a mouse what the changes are (even highlighted in rendered fashion) as we demonstrated.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 22 Jun 2015 08:49 UTC Replying to [comment:20 sjoelund.se]:

Replying to [comment:19 hansolsson]:

Thus the following are also essential for the work-flow (there are also a number of nice-to-have like easy-to-use):

Assuming you want 100% of the old work-flow. Which I thought we did not.

The topic on the agenda was to update the process; and we should start by actually discussing that. I thought it good to actually describe the current one.

In general I would assume we proceed as follows for updating the process:

Investigate the current process.
See which parts are essential, and which parts are missing
Try to find an updated process
See what tools and file-formats can be used in the updated process.

Now it seems we are changing the file-format, and shoe-horning the process into the new file-format; without even fully knowing the current process. That cannot be a good process.

* Simple way of making cross-references (using a graphical user-interface). If every section had a good stable reference for deep-linking it would be simpler; but finding good stable names for all existing sections will be quite time-consuming (numbers are clearly a bad idea, in case we add a new section, but section-heading as used currently in the proposal is not stable enough for me – and you still need an integrated tool for referencing them etc).

I think this is more of a nice-to-have. If you want good permanent labels, you need to create the names yourself.

I view "simple way of making cross-references" as more than a nice-to-have.

Added clarification: The GUI-requirement was to have a suitable GUI for the key operations; this one should be included.

However, permanent labels may not be required: The requirement is that there is a simple process for using the labels (e.g. clicking on section-headings in the table-of-contents), and that the labels are automatically kept consistent.

Whether the labels are permanent or the tool will update them automatically doesn't matter. I assume Word uses the second approach; and I know that Dymola (and hopefully other Modelica-tools) uses something similar to the second approach for referencing classes in Modelica.
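For a text-based source, one way to keep labels consistent is an automatic check rather than a GUI. The following Python sketch scans reStructuredText sources for `:ref:` targets that no file defines. The file names and label names are hypothetical, the regular expressions cover only the common forms of labels and references, and Sphinx itself also warns about undefined labels during a build.

```python
import re

# ".. _label-name:" on a line of its own defines a label.
LABEL_RE = re.compile(r"^\.\. _([^:`]+):\s*$", re.MULTILINE)
# :ref:`title <label-name>` or :ref:`label-name` uses a label.
REF_RE = re.compile(r":ref:`(?:[^`<]*<([^`>]+)>|([^`<]+))`")


def undefined_refs(sources: dict) -> set:
    """Return :ref: targets used in the sources that no file defines."""
    labels, targets = set(), set()
    for text in sources.values():
        labels.update(name.strip().lower() for name in LABEL_RE.findall(text))
        for explicit, bare in REF_RE.findall(text):
            targets.add((explicit or bare).strip().lower())
    return targets - labels


# Hypothetical file contents, keyed by file name.
sources = {
    "functions.rst": ".. _section-funcall-vectorized:\n\n"
                     "Vectorized calls\n----------------\n",
    "intro.rst": "See :ref:`vectorized calls <section-funcall-vectorized>` "
                 "and :ref:`section-funcall-pos-named-arguments`.\n",
}

print(undefined_refs(sources))
```

Such a check could run on every commit, so a renamed or removed label is reported before the document is published.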

* Short turn-around time when updating the document (less than a second ideally; i.e. even faster than current process)

I get 0.6s to re-compile after editing lexical.rst...

Good.

* Integrated checks for grammar and spelling

I know of no tool that does good grammar checking in Linux. All text editing tools I know of do spell checking though.

That's a pity. Are there any non-Linux tools that do it?

* Automatic handling of images (not only equations) – the simplest solution is to just have one file with integrated images as source. One file also makes it more convenient to move text-sections between different sections.

That's horrible on so many levels. There is a good reason why the first thing I did was split the document into multiple files.

And as soon as you have images (binary files) embedded in the textual source file? Version control software will scream. Not to mention you want multiple representations of the images (pdf/svg/png) for the different targets.

I agree that it will be horrible if we want to track changes in the images (but I don't know any good tools for that).

For other cases (and assuming the "binary" representation is stable) it shouldn't be a problem: since the "diff" part of the version-control system will see a large identical block.

To clear up some misunderstandings etc: * Contrary to some claims: LaTeX, Word, Html (and probably many other formats) primarily have a separation between design (style-sheet, word templates etc) and the actual document.

Word documents quite often seem to have paragraphs in different fonts than other paragraphs just because someone copy-pasted text and did not re-apply the default style...

Yes. Word has a number of problems.

Many of them can be worked around by disabling some options (automatic language detection) - or doing things differently (e.g. "Paste as Unformatted text").

Some of the other issues I don't know good ways of working around; it could be that someone else knows how - or that it cannot be done.

I'm not saying that Word is perfect, or that we necessarily have to use it - only that several of the features are actually useful, and that switching to another tool without them would be a step backward.

* Tools are not file-formats. It’s a many-to-many mapping: e.g. older Word-files are ".doc" and newer Word-files can either be ".docx" or ".xml" – the text is technically traceable in the second format (but I don’t know if the other 90% could be reduced – if so it might be a solution, otherwise not). The new formats are just two different ways of storing the same contents; there does not seem to be any conversion (this is different from Word editing rtf or html). There are also supposedly other tools handling those file-formats. * There are a number of options for generating ePub from Word (until it becomes standard – similarly as pdf is now), e.g.: http://manual.calibre-ebook.com/faq.html#what-formats-does-app-support-conversion-to-from - taking a page-oriented document and making it into free-flowing one is not that hard; I remember doing it for man-pages 20 years ago. (And calibre can reluctantly do it for pdf; it seems to have solved the issues I found when I tried another tool from the generated pdf.)

A better question is does it create nice HTML documents that people actually want to read?

I primarily looked at generating ePub; I would have to investigate html more.

* Clear separation of contents and meta-information: you don’t have to guess about meta-information; and there are existing tool-chains for extracting parts * Allows arbitrary data (for images etc) Now we just need to find a good one.

I thought we did.

I disagree, since I don't think we had understood the process we want to support.

svn cannot diff Word-files. Tortoise-svn calls Word to do diffs to account for svn's complete disregard for diffing such files.

I assumed that, but I was also interested in knowing whether svn-diff is line or character-oriented - since it matters for Word-xml-files.

Furthermore, the ambiguities of markdown seem to introduce errors: e.g. in the online editor (http://rst.ninjs.org/) the following looks good: terminated by a single quote, e.g. '12H', In the generated page it says: terminated by a single quote, e.g. ‘12H’, The latter (using matching single quotes) is clearly wrong, since Modelica only has one single-quote symbol. The equation handling also seems broken in the generated document – part of the image is missing.

You can disable this for sphinx if you like. Or render it as code ('12H') to make it more clear.

Yes, but the point is that if the alleged wysiwyg-tool renders it the wrong way we will not see that it is necessary.

That's a severe problem for me, and wysiwyg (even if not 100% perfect, e.g. different ligatures, or orphan-handling doesn't matter) avoids those surprises.

Word also has the same annoying habit of changing quotes - I don't like that either. However, with Word you see that it has happened, undo fixes it, and it can be turned off.

Replying to [comment:22 dietmarw]:

Hans, just a small demonstration how not useful the traceability with the Word document is. You've just committed three fixes with rather non-descriptive commit messages.

Sorry, that was due to command-line svn skipping everything in the message after the first word. I do recall that I used messages without space in the past, guess that explains it.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 26 Jun 2015 10:18 UTC The attached ePub was constructed by simply doing (on a working copy of 3.4):

Save as docx from Word
Convert to ePub in Calibre; no special settings.

Nothing else (and it is also possible to convert to/from a number of other file-formats, including some markdown variant).

I (and others) found it easy to read - but it is not perfect.

In particular if we want to support small screens (iPhone as an extreme) we need to rethink some parts:

Tables with lots of text do not work. There is an option for generating Inline-tables in Calibre - the results are mixed.
Modelica code (in particular with end-of-line comments) is hard to read. Note that it is important to have the code visually integrated with the rest of the text (since the rest of the text often refers to the code samples).

Note that these issues are not related to file-formats and tools, but simply due to the fact that the screen is not wide enough.

I believe we can rethink this - independent of the file-format and tools with the aim of making the specification easier to use:

Replace tables for der etc with sub-sub-sections. This makes it easier to refer to them, and include examples. I believe that will increase the quality of those texts - since the texts are no longer as narrow. The document may become slightly longer (doubtful) - but I would view it as an improvement.

Replace the end-of-line comments in Modelica-code-examples with comments after the examples. This allows the comments to be more verbose and easier to read (variable-width font). If we are concerned that people will copy the examples and try to run them we could add one description/comment saying "Invalid code" or "Bad Style" when appropriate. Alternatives would be to use traditional C-comments, or description strings. That might be useful in some cases - the alternative of just relying on the formatting does not seem good to me.

(There was also an issue with some images, whereas others worked, I don't know why.)

BTW: I was surprised by how readable the html in the ePub-file was; could be that I compared to what Word normally generates.

modelica-trac-importer commented 6 years ago

Comment by otter on 29 Jun 2015 08:34 UTC

At the last Modelica Design Meeting, about 13 requirements for a new document format were collected. Find below a discussion about what I see as the three most important ones. One also has to take into account that no perfect text processing system exists, so one always has to accept significant drawbacks, regardless of which solution is selected. In a separate comment, I will list other variants for a possible new document format.

Key Requirements

1. Good HTML, no pdf

The Modelica Specification is currently provided in pdf format. It is awkward to inspect it. I have the feeling it is possible to get large agreement by the MA members to only support the specification in html format in the future in two ways: (a) Directly on the web page, with sub-webpages, (b) all web-pages zipped in one file, so that people can download the zip-file, unpack and inspect the specification locally with a web browser. I do not think that pdf is needed anymore. If we explicitly abandon the pdf format, we have more options, less trouble and less work.

Martin (S.) made already a very nice html version for a subset of the specification for demonstration:

https://dev.openmodelica.org/~marsj/MLS-rst/html/

The essential thing missing for me is just that chapter/section numbers should also be included (in order that it is possible to say: "this is discussed in chapter 12", instead of "this is discussed in ").

Another good example is of course Mikes Modelica book: http://book.xogeny.com/

Note that many of the new programming languages from recent years also do it this way (no pdf file anymore). Here are some examples:

Go (from Google): http://tour.golang.org/welcome/1 (with online editing and execution)
Rust (from Mozilla): http://rustbyexample.com/hello.html (with online editing and execution)
Dart (from Google): https://www.dartlang.org/docs/
TypeScript (from Microsoft): http://www.typescriptlang.org/Handbook

So, let's first try to get agreement to use only html for the official Modelica Language Specification in the future.

Best would be to potentially use all new (nice web) features, so HTML5, CSS3, Javascript, svg, webgl

2. What-You-See-Is-What-You-Get editing

Here there are different opinions and it seems not possible to get agreement, so we need somehow to find a compromise. There is a set of people (and I belong to it) that find this very important. Especially two features are needed:

Show available commands: If a document processing system is not used in daily work, it is too time consuming to figure out which commands/markups to use (e.g. how to mark a text as "bold", or what the escape character is etc.). If a graphical user interface with the available commands/markups is available, it becomes at least reasonable to work in this case. Most likely, someone using, say, a markup language in his/her daily work is much faster with direct plain text editing. However, someone who is not using this in his/her daily work is surely much slower.
Interactive rendering (to find and fix errors quickly): When working with plain text with markups, one always has to first process the plain text and inspect it again to find potential errors (e.g. "blank" at the wrong place, wrong/missing escape character, error in markup key word, figure not at the place where it is expected, table rows/columns not as expected, etc.). This means one always has to read the text at least twice (in plain text and rendered version) and, once an error is detected, go back, try to fix the error and do the whole process again. With WYSIWYG editors (even if not the exact final rendering is presented) one has immediate feedback and can find and fix errors at the place where one is typing. So, this is much more efficient.

3. Collaboration

The goal is to use a version control system also for the Modelica Language specification for all parts (not just for the whole document). The goal is to have more collaborative work so that people can propose changes or fixes to the specification and with one click this proposal is included in the actual document. Also the current feature should be supported, that the precise difference between the actual and the previous version of the specification is displayed in a reasonable way in the renderer (such as in previous versions where two types of pdfs have been provided, one with and one without change-marks).

This is most likely the part where most compromises have to be accepted: The available features depend heavily on the underlying tools and to my understanding there is no tool available that fulfills all needs.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 29 Jun 2015 09:10 UTC Replying to [comment:25 otter]:

The essential thing missing for me is just that chapter/section numbers should also be included (in order that it is possible to say: "this is discussed in chapter 12", instead of "this is discussed in ").

You can get this, and it is quite simple (https://modelica.readthedocs.org/en/numbered/):

diff --git a/source/index.rst b/source/index.rst
index b8b3547..b7cd8b5 100644
--- a/source/index.rst
+++ b/source/index.rst
@@ -10,6 +10,7 @@ Contents:

 .. toctree::
   :maxdepth: 2
+  :numbered:

   MCP
   introduction

However, the automatically generated references to the section seem to only use the name (can get numeric references only to tables, listings, figures).

If the entire document is in html, it is possible to reformulate such references, like:

It is also possible to define functions and call them in a normal fashion. The function call syntax for both positional and named arguments is described in Section 12.4.1 and for vectorized calls in Section 12.4.4.

To something like:

It is also possible to define functions and call them in a normal fashion, using :ref:`both positional and named arguments <section-funcall-pos-named-arguments>` and using :ref:`vectorized calls <section-funcall-vectorized>`.

I must say all those numbers mostly clutter the output though.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 29 Jun 2015 09:14 UTC https://github.com/sphinx-doc/sphinx/issues/326 seems to have a way to customize referencing section numbers. I will try it out.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 29 Jun 2015 09:17 UTC Never mind. I just read the end of the comment. The functionality is still missing. Probably possible to make something custom working for the pdf output, but I am unsure about the html.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 29 Jun 2015 10:16 UTC Replying to [comment:25 otter]:

Just some interesting information: the example sites you listed provide HTML as documentation, but none of them use HTML as the source format:

* Go (from Google): http://tour.golang.org/welcome/1 (with online editing and execution)

Source format: Markdown (see https://github.com/golang/tour)

* Rust (from Mozilla): http://rustbyexample.com/hello.html (with online editing and execution)

Source format: Markdown (see https://github.com/rust-lang/rust-by-example)

* Dart (from Google): https://www.dartlang.org/docs/

Source format: Markdown (see https://github.com/dart-lang/www.dartlang.org/tree/master/src/site/docs)

* TypeScript (from Microsoft): http://www.typescriptlang.org/Handbook

Source format: Markdown (see https://github.com/Microsoft/TypeScript-Handbook)

So as you can see, none of these sites use HTML as source format, only as generated display format, and for good reason. Please feel free to look for other modern programming languages using HTML files as source; I'm pretty sure you won't find many, if any. We should not fall into the trap of trying to do things better by using HTML as source when nobody else does, but rather learn from others, i.e., use mark-up languages. Michael Tiller especially, and then Martin Sjölund, have invested a lot of time and thought into which markup language is powerful enough.

Just some extra insight. People who really know the web, like Douglas Crockford, just recently talked at an Angular JS conference and also stated that HTML is a "terrible language for technical documents and which is why Markdown was invented". https://www.youtube.com/watch?v=6UTWAEJlhww&feature=youtu.be&t=13m17s

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 29 Jun 2015 11:52 UTC Replying to [comment:29 dietmarw]:

Replying to [comment:25 otter]:

* TypeScript (from Microsoft): http://www.typescriptlang.org/Handbook Source format: Markdown (see https://github.com/Microsoft/TypeScript-Handbook)

So as you can see none of these site use HTML as source format but as generated display format.

But the above links are all tutorials/handbooks, not specifications.

Side-track about tutorial: Originally we had a tutorial for Modelica as well, but it was too much effort to maintain it (and an obsolete tutorial is bad) - especially now that people externally make introductory books about Modelica. I don't know if the Modelica Association now has the resources for making a tutorial on Modelica.

The last one in the list (TypeScript) links to a specification: https://github.com/Microsoft/TypeScript/blob/master/doc/spec.md It looks very similar to our specification - and it is markdown.

But looking at the raw markdown, I would say that one of the following holds:

The markdown is generated from some other format.
There exists a smart editing tool for references in that markdown variant.
The editing procedure for that document is really bad.

Then looking at the file-history it seems they use a word2md package to generate it (specially designed for their document).

I don't know about the specifications of the other languages listed.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 29 Jun 2015 12:36 UTC Hans, the point I was simply trying to make is that HTML is not a suitable source format, neither for a tutorial nor for the specs themselves. I could start listing spec documents using markup languages but that does not help the issue here. Also I was not advocating for Markdown but merely showing that the HTML documentation for all those projects that Martin O. listed is just generated from a markup language (coincidentally Markdown). We've already established that Markdown is not powerful enough. Hence reStructured text is for now the best candidate (based on actual experience, mind).

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 29 Jun 2015 13:56 UTC In order to get back to the requirements as listed during the last 86th Design Meeting, I've created a Feature Matrix as Google Spread Sheet. I've activated commenting for everyone. Editing is for now restricted.

modelica-trac-importer commented 6 years ago

Comment by otter on 29 Jun 2015 14:26 UTC

Documentation with Sphinx and github

Based on the three basic requirements in comment:25, let's analyze the proposal to use Sphinx and github.

1. Good HTML

As demonstrated by Martin (S.), https://dev.openmodelica.org/~marsj/MLS-rst/html/, nice HTML is generated.

What needs to be clarified (but most likely not a big issue): Is it possible to include chapter/section numbers with reasonable effort (in order that it is possible to say: "this is discussed in chapter 12", instead of "this is discussed in ").

2. WYSIWYG Editor

On the web there are many requests for a Sphinx WYSIWYG Editor, but not many answers.

I first inspected the link that Mike provided where Sphinx is described in a short way:

http://openalea.gforge.inria.fr/doc/openalea/doc/_build/html/source/sphinx/rest_syntax.html

Obviously, you can do quite a lot of text formatting with re-structuredText, and also add formatting options. E.g. here is an example of how to add a figure with a particular width, height, alignment etc:

.. figure:: ../images/wiki_logo_openalea.png
    :width: 200px
    :align: center
    :height: 100px
    :alt: alternate text
    :figclass: align-center

Blanks and blank lines are essential and may change the rendering.

At first view, one critical part of re-structuredText is tables: It seems time-consuming to generate or change a table (adding one character in one cell seems to require changes in all other rows).
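The re-alignment effort can be avoided by generating the table text instead of typing it. The Python sketch below renders rows of strings as a grid table with column widths computed automatically; the function name and the sample cells are made up for illustration, and multi-line cells are not handled. Docutils also provides a `list-table` directive that takes the cells as nested bullet lists, so no column alignment is needed at all.

```python
def grid_table(header, rows):
    """Render a header and rows of strings as a reStructuredText grid table."""
    table = [list(header)] + [list(row) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(header))]

    def rule(char):
        return "+" + "+".join(char * (w + 2) for w in widths) + "+"

    def line(cells):
        padded = (" " + cell.ljust(w) + " " for cell, w in zip(cells, widths))
        return "|" + "|".join(padded) + "|"

    # The header row is separated from the body by a rule of "=" characters.
    out = [rule("-"), line(table[0]), rule("=")]
    for row in table[1:]:
        out += [line(row), rule("-")]
    return "\n".join(out)


# Sample cells are illustrative only.
print(grid_table(
    ["Expression", "Description"],
    [["der(x)", "Time derivative of x"], ["pre(y)", "Left limit of y"]],
))
```

Editing a cell then means changing one string in the row data and regenerating the table.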

As anticipated, it is just not doable to work with Sphinx and re-structuredText just from time to time. There are too many special rules and markups. It is different if you work with it in your daily work, or write a book like Mike did. From a user point of view, there is no principal difference to LaTeX, just that LaTeX has more commands to remember, and that re-structuredText uses standard characters (like "*" or "." or blanks) as mark-ups (and therefore one can never be sure that a plain text pasted into a re-structuredText file will be rendered as one expects it; e.g. if a URL has a blank, one has to escape this blank).

There seems to be no WYSIWYG editor for Sphinx, but only some for re-structuredText (as Mike already pointed out). This means, it is not possible to see the whole specification in one version that can be edited. One can probably live with this restriction that one can only edit one file with a WYSIWYG editor, and that the definition of the relationship between interconnected documents (corresponding to table-of-contents) has to be defined manually in a textual form.

I tried some WYSIWYG editors:

noTex https://www.notex.ch: Online editor in browser. I could input plain text, but then recognized that I first have to select the reStructured format, and it never allowed me to generate a new document. The major drawback is that the document is loaded somewhere in the "cloud" and edited in the "cloud".
ninjs http://rst.ninjs.org: Online editor. It seems one can paste (reStructured) text in the left window and get the rendering in the right window. It is good that simple markups can be selected in a menu. However, at least headings, tables, equations, images are missing (one can include an image only via a URL). It seems not to be practical (just for quickly experimenting the first time). It is also not clear how to store the defined text locally in a file (maybe not possible and the text is again stored in the cloud somewhere).
ReText http://sourceforge.net/projects/retext: This is a Python package and requires Python 3 (a non-backwards compatible version of Python; on my computer only Python 2 is installed). I tried to install it, but gave up after some time, because it was non-trivial (at least for me) and seemed to take too much time to install. Given the scant description "ReText is a simple editor that reads your text with MarkDown or HTML markup and saves it as plain text, HTML or PDF.", I am also not so motivated to install it, because re-structuredText is not even mentioned, and someone who is proud of his/her editor would make more effort to tell people what a nice program he/she implemented. One text string as documentation and summary is not a good sign.
ReST Editor http://marketplace.eclipse.org/content/rest-editor: Eclipse plug-in. Installed and tried it. But it offers only syntax highlighting (no help in editing, or rendering).
On github there are about 16 "reStructuredText editors" listed. Besides the ones above, all the others seem to be very limited and/or in a very early stage or dead.

To summarize, there seems to be no useful editor for reStructuredText available, so this requirement is not fulfilled.

3. Collaboration

I have not evaluated this myself yet (not enough time). As I understand from previous demonstrations and from https://gist.github.com/dupuy/1855764, reStructuredText is directly supported by github. Therefore, when working with github one gets support for rendering diffs of reStructuredText. So, it seems that collaborating via github on some fixes or smaller extensions to the specification would be much better than today.

However, the github way of showing differences is not suitable if the difference between two specification versions shall be shown. The question is whether there is a tool that can show the difference of two Sphinx projects and render the result in html. This is needed in order that a user can quickly understand all the differences between two specification versions.
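As a rough illustration of the kind of tool asked for here, the following sketch uses Python's standard `difflib.HtmlDiff` to render a side-by-side HTML comparison of two text sources. It compares the markup source, not the rendered specification, and it is only an assumption that something similar could be run per file over two Sphinx projects; the function name, labels and sample text are hypothetical.

```python
import difflib


def render_diff_html(old_text, new_text, old_label="old", new_label="new"):
    """Return a standalone HTML page showing a side-by-side diff of two texts."""
    differ = difflib.HtmlDiff(wrapcolumn=80)
    return differ.make_file(
        old_text.splitlines(),
        new_text.splitlines(),
        fromdesc=old_label,
        todesc=new_label,
        context=True,  # show only changed regions plus some surrounding lines
        numlines=2,
    )


# Hypothetical snippets standing in for one chapter file in two spec versions.
old = "Connectors\n==========\n\nA connector is a class.\n"
new = "Connectors\n==========\n\nA connector is a specialized class.\n"

html_page = render_diff_html(old, new, "version-A", "version-B")
# The page could then be written to disk, e.g. open("diff.html", "w").write(html_page)
```

A complete solution would still need to walk both source trees, pair up files, and link the per-file pages from an index, none of which is shown here.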

4. Summary

The proposal to use Sphinx in the future for the Modelica Specification is no option for me, because it would just mean that I (and probably others) can no longer contribute to the Modelica specification.

There might be slight variants that make it more practical: If Modelica 3.3 were converted to Sphinx and only minor fixes and improvements were made to the files, then it might be fine (also for me) to just edit the existing reStructuredText files. If larger changes are made, one could still work initially with Word for the affected sub-section or new chapter. Once agreement is reached, someone could convert it to reStructuredText and include it in the reStructuredText version (so just before the specification is released).

modelica-trac-importer commented 6 years ago

Comment by otter on 29 Jun 2015 14:32 UTC

Documentation with Word/OpenOffice/LibreOffice/GoogleDocs + HTML generation + svn

Let's first sketch the proposal and then analyse it:

The best WYSIWYG editors are the ones for office applications. So, one could use one of them as "mother" version and continue to edit and work as of today (one could evaluate whether another tool like OpenOffice, LibreOffice, GoogleDocs gets more acceptance).
For every new version of the specification, a pure html version of the specification is constructed. This is not fully automatic and requires manual work (examples are the transformation that Martin (S.) did via Sphinx, or what Hans did via Calibre/epub). Since this is done only once every 2 or 3 years, this seems to be acceptable. If an epub is generated (via Calibre) one additional step is needed: Epub is not good for a web page. However, the epub is just a zip-file containing standard html and image files. The essential point is that an additional file is present that defines the structure of the document. This file needs to be processed by an own script to generate a "table-of-contents" file which in turn has links to the different files and sections. One needs to evaluate whether the approach Word -> html -> Sphinx -> html is a more automated (better) one.

Analysis:

1. Good HTML

Fulfilled (similar to the Sphinx proposal)

2. WYSIWYG Editor

Best editing facilities (from all proposals)

Drawbacks

Word is not free and not all Modelica members have it.
Tried OpenOffice 4.1.1: When opening the Modelica 3.3 rev. 1 version, I did not see any formatting errors. Looked exactly as in Word. Converted to html in OpenOffice -> Very bad html was generated. Loaded in Calibre and generated epub: Section titles have been removed and html not so nice. Word output is much better.
Tried GoogleDocs: Seems to be nice, simple user interface with nice collaboration on the same document (people can edit it in parallel and revisions and who changed what can be seen). But: the GoogleDocs document cannot be saved locally on file, only various export documents are provided (html, docx). Local editing is only possible in Chrome (because Chrome automatically synchronizes with GoogleDocs when online again). By default, no section numbers. There is a third-party plugin to extend GoogleDocs with section numbers.

3. Collaboration

github does not seem to support the docx format specially. With Tortoise svn, a "diff" calls the local Word version, compares the version in the repository with the local version and shows the changes. This is not practical for single commits. One could reduce the issue a bit by splitting the specification into several files (one file per chapter). A diff showing the changes would then become a bit more practical (but not as nice as with the Sphinx/github solution). On the other hand, showing the difference between two versions is possible and nice.

The collaboration in GoogleDocs seems to be very nice: Several people can work on the same document at the same time. Different revisions from different people are nicely shown. People may have only "view" rights. However, they can propose a change, and the "author" can accept this change. All this is done in a purely graphical way. The drawback is that GoogleDocs cannot handle large documents. One would have to split the specification into several files. However, GoogleDocs seems to have no feature to state that several files together build one document.

Summary

Using the Word docx format as the "mother" version and then generating html semi-automatically for every released version would improve the current documentation process. However, collaboration would not be improved.

modelica-trac-importer commented 6 years ago

Comment by otter on 29 Jun 2015 14:40 UTC

Documentation with epub3 + website generation + github

Let's first sketch the proposal and then analyse it:

The largest community that bases its content creation on html is Epub. The newest version, epub3, additionally supports svg, MathML and Javascript. Epub is interesting because a lot of development work is being done. Since there is a huge commercial interest, good tools are available (for Epub2) or are becoming available (for Epub3). A reasonable open source epub2/3 editor is Calibre (http://calibre-ebook.com/download_portable). It requires xhtml input (so a restricted form of html), but provides most of the commands via a toolbar and shows the rendering result at once in a second window. So, not as good as Word, but much better than Sphinx. There are also commercial editors that have a true WYSIWYG editing facility (e.g. BlueGriffon, http://www.bluegriffon-epubedition.com/BGEE.html: it works similarly to a Modelica graphical tool by having a graphical layer where the definition is made as in Word, and a textual layer, and the user can switch between the two, so combining both worlds; it costs 245 Euro; there are also other commercial tools, this was just one example).

With epub, all layout options must be defined in CSS style files. From a user's point of view, it is similar to Sphinx/reStructuredText, because there is a markup language (xhtml) to define the structure of the document and a separate layout definition in CSS files. The big advantage is that the community is much larger than for Sphinx, and the commercial interest is also orders of magnitude larger (which means that money is invested in tools).

So, the proposal is to use Epub3 as the "mother" version of the document, because (a) reasonable editors are already available and many more can be expected in the future and (b) Modelica tools increasingly support html5 technology directly (e.g. see the latest Dymola release, which can export plots, object diagrams and 3D animations in html5/svg/webgl).

However, epub is not suited for a web page, which is the main purpose of the Modelica specification. Starting from epub, we need to build a script to generate a web page from it. This requires some work, but should not be so difficult: Epub is basically just a collection of (standard) html files collected in a zip-file. Furthermore, there is one special file that defines how the chapters belong together and how the navigation should be. This special file needs to be parsed and a table-of-contents html file generated that links to the individual html files. So that one can also navigate forward, back and up from the individual html files, one needs to add some Javascript code to these files that does this work.
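The parsing step described above could look roughly like the following sketch. It relies on the standard EPUB layout: `META-INF/container.xml` points to the package (OPF) file, whose manifest and spine give the content files in reading order. The function names and the shape of the generated page are assumptions for illustration; a real table of contents would take section titles from the epub's navigation document instead of file names, and the Javascript for forward/back/up navigation is not covered.

```python
import posixpath
import zipfile
import xml.etree.ElementTree as ET
from html import escape

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}
OPF_NS = {"opf": "http://www.idpf.org/2007/opf"}


def spine_documents(epub_file):
    """Return the content documents of an epub in reading order (paths inside the zip)."""
    with zipfile.ZipFile(epub_file) as zf:
        # container.xml names the package file that describes the book structure.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", CONTAINER_NS)
        opf_path = rootfile.get("full-path")
        package = ET.fromstring(zf.read(opf_path))
    # The manifest maps ids to files; the spine lists ids in reading order.
    manifest = {
        item.get("id"): item.get("href")
        for item in package.findall("opf:manifest/opf:item", OPF_NS)
    }
    base = posixpath.dirname(opf_path)
    return [
        posixpath.join(base, manifest[ref.get("idref")])
        for ref in package.findall("opf:spine/opf:itemref", OPF_NS)
    ]


def toc_html(hrefs, title="Table of contents"):
    """Build a minimal table-of-contents page linking to each content document."""
    items = "\n".join(
        f'  <li><a href="{escape(h, quote=True)}">{escape(posixpath.basename(h))}</a></li>'
        for h in hrefs
    )
    return (
        f"<html><head><title>{escape(title)}</title></head><body>\n"
        f"<ol>\n{items}\n</ol>\n</body></html>\n"
    )
```

Usage would be along the lines of unzipping the epub next to the generated page and writing `toc_html(spine_documents("spec.epub"))` to `index.html`, where `spec.epub` is a placeholder file name.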

I searched the web for tools for this last transformation (epub -> website), but did not find anything.

1. Good HTML

Fulfilled.

2. WYSIWYG Editor

Reasonable open source editors, good commercial editors. Not as good as Word, but much better than Sphinx. Contrary to Word, (standardized) vector graphics (svg), vector animation (webgl), video and audio can be included.

3. Collaboration

Not fully clear. In principle, every Epub editor unzips the epub file into a local (tool-specific) directory and then operates in this directory. If this cached directory is versioned on github, one could make commits on the individual epub text files (so html code). This is much better than Word, but not as good as reStructuredText, where the rendering is supported on github. Furthermore, it is only an assumption that this is possible. One needs to evaluate whether this works in practice.

Contrary to Word, it does not seem possible to compute the difference between two versions of the specification and show this difference in rendered form.

Summary

Needs more evaluation. It could be that this is the best option so far (but not fully clear yet).

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 29 Jun 2015 14:42 UTC Replying to [comment:34 otter]:

For every new version of the specification, a pure html version of the specification is constructed. This is not fully automatic and requires manual work (examples are the transformation that Martin (S.) did via Sphinx, or what Hans did via Calibre/epub). Since this is done only once every 2 or 3 years, this seems to be acceptable. If an epub is generated (via Calibre) one additional step is needed: Epub is not good for a web page. However, the epub is just a zip-file containing standard html and image files. The essential point is that an additional file is present that defines the structure of the document. This file needs to be processed by an own script to generate a "table-of-contents" file which in turn has links to the different files and sections. One needs to evaluate whether the approach Word -> html -> Sphinx -> html is a more automated (better) one.

I should note that in order to convert Word into HTML/epub as sphinx format (or HTML), you need to change the wording on some things. You also manually need to mark words in the index, etc. As such, it is very unlikely that you would end up with the same semi-manually translated document from one Modelica release until the next. Also, whenever you have a bug in the text, you need to figure out if it came from the Word-document or the translation procedure and then fix it in one or more places in different documents.

All cross-references also need to be updated and it would be unlikely that "permanent" links are now permanent.

modelica-trac-importer commented 6 years ago

Comment by otter on 29 Jun 2015 14:50 UTC Replying to [comment:24 hansolsson]:

The attached ePub was constructed by simply doing (on a working copy of 3.4):

Save as docx from Word

Convert to ePub in Calibre; no special settings.

Just in case: A reasonable epub-reader is the following Firefox Addon (when clicking on an epub-link, it opens the book inside Firefox):
http://www.epubread.com/en/

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 29 Jun 2015 15:01 UTC Replying to [comment:33 otter]:

At first glance, one critical part of reStructuredText is tables: it seems time-consuming to create or change a table (adding one character in one cell seems to require changes in all other rows).

I generally agree with this analysis - just wanted to clarify the following regarding tables:

Tables are useful for some things in the specification, e.g. equality, addition, multiplication, etc in Section 10.6. Basically everything where each table-row can be written on one (short) line.
Tables are not ideal for large texts - e.g. the Clock constructors in Section 16.3; and if we are moving from pdf as output format we should change that (as previously indicated).

I noticed that M. Sjölund wrote a comment at the same time, and to make it clear: I agree that any manual step creates problems, but I hope we could use a fully automatic conversion (possibly with custom settings), i.e. we should replace those tables in the Word document with sub-sub-sections prior to any conversion - just to make the document better.

Doing that prior to any conversion also makes it easier to validate the conversion.

modelica-trac-importer commented 6 years ago

Comment by stefanv on 29 Jun 2015 15:10 UTC I think this argument that people who don't use the tool every day will forget how to use it is not relevant.

If you do use the tool every day, you will remember how to use it, and not waste any time looking things up.
If you use the tool rarely, then you will, by definition, rarely have to look something up (i.e. only when you use the tool, and an example of what you want isn't already right there in front of you, elsewhere on the page).

Do we really want to make a decision based on the very small amount (a minute or two per year) that a few people might have to spend looking something up?

This is minuscule compared to the large amounts of time wasted by implementers trying to find things in the current form of the spec.

If we do want to make that argument, then I'll make the same one about Word. I rarely use Word. Whenever I want to do something even the slightest bit out of the ordinary, such as changing the attributes of a style, I have to go searching through menus and reading help pages.

At least with the reStructuredText approach, everything I need to know fits on one page, so it won't take long to find.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 29 Jun 2015 15:58 UTC Replying to [comment:33 otter]:

Based on the three basic requirements in comment:25, let's analyse the proposal to use Sphinx and github.

Martin, by removing 10 out of 13 requirements which others might actually find more important, you are basically painting a whole new picture and the "evaluation" is bound to be incomplete. What happened, for example, to traceability and gate-keeping? Since we are currently also busy trying to vote on a clearer development process, these are requirements which should be valued highly. It is bad enough that we currently can't even fill the position of a QA board member. So we should at least make sure that we can open up the specification document so that changes can be made traceable.

The proposal to use Sphinx in the future for the Modelica Specification is no option for me, because it would just mean that I (and probably others) can no longer contribute to the Modelica specification.

I respect this, but I hope that we still let the community decide on this, right? You make it sound as if it is not going to happen if you don't like it. And I think this should still be decided by the majority of the people involved. Also, I really doubt that you won't be able to contribute to the specification because you might have to look up some mark-up which (as Stefan pointed out) might be summarized on less than one page. In pretty much the same way, I would doubt that you stop writing Word documents when they change the functionality (e.g., menus -> ribbons ;-)).

modelica-trac-importer commented 6 years ago

Comment by stefanv on 29 Jun 2015 17:06 UTC Something to keep in mind is the purpose of the specification.

Is its purpose to be written, or to be read?

Ease of writing it is, in my opinion, secondary. Far more people spend far more time reading it, searching it, and trying to decipher it, than writing it.

A good analogy is software. It would be much easier to write a Modelica tool that simply accepts Modelica source code and runs a simulation. That saves the tool developer from having to write a GUI, look up things like Java APIs, write more complicated installers, and so on. However, software is written to be used, and the ease of use is always the primary goal (for production-quality software; clearly this is different for one-off throwaway scripts, where ease of writing is often more important). Once we start developing software with ease of use as the main goal, the developers start looking for things to make the writing easier, such as IDEs (the closest thing to a WYSIWYG editor for code), custom libraries for things like image viewers, etc.

So, and I can't emphasize this enough, the primary requirement should be that the specification is easy to _use_. That means easy to read, easy to navigate, and easy to search. Easy to write should be a "nice-to-have", not a requirement.

modelica-trac-importer commented 6 years ago

Comment by pharman on 30 Jun 2015 07:14 UTC Replying to [comment:29 dietmarw]:

* Go (from Google): http://tour.golang.org/welcome/1 (with online editing and execution) Source format: Markdown (see https://github.com/golang/tour)

* Rust (from Mozilla): http://rustbyexample.com/hello.html (with online editing and execution) Source format: Markdown (see https://github.com/rust-lang/rust-by-example)

* Dart (from Google): https://www.dartlang.org/docs/ Source format: Markdown (see https://github.com/dart-lang/www.dartlang.org/tree/master/src/site/docs)

* TypeScript (from Microsoft): http://www.typescriptlang.org/Handbook Source format: Markdown (see https://github.com/Microsoft/TypeScript-Handbook)

These examples show what is achievable with Markdown, for which the adoption is greater and tooling more plentiful than for rST. Therefore I would like Markdown to still be considered an option.

Replying to [comment:30 hansolsson]:

But the above links are all tutorials/handbooks, not specifications.

I don't see the difference: the audience is different, but their need to be able to read and understand the content is the same.

Replying to [comment:41 stefanv]:

the primary requirement should be that the specification is easy to use

As well as easy navigation and searching of the document I think deep-linking into the document is critical for this. Accessing on any device without additional software helps. Word/ePub/PDF don't give us these, and I fear a workflow that involves translation from Word/ePub to HTML risks poor traceability between the generated HTML and the specific text it came from, something inherent in Markdown or rST.

Replying to [comment:19 hansolsson]:

* Integrated checks for grammar and spelling

No automated grammar checker can decide if the text is clear for the reader, that should be done by multiple people reading and agreeing to a change. Word isn't efficient for this kind of peer-review, especially now the document is so large and the MA is so big.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 1 Jul 2015 08:04 UTC I've invested some more time in the editor question. It seems that, when it comes to editors that provide side-by-side preview and toolbars with shortcuts for all the different syntax features, Markdown offers a broader choice.

Just as an example, for Windows you could use the free version of http://markdownpad.com/. Of course there are plenty of others around (e.g., http://www.sitepoint.com/best-markdown-editors-windows/).

As we noticed earlier (see also the feature matrix) the PDF output retrieved from the Markdown tool chains is not as powerful as what rst/Sphinx can give you. However if PDF is no longer of primary interest and HTML/ePub output is what we want the future Specification to be in (PDF output would of course still be available but not with all the bells and whistles) then Markdown might be the better candidate.

modelica-trac-importer commented 6 years ago

Comment by dag on 1 Jul 2015 08:30 UTC I think Dietmar's post is constructive, in a thread that I thought was going out of control.

A next step would be to update the proposal document in view of what has been discussed and to include both Word and Markdown in the analysis. There are valid alternatives to rst/Sphinx. The document in its current form is, frankly speaking, not useful for making an informed decision.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 1 Jul 2015 08:40 UTC Replying to [comment:43 dietmarw]:

I've invested some more time in the editor question. It seems that, when it comes to editors that provide side-by-side preview and toolbars with shortcuts for all the different syntax features, Markdown offers a broader choice.

Just as an example, for Windows you could use the free version of http://markdownpad.com/.

Looked nice, but I didn't test it yet.

However, technically it is only free for "personal use" - and even if Modelica Association is a non-profit organization, that seems to be stretching "personal use" a fair bit.

There is a commercial version for 15$ (per user) - compared to 100-200£ for Word; which has some nice additional features. I don't know if we have a minimum budget or an absolute zero cost requirement.

I did not test it, so I don't know how well it supports creating links - its FAQ talks about writing html-code for anchors - which seems quite primitive, and as previously indicated that is one of the important GUI-features of the current process.

And to me it seems so obvious that if you make a GUI and add creating links to the FAQ that your next step is to consider having a good GUI for managing links inside the document (or between "linked" documents if we split the document into several parts) that at least someone should have added it to their tool - or at least is planning to, and it would be good if someone could investigate:

Of course there are plenty of others around (e.g., http://www.sitepoint.com/best-markdown-editors-windows/).

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 1 Jul 2015 09:10 UTC The problem with markdown is that all toolchains support different extensions. So math (equations) is input differently depending on which markdown dialect is used. I expect the same problem exists for the editors (since they might support a subset of the markdown dialect we settle on). As such, I think we would need to focus on a single markdown tool and figure out if it can do everything we need it to. The github renderer does not handle math (equations), for example.

The TypeScript example translates everything into a single html file, I guess because it is such a pain to make references from one markdown file to a section in another.

I tried to find out how to link to section headers on stackoverflow, and this is the best I got (that works in most md dialects):

File1 (header+named anchor):

Heading title

File2:

Some text

I suspect that translating md sources to anything but HTML would be... annoying. Its syntax is good, but I always find it lacking when I want to do more things than write lightweight instructions for git repositories.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 1 Jul 2015 09:14 UTC Hans, that was just one of the editors available. I do not work with that tool since I don't see why anyone would need anything other than a proper text-editor anyway. But we are not here to discuss the tools that are totally up to personal taste. I just thought it helps giving people some pointers of what is available.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 1 Jul 2015 09:57 UTC Replying to [comment:47 dietmarw]:

Hans, that was just one of the editors available.

But looking through all the others listed in the linked article I couldn't see any other that clearly solved the issues in a better way (with one possible exception listed below); and many of the others were also commercial to various degrees.

And one problem, which M. Sjölund also noted, is that each Markdown-tool implements a slightly (or completely) different Markdown-dialect, and thus we need to standardize on a dialect - or there will be confusion. This is similar to my previous comment about rendering differences between ninjs and sphinx.

As noted in your link there is an effort to standardize markdown-dialects - http://commonmark.org/ - (including a variant of links without html), and the open source editor http://mike-ward.net/markdownedit/ supports code snippets, which I assume we could use as a cheap way to customize the GUI - i.e. replacing the cheat-sheet.

I don't know how that standardization effort is going - especially if more tools will converge on the commonmark-spec, and since the version number is 0.20 a lot may change. If that spec is stable and more tools converge on it, then it might be a possible solution.

Added: It could be that the pandoc-variant listed below is instead becoming the de-facto standard.

modelica-trac-importer commented 6 years ago

Comment by otter on 1 Jul 2015 09:58 UTC I investigated yesterday evening and also recognized the many different markdown dialects. However, there seems to be one important, reliable tool, pandoc, that transforms from many input to many output formats, including html, html5, epub, docx. pandoc by default uses markdown with a large number of extensions and it seems that there are several tools that are based on pandoc (which is a command line tool). Note, usually markdown is only for one file/document, but pandoc-markdown can handle multiple files for one document.
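As a sketch of the multi-file workflow mentioned above: pandoc concatenates all input files given on the command line into one document, so a build step can be as small as the wrapper below. The options used (`--from`, `--to`, `--standalone`, `--toc`, `--output`) are regular pandoc options; the chapter and output file names and the helper functions are hypothetical and only for illustration.

```python
import shutil
import subprocess


def pandoc_command(chapters, output, to_format="html5"):
    """Build a pandoc command line that merges several markdown files into one document."""
    return [
        "pandoc",
        "--from", "markdown",   # pandoc's own markdown dialect with extensions
        "--to", to_format,      # e.g. "html5", "epub" or "docx"
        "--standalone",         # emit a complete document, not a fragment
        "--toc",                # generate a table of contents from the headings
        "--output", output,
        *chapters,              # input files are concatenated in this order
    ]


def build(chapters, output, to_format="html5"):
    """Run pandoc; raises if pandoc is missing or the conversion fails."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    subprocess.run(pandoc_command(chapters, output, to_format), check=True)


# Hypothetical chapter files for illustration only.
cmd = pandoc_command(["chapter01.md", "chapter02.md"], "spec.html")
print(" ".join(cmd))
```

Calling `build(...)` with `to_format="epub"` and an `.epub` output name would produce an e-book from the same sources, assuming pandoc is installed.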

A comparison between pandoc-markdown and restructuredText is here. I did not yet inspect carefully, but it seems that the expressive power is similar (and even if this would not be the case, pandoc-markdown has escape mechanisms, e.g., to include html code directly, that is utilized when exporting to html, html5 or epub, so sufficient for us).

I found the following useful GUI tools for pandoc-markdown:

http://www.writage.com This tool adds a plug-in to Word to import (a subset of) pandoc-markdown and save in (a subset of) pandoc-markdown. I tried it and it is really nice: just open a pandoc-markdown document in Word and it looks like a Word document, which can be changed and saved (although some formatting might then get lost). The drawback is that full pandoc-markdown is not yet supported, so it is not yet sufficient for the Modelica specification. The tool is free.
http://www.texts.io This tool is a stand-alone GUI for (a subset of) pandoc-markdown. It is really nice and simple (like a very reduced office program). The most important pandoc transformations are available via a menu, e.g. export to html, epub, docx, latex, pdf. I quickly checked and unfortunately cross-references (which are available in pandoc-markdown) are not yet supported. The tool has version number 0.23, so there is hope that it will be improved. This is a commercial tool with a 30-day trial version and a cost of 13 Euro (so negligible).
https://atom.io/packages/markdown-preview-pandoc The recently released atom editor (an open source standalone program from github that is based solely on html5/css3 technology and can be adapted in many ways) also has support for pandoc-markdown: One has to write the markdown textually, but pressing a specific keyboard shortcut uses pandoc to transform the markdown into html and render the html version. I only tried it on a small document, and this was quick and looked good.

modelica-trac-importer commented 6 years ago

Comment by henrikt on 2 Jul 2015 09:16 UTC I see the beginning of a constructive discussion about Sphinx versus pandoc. To support this discussion I think that it would be good if the feature matrix was updated to specifically have pandoc in the comparison instead of some unspecified dialect of Markdown.

For the feature rows where Sphinx and pandoc get different scores, it would then help with some illustration of how the feature is met with each tool, so that it is clear on what the scores are based. Then, if an experienced user of, say, pandoc knows a better way of meeting a requirement, the table should be updated with the better solution and a higher score.

Regarding Writage, besides my doubts that a conversion tool like this will be able to support all markup features that we need, I doubt that conversions to and from Word documents are going to produce useful diffs. That is, I don't think such tools are relevant for the workflows we want.