modelica / ModelicaSpecification

Specification of the Modelica Language

https://specification.modelica.org

Creative Commons Attribution Share Alike 4.0 International

102 stars 41 forks source link

MCP-0018: Change of the Specification Document Format #1730

Closed modelica-trac-importer closed 4 years ago

modelica-trac-importer commented 6 years ago

Modified by dietmarw on 2 Jul 2015 10:36 UTC This is a ticket discussion changing the documentation format from a binary documentation format to some sort of markup language.

The proposal is to use Sphinx; docbook, asciidoc, and markdown have been considered.

The following documents (also attached) show what a simple conversion during the meeting could accomplish: https://dev.openmodelica.org/~marsj/MLS-rst/html/ https://dev.openmodelica.org/~marsj/MLS-rst/singlehtml/ https://dev.openmodelica.org/~marsj/MLS-rst/latex/ModelicaLanguageSpecification.pdf

The sources (with verified working Windows installation instructions) are available at: https://github.com/sjoelund/MLS-rst-spec

Note that you do not need to install anything to modify the source code. Just to verify that the final output is what you expect it to be.

Document location

Modified by dietmarw on 12 Jun 2015 14:29 UTC This is a ticket discussion changing the documentation format from a binary documentation format to some sort of markup language.

The proposal is to use Sphinx; docbook, asciidoc, and markdown have been considered.

The sources (with verified working Windows installation instructions) are available at: https://github.com/sjoelund/MLS-rst-spec

Note that you do not need to install anything to modify the source code. Just to verify that the final output is what you expect it to be.

Document location

https://svn.modelica.org/projects/MCP/public/MCP-0018_ChangeSpecificationDocumentFormat/

Reported by sjoelund.se on 10 Jun 2015 04:09 UTC This is a ticket discussion changing the documentation format from a binary documentation format to some sort of markup language.

The proposal is to use Sphinx; docbook, asciidoc, and markdown have been considered.

The sources (with verified working Windows installation instructions) are available at: https://github.com/sjoelund/MLS-rst-spec

Note that you do not need to install anything to modify the source code. Just to verify that the final output is what you expect it to be.

Migrated-From: https://trac.modelica.org/Modelica/ticket/1730

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 2 Jul 2015 10:29 UTC Henrik, I've already started with a new table in the same spread sheet. See "Feature rst vs md" tab at the bottom. Here I started listing features that are currently needed by the Specification document and if they are supported or not (binary 1 and 0). Please feel free to add comments on features that should be added and checked by support. The nice thing about the comments is also that you can reply to them in that cell. So one can discuss the possibility directly in the table.

Mind the table is in no way complete yet. Especially the Markdown bit wasn't filled out yet.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 2 Jul 2015 10:30 UTC Replying to [comment:49 otter]:

I investigated yesterday evening and also recognized the many different markdown dialects. However, there seems to be one important, reliable tool, pandoc,

Oddly enough written by a philosophy professor.

I found the following useful GUI tools for pandoc-markdown:

* http://www.writage.com This tool adds a plug-in to Word to import (subset) pandoc-markdown and save in (subset) pandoc-markdown. I tried and it is really nice, just open a pandoc-markdown document in Word and it looks like a Word document, changing it and saveing it (and then some formatting might get lost). The drawback is that not yet full pandoc-markdown is supported, so it is not sufficient yet for the Modelica specification. The tool is free.

Technically just a free trial. It might later be commercialized. However, the important part is that pandoc itself is free - and defines the result.

* https://atom.io/packages/markdown-preview-pandoc The recently released atom editor (an open source standalone program from github that is based solely on html5/css3 technology which can be adapted in many ways) has also support for pandoc-markdown: One has to write the markdown textually, but clicking a specific keyboard shortcut uses pandoc to transform the markdown in html and render the html version. I only tired for a small document and this was quick and looked good.

It was literally released last week. Atom's major selling point is that it is extensible (with plugins etc); thus it might be that they add the things we need in the future - or they already exist.

And then something else: Replying to [comment:26 sjoelund.se]:

If the entire document is in html, it is possible to reformulate such references, like:

It is also possible to define functions and call them in a normal fashion. The function call syntax for both positional and named arguments is described in Section 12.4.1 and for vectorized calls in Section 12.4.4.

To something like:

It is also possible to define functions and call them in a normal fashion, using :ref:both positional and named arguments <section-funcall-pos-named-arguments> and using :ref:vectorized calls <section-funcall-vectorized>.

I must say all those numbers mostly clutter the output though.

This example made me realize a minor issue. I understand that it is just an example, but those references will be visible for users, and should ideally be as similar to the actual section named as possible - and we don't want different abbreviation styles. E.g. some might have used 'funccall' not 'funcall', and 'args' not 'arguments'. The simplest way to avoid different abbreviation styles is to not abbreviate - especially since the non-abbreviated name can be (automatically) generated.

In this case the section is called 'Positional or Named Input Arguments of Functions' (as a sub-section of 'Function Call' in the chapter 'Functions').

If we want to reduce the length I could understand that we skip some common words, or limit the number of words - as long as those rules are applied consistently, but otherwise just keep it as is (i.e. prefer the first one below): section-positional-or-named-input-arguments-of-functions section-positional-named-input-arguments-functions section-positional-named-input-arguments (assuming spaces are not supported).

Note that pandoc, and likely other tools, have support for automatically generating links for headers using some extensions (not fully consistent), and would generate: positional-or-named-input-arguments-of-functions - that seems nicer.

The section headings are sometimes changed - so these links are not necessarily stable, one way of handling that situation is to just add new anchors if the heading is changed - and keep the old one for existing references, even if that will pollute the document with old references. (Assuming the formats support multiple anchors for one heading in some way.)

BTW: Regarding the actual text, I assume this is also possible for pdf (basically any hyper-text-format) and not just html. I agree that it is nice when the link is as clear as in those examples (and that this is likely the majority of the cases), i.e. when the section-name and the "link-text" match - but in some other cases that is not as clear.

The good cases (which I believe are the majority) are also exactly the cases where named links look duplicated.

To me it seems simplest to first use numbers as default for all links (i.e. without any link-text) and then convert individual links to use link-texts instead (when appropriate) - and finally revisit the remaining links: I find it odd to read "section Vectorized Call of Functions" and would prefer "section on Vectorized Call of Functions". If we want to support non-hyper-text variants I assume it will be possible to append the section number automatically to all links. Basically view the numbers as a better variant of the old "here", and then fix them.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 2 Jul 2015 10:36 UTC Added a link to Google spread sheet to ticket description.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 3 Jul 2015 09:30 UTC Replying to [comment:52 hansolsson]:

Atom's major selling point is that it is extensible (with plugins etc); thus it might be that they add the things we need in the future - or they already exist.

Just for completeness, Atom.io also offers an rst-extension that gives you a live preview.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 3 Jul 2015 12:52 UTC Replying to [comment:54 dietmarw]:

Replying to [comment:52 hansolsson]:

Atom's major selling point is that it is extensible (with plugins etc); thus it might be that they add the things we need in the future - or they already exist.

Just for completeness, Atom.io also offers an rst-extension that gives you a live preview.

How well does it work?

The reason I ask is that Pandoc itself supports reStructuredText (as input and output; as well as markdown, docx, html etc) and references http://docutils.sourceforge.net/rst.html - but is not directly able to fully handle the current documents - in particular the ref-parts (and I couldn't find ref in the referenced specification).

Based on the link I would assume Atom uses Pandoc's rst-support, with the same limitations.

What I now understand is that the currently converted document is not pure reStructuredText - but reStructuredText+Sphinx extensions; i.e. some of the earlier confusion is because we don't only have markdown-dialects, but also reStructuredText-dialects - and thus we cannot use every reStructuredText tool to handle the document in a good way (similarly as we cannot use every markdown-tool).

One question is thus how important those extensions are, and if they can be disabled - I noticed that Pandoc can disable markdown-extensions (in a flexible way - which would reduce the tool-lockin), of course, with reduced functionality. For some of Pandoc's markdown-extensions it seems that an optional warning would be more appropriate (e.g. requiring a blank line before # for headers).

Added: I also find it ironic that we are considering solutions that rely on file-formats that are only fully supported in one tool - and at the same time more and more tools support the Word-format (docx), and Word is now getting good support for OpenOffice files.

Forgot to add that the reason I found that problem was a similar experiment as the following - I still doubt that it is useful enough to send to the entire group: Replying to [comment:50 henrikt]:

Regarding Writage, besides my doubts that a conversion tool like this will be able to support all markup features that we need, I doubt that conversions to and from Word documents is going to produce useful diffs. That is, I don't think such tools are relevant for the workflows we want.

I agree with your doubts, but found it worth a try with Pandoc itself.

I tried converting the full spec to markdown (it just takes a few seconds, but it doesn't fully work; in particular headings and cross-references need to be corrected) - changed a few things in the markdown (in particular changed some of the headings to usual markdown), and generated a Word-document using the current document as style-reference - and then converted that back to markdown.

I had some minor hope for a miracle, but was expecting a total mess. The result was in-between: only 20 lines of diff (for the entire 700k document).

However, it might be that Pandoc normalizes the document during this process (maximum line length etc), but if Writeage would correct the remainder it would be amazing.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 6 Jul 2015 10:37 UTC I gave this some additional thought during the week-end, and realized something: These intuitive markdown languages are good for getting started, and adding lists and external hyperlinks, but then it becomes less and less intuitive (consider # Heading # or ---- below the line, images, and internal references), and they sort of give up when it comes to math.

This has caused ambiguity and proliferation of dialects, since none of those extensions are as obvious as the basis. It also seems that the creator of Markdown is against standardization, since he believes that different users have different requirements.

We could go on arguing about exactly those details, but since we are language designers the obvious question is if someone has designed a markdown language that strictly defines the output, supports large chunks of plain text, hyperlinks, Modelica-code, math, generation of pdf, html, ePub etc - while still having a large user-base (and preferably with multiple tools for input with syntax highlighting and even something wysiwyg-like).

If we consider a clear markdown-language it seems we have forgotten to fully investigate the obvious choice, which we know, - where you use \chapter instead of # ...#, \section instead of ## .... ##, \href instead of just plain hyper-link and \ref for internal references, and unfortunately the slightly less intuitive \item instead of * for list items.

Or in other words a limited sub-set of LaTeX (the commands above, tables, mages, code-listings, something for the title-page and examples, but nothing more outside of the math-parts) - the downside is that we will have to construct a short pre-amble for the document (which packages to use - and customize a few things to remove some ugliness like color-boxes around hyperlinks), and possibly customize the mapping to HTML if we use LaTeXML, http://dlmf.nist.gov/LaTeXML/manual/ . Note that some of the tools ignore the TeX-language - and only focuses on the document structure; this is possible since the document defines the structure and we are primarily interested in the structure - not the type-setting.

I know that it was dismissed earlier due to two issues:

Ease of use. However, in my opinion this is not ease-of-use but instead ease-of-setting-up, since we at the start have to create the pre-amble, and controlling output-options - or nothing works. If that is done, I don't see any actual problem for people editing the document - the only minor annoyance compared to intuitive markdown choices is that they have to use a cheat-sheet, GUI, or copy-paste for list items and hyper-links.
HTML-quality. I haven't seen the complete investigation of this - i.e. if this is just a matter of using a more fine-grained control of the output (e.g. what to split at). I simply don't see this as caused by the specific input-formats, but as a consequence of the lack of ease-of-setting-up.

I'm aware that you might need more than a cheat-sheet for the math, but since markdown-languages often rely on LaTeX for formulas I don't see that as a problem.

So, something like:

\chapter{Introduction}
\section{Overview of Modelica}

Modelica is a language for modeling of physical systems, designed to
support effective library development and model exchange. It is a modern
language built on acausal modeling with mathematical equations and
object-oriented constructs to facilitate reuse of modeling knowledge.

\subsection{Scope of the Specification}

The semantics of the Modelica language is specified by means of a set of
rules for translating any class described in the Modelica language to a
flat Modelica structure. A class must have additional properties in
order that its flat Modelica structure can be further transformed into a
set of differential, algebraic and discrete equations (= flat hybrid
DAE). Such classes are called simulation models.

...
The key issues of the translation (or flattening) are:

\begin{itemize}
\item
  Expansion of inherited base classes
\item
  Parameterization of base classes, local classes and components
\item
  Generation of connection equations from connect-equations
\end{itemize}

I'm not saying that this looks like the most modern solution, only that it seems like a stable and working solution.

modelica-trac-importer commented 6 years ago

Comment by choeger on 6 Jul 2015 10:55 UTC The problem with LaTeX, as with Word, is that it is essentially a typesetting tool (called TeX in that case). These tools don't maintain lists of sections/paragraphs/boxes but render them. It is true that LaTeX offers some (seemingly) semantic annotations like chapters, references etc., but these are in fact just TeX macros, which means we would have to maintain that limited set of annotations in the form of their TeX implementation (especially when it comes to html/epub output). There would probably be a lot of effort involved to get good html/epub output.

Also note that TeX has no context-free syntax. This means that there is no way to enforce a "limited subset" of TeX in a document short of providing a limited TeX implementation. It may not be necessary to have that process in a automatic fashion, since the document is singular, but the process of generating good-quality html from TeX sources is still a tricky one.

Conclusion: While TeX and especially LaTeX is certainly superior to Word in many regards (output-quality, packages, platform support), the effort to maintain the html generation would probably come down to implement our own markdown-language.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 6 Jul 2015 12:28 UTC Let's consider some ways of converting LaTeX to HTML/sphinx/markdown using pandoc. A simple document like:

\documentclass{paper}

\begin{document}

\section{Sec}\label{mysec}

This is text in \ref{mysec}.

\end{document}

This is translated to markdown and HTML with a non-working reference to mysec, but a working anchor:

Sec {#mysec}
===

This is text in \[mysec\].

<h1 id="mysec">Sec</h1>
<p>This is text in [mysec].</p>

This is translated to sphinx with non-working reference and anchor:

Sec
===

This is text in [mysec].

htlatex (a tex/latex processor generating HTML) seems to work for this example. I have used in the past and the results were not as impressive at that time (it produced unreadable output):

   <h3 class="sectionHead"><span class="titlemark">0.1   </span> <a 
 id="x1-10000.1"></a>Sec</h3>
<!--l. 7--><p class="noindent" >This is text in <a 
href="#x1-10000.1">0.1<!--tex4ht:ref: mysec --></a>.

Output from hevea (note the mysec anchor does not point to the header):

<h2 id="sec1" class="section">0.1&#XA0;&#XA0;Sec</h2><!--SEC END --><p><a id="mysec"></a></p><p>This is text in <a href="#mysec">1</a>.</p><!--CUT END -->

latex2html generates ok links, but its html_version flag only accepts HTML 2.0|3.0|3.2, which is ancient. I suspect the output would not be visually pleasing.

latexmlc generates the following (plus some additional div tags/etc):

<section id="S1" class="ltx_section">
<h1 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">1 </span>Sec</h1>

<div id="S1.p1" class="ltx_para">
<p class="ltx_p">This is text in <a href="#S1" title="1 Sec" class="ltx_ref"><span class="ltx_text ltx_ref_tag">1</span></a>.</p>
</div>
</section>

Of course, one problem is that you in LaTeX will generate references to section numbers only. So you need to write your text in a way that conforms to section numbers instead of a layout more suitable to HTML (where you separate the link and text if you want to make something sound more natural and assume the target format supports linking).

Out of these outputs (not including images, equations, etc), I would say hevea is the nicest LaTeX tool (it does not rename the anchor id's). But I am unsure how good the output is for more complicated documents.

For comparison, the Sphinx output is:

<div class="section" id="sec">
<span id="mysec"></span><h1>Sec<a class="headerlink" href="#sec" title="Permalink to this headline">¶</a></h1>
<p>This is text in <a class="reference internal" href="#mysec">mysec</a>.</p>
</div>

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 6 Jul 2015 12:29 UTC Replying to [comment:57 choeger]:

The problem with LaTeX, as with Word, is that it is essentially a typesetting tool (called TeX in that case).

TeX is a programmable typesetting tool. I agree it is not suitable for our purposes.

LaTeX is about the semantics of the text - traditionally implemented on top of TeX - but we shouldn't get stuck on that implementation detail. As far as I understand several tools for processing LaTeX handle it completely differently, e.g. LaTeXML.

Obviously if they use a different implementation we need to verify that the tools work for our intended use (or find a more restricted sub-set).

Conclusion: While TeX and especially LaTeX is certainly superior to Word in many regards (output-quality, packages, platform support), the effort to maintain the html generation would probably come down to implement our own markdown-language.

My point is that we haven't investigated it fully enough to conclude this, especially considering that there are a number of tools generating html (and ePub) from LaTeX - and thus we don't have to invent that. (And PDF just works.) They can usually also be configured in various ways.

I admit that I don't like the color-scheme and boxes looking at http://dlmf.nist.gov/LaTeXML/manual/usage/ but I would assume it can produce more traditional styles as well.

And even if we are not satisfied with using the solution right away, and implement something of our own for LaTeX I wouldn't see what we do as our own markdown-language, but:

Our own style (hopefully we can copy an existing one).
A few commands of our own, e.g. \example{} to mark the examples if we don't find something existing. I know that people discussing domain-specific languages sometimes even consider a few functions as defining a new language, but in that case the implementation effort will be minimal.

An additional benefit is also that we want to promote Modelica and many journals still rely on LaTeX-sources as far as I know. Thus any improvements in e.g. the syntax highlighting for Modelica will also benefit those articles.

Obviously there are things to consider, e.g. \nameref can be used as an alternative to \ref - since we use hyperlink package.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 6 Jul 2015 13:03 UTC Replying to [comment:58 sjoelund.se]:

Let's consider some ways of converting LaTeX to HTML/sphinx/markdown using pandoc.

I agree that Pandoc doesn't seem to be working well with LaTeX (both input and output), and thus have not considered it more. (Could be that I am not using it correctly, but I doubt that.)

I hope there is a better tool for converting from Word; because there will otherwise be a lot of clean-up needed.

htlatex (a tex/latex processor generating HTML) seems to work for this example. I have used in the past and the results were not as impressive at that time (it produced unreadable output):
   <h3 class="sectionHead"><span class="titlemark">0.1   </span> <a 
 id="x1-10000.1"></a>Sec</h3>
<p class="noindent" >This is text in <a 
href="#x1-10000.1">0.1</a>.
Output from hevea (note the mysec anchor does not point to the header):
<h2 id="sec1" class="section">0.1&#XA0;&#XA0;Sec</h2><p><a id="mysec"></a></p><p>This is text in <a href="#mysec">1</a>.</p>
latex2html generates ok links, but its html_version flag only accepts HTML 2.0|3.0|3.2, which is ancient. I suspect the output would not be visually pleasing.

latexmlc generates the following (plus some additional div tags/etc):
<section id="S1" class="ltx_section">
<h1 class="ltx_title ltx_title_section">
<span class="ltx_tag ltx_tag_section">1 </span>Sec</h1>

<div id="S1.p1" class="ltx_para">
<p class="ltx_p">This is text in <a href="#S1" title="1 Sec" class="ltx_ref"><span class="ltx_text ltx_ref_tag">1</span></a>.</p>
</div>
</section>
Of course, one problem is that you in LaTeX will generate references to section numbers only.

As indicated in my previously reply \nameref can be used instead of \ref (at least for pdf-output - but would hope these tools also support it); and as I indicated even earlier I would prefer if we gradually convert to named references afterwards - since the current text is written assuming numbered references.

I had to search for that command (it wasn't obvious).

There might similarly be other options for improving the html - possibly including more readable anchors (I assume they might be renamed due to problems with valid letters in anchors), further indicating the problem of ease-of-setting-up.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 6 Jul 2015 14:11 UTC Replying to [comment:56 hansolsson]:

preferably with multiple tools for input with syntax highlighting and even something wysiwyg-like

I tried the latex to html converters I know of and neither seems to syntax highlight code using the listings or minted packages (minted 2.0 now supports breaking lines, so the pdf output looks very good for LaTeX). Pandoc came closest, but it just create a "<code class="modelica>..." or something like that.

So there exists LaTeX packages for syntax highlighting, but I suspect they use TeX in the end, so they end up not working for the HTML export.

Pandoc latex to sphinx to html generates good code listings, but of course the cross-references are missing...

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 6 Jul 2015 15:23 UTC Replying to [comment:61 sjoelund.se]:

Replying to [comment:56 hansolsson]:

preferably with multiple tools for input with syntax highlighting and even something wysiwyg-like

I tried the latex to html converters I know of and neither seems to syntax highlight code using the listings or minted packages (minted 2.0 now supports breaking lines, so the pdf output looks very good for LaTeX). Pandoc came closest, but it just create a "<code class="modelica>..." or something like that.

Just to be clear: above I meant syntax highlighting/wysiwyg for the LaTeX source. I'm not saying that people have to use them, only that it is nice to have.

I had not considered the syntax highlighting of Modelica code in the generated html, and not even exactly how to configure Modelica syntax highlighting at all in LaTeX; only that Modelica was on the list of supported languages in LaTeX (actually with some special footnote - didn't look through the details).

Looking at LaTeXML it seems there is some support for special formatting of keywords using the listing-package, http://math.nist.gov/~BMiller/LaTeXML/manual/cssclasses/ but we might need to configure two or three CSS-classes for it to work as we want for Modelica for the generated HTML. (I have not tested it.) Since LaTeXML has been used for a lot of math-papers I assume it has at least handled some pseudo-code examples.

As an example of listing generated from LaTeXML see https://www.authorea.com/users/5713/articles/28015 (I'm not saying that we have to use that tool.)

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 6 Jul 2015 15:58 UTC OK. latexmlc does support lstlistings in some way. The output is something like this:

<td class="ltx_td"><span class="ltx_text ltx_lst_line ltx_font_typewriter"><span class="ltx_text ltx_lst_space">  </span>Real<span class="ltx_text ltx_lst_space"> </span>r<span class="ltx_text ltx_lst_space"> </span>=<span class="ltx_text ltx_lst_space"> </span>2.0<span class="ltx_text ltx_lst_space"> </span>"<span class="ltx_text" style="color:#228B22;">some<span class="ltx_text ltx_lst_space"> </span>long<span class="ltx_text ltx_lst_space"> </span>thing<span class="ltx_text ltx_lst_space"> </span>herea<span class="ltx_text ltx_lst_space"> </span>asdffasdfa<span class="ltx_text ltx_lst_space"> </span>saf</span></span></td>

Yes, the colours are hard-coded (as they are in the listings-modelica.cfg). So CSS would not help here, although tweaking the listings config would work. The listings package is a bit limited since it does the syntax highlighting using latex macros only. The minted package is based on pygments and produces nicer syntax highlighting.

At least the default output of LatexML equations is not supported in chrome (Presentation MathML):

<div id="S1.p4" class="ltx_para">
<p class="ltx_p"><math id="S1.p4.m1" class="ltx_Math" alttext="x^{2}" display="inline"><msup><mi>x</mi><mn>2</mn></msup></math></p>
</div>

The fallback is png images for what should be readable formulas. Not so nice, but could probably be hacked to create SVG's by post-processing the file (the latex formula is part of the alt tag).

modelica-trac-importer commented 6 years ago

Comment by choeger on 6 Jul 2015 16:27 UTC Replying to [comment:59 hansolsson]:

TeX is a programmable typesetting tool. I agree it is not suitable for our purposes.

LaTeX is about the semantics of the text - traditionally implemented on top of TeX - but we shouldn't get stuck on that implementation detail. As far as I understand several tools for processing LaTeX handle it completely differently, e.g. LaTeXML.

It seems to me that you have a wrong mental image about TeX/LaTeX. TeX is not a "programmeable typesetting tool" - it is a Turing-complete language for typesetting. LaTeX is not "traditionally implemented on top of TeX", it is a TeX (macro-) library. Please keep in mind that there is no declarative semantics of LaTeX - the implementation is the semantics. Hence, while there is a certain meaning behind commands like \section, there is no way to obtain the semantic structure of a document (short of rendering this document). We could -of course- define a set of commands that implement any semantics we want, but that comes with some considerable effort.

* A few commands of our own, e.g. \example{} to mark the examples if we don't find something existing.

To make that clear: LaTeX consists of said "few commands".

modelica-trac-importer commented 6 years ago

Comment by otter on 7 Jul 2015 06:10 UTC Replying to [comment:64 choeger]:

Replying to [comment:59 hansolsson]:

TeX is a programmable typesetting tool. I agree it is not suitable for our purposes.

LaTeX is about the semantics of the text - traditionally implemented on top of TeX - but we shouldn't get stuck on that implementation detail. As far as I understand several tools for processing LaTeX handle it completely differently, e.g. LaTeXML.

It seems to me that you have a wrong mental image about TeX/LaTeX. TeX is not a "programmeable typesetting tool" - it is a Turing-complete language for typesetting. LaTeX is not "traditionally implemented on top of TeX", it is a TeX (macro-) library. Please keep in mind that there is no declarative semantics of LaTeX - the implementation is the semantics. Hence, while there is a certain meaning behind commands like \section, there is no way to obtain the semantic structure of a document (short of rendering this document). We could -of course- define a set of commands that implement any semantics we want, but that comes with some considerable effort.

You are right, TeX is a programming language (and in principal you can program not only for typesetting but you can also implement numerical algorithms). But, LaTeX has a list of commands with (a) a documentation about its semantics and (b) an executable specification (the implementation with TeX). The markup languages have only the first part: a (spare) documentation about the meaning of the keywords (I did not yet see a formal semantic definition of standard-markdown, pandoc-markdown, or Sphinx). So, the semantics in these languages is not well-defined. You can view the converter programs (to html, latex, etc.) as an executable specification. Since there are several conversions the fine semantic details are problably not identical.

Since TeX is a programming language, the conversion to html alone cannot be complete. It might be possible to convert all TeX programs/macros to html5+Javascript. However, for the LaTeX commands that we would like to use, the conversion should be simple (e.g. \section is transformed to

...

; so macro calls with their arguments are mapped to html constructs). However, all this is just theory and does not help much, if no reasonable tools are available to transform LaTeX to html. As I understand the previous discussion, such a tool is not yet known. I have also searched for it, but didn't find anything reasonable. If someone can recommend such a conversion tool (LaTeX -> html), please provide this information.

modelica-trac-importer commented 6 years ago

Comment by otter on 7 Jul 2015 06:18 UTC Let me summarize additional investigations:

DocOnce

Besides Sphinx and pandoc-markdown, there is a third markup-language called DocOnce. The author, Hans Petter Langtangen, explains here

http://hplgit.github.io/doconce/doc/pub/slides/scientific_writing-1.html

why Sphinx, pandoc-markdown, LaTeX and other approaches are not sufficient for his needs. He argues that he wants to have one document source and generate from this source (a) a book in high (LaTeX) quality and (b) a web-site in high (html) quality, including modern (responsive) web design. He argues that this is not possible with any solution he knows (including Sphinx, pandoc-markdown and LaTeX) and therefore he invented a new markup-language and implemented the needed converters in Python.

He used DocOnce to write a 900 page Springer book (A Primer on Scientific Programming with Python). I have this book in my office, and considered it previously as a (high quality formatted) LaTeX-based book, but now learned that it was actually written in DocOnce. So, this is a practical proof that this language has everything what is needed to write (a) a scientific book and (b) a web page (there are many examples on the DocOnce web page).

The drawbacks are:

No WYSIWYG whatsoever, even no limited editor support (like pandoc markdown in Atom editor).
Hard to install (many Python packages are needed; I did not try)

Other approaches

Since I would like to have a WYSIWYG editor, I started to investigate the features of several WYSIWYG environments and how they can generate html, maybe in combination with some command-line converters. Here is a short summary:

Word (docx)

Exported in Word "html (filtered)": Bad html and the cross-referencing is lost in the html-file.
Converted docx with pandoc to epub (and then unzipped). Section numbers and cross referencing are lost in the html-files. Then tried pandoc -N option (generate heading numbers). However, this gives an error, and no epub is generated. Tried to generated pandoc-markdown from docs, but the section numbers and cross referencing is also lost. The -N option is ignored when generating markdown. When generating pandoc-markdown directly from docx, the images are lost. In all cases, the appendix from the Modelica specification is not included in epub or pandoc-markdown
Converted docx with Calibre to epub. The cross-references are kept, but everything (including heading, ul, ol tags etc) are transformed to div with many classes (like class = "calibre1"). Modifying the (automatically generated) css style sheet is not practical (and would change for every new version of the Word document).

Summary: My proposal in comment 34 to use Word as source and then for every new Modelica release build an html version from it, seems to be not practical according to this investigation. So, I withdraw this proposal.

odt text processors

I tried OpenOffice, LibreOffice, Calligra (has epub output), Abiword, see also (https://en.wikipedia.org/wiki/Comparison_of_word_processors). The result is just a mess. Some short comments:

OpenOffice generates wrong html (e.g.
All generate bad html, section headings and cross-references are lost.
OpenOffice and LibreOffice export images and the document in the same directory.
Calligra and Abiword are unstable and easily crash (for the Modelica specification).

So, I give up and conclude that none of these tools can be used as document source (in order to generate html).

I was frustrated and started to think out-of-the-box. I have now found two solutions that fulfill the three core requirements listed in comment 25. This is described in the next submission.

Dietmar pointed out that some important requirements, like traceability, are missing. Sorry, this was not described thoroughly enough. What I meant with requirement "collaboration" is everything what is needed for it, including version handling. To make this clearer, the third requirement is now listed as "Collaboration and version handling" (and traceability is included here).

modelica-trac-importer commented 6 years ago

Comment by otter on 7 Jul 2015 06:45 UTC Documentation with a small subset of html5, css3, and Javascript

The proposal is to use a small subset of html5, css3, and Javascript, so that "source" and "html" is identical. I have uploaded a "proof-of-concept" of the first chapters of the Modelica Specification, see above (similarly as Martin S. did it with a Sphinx solution). Please, first have a look to this file, before continuing. Open any *.xhtml in a web browser (I tested with Firefox). The start page is _cover.xhtml.

The small subset needs to be documented (have not done it yet). Basically, these are the re-structuredText equivalents in html5 (e.g.

for a paragraph,

...

for a heading,

Modelica examples are included in a

environment with just native Modelica code. A small Javascript program transforms such a pre-section (online) into highlighted Modelica code.

Tables and figures are included with standard html commands including "caption". Equations are included with MathJax (did not yet do it here, since one needs to investigate the many different options). Note, standard mathml code is just a mess, and far away from the simplicity of MathJax and the MathJax rendering quality.

To arrive at the desired xhtml source, one needs to generate "clean" html from the current Word source. I did this in the following way:

Used pandoc to transform docx to epub ("pandoc _ModelicaSpec33Revision1.docx -f docx -t epub3 -o _ModelicaSpec33Revision1.epub")
Unzipped the epub file
Run html5-tidy (http://www.htacg.org/binaries/binaries/tidy-5.0.0.RC1/tidy-5.0.0-win32.zip) on the desired chapter files generated by pandoc: tidy -config tidy_config.txt -c -o chapter_01.xhtml ch001.xhtml config.txt is a text-file with all the options to be used. I used the following file:

indent: no
indent-spaces: 2
wrap: 80
markup: yes
output-xml: no
input-xml: no
show-warnings: yes
numeric-entities: yes
quote-marks: yes
quote-nbsp: yes
quote-ampersand: no
break-before-br: no
uppercase-tags: no
uppercase-attributes: no
char-encoding: utf8
drop-font-tags: yes
drop-proprietary-attributes: yes
lower-literals: yes
merge-spans: yes
replace-color: yes
show-body-only: no
vertical-space: yes
logical-emphasis: no
clean: yes

The resulting xhtml file is quite "clean" now.

The remaining part is done manually. I used a few notepad++ macros (generated by recording them) to transform e.g.
to "ul" or "ol". Modelica code text is completely useless in the generated files. Copied this manually from the Word document into the text editor.

The essential missing thing are the heading numbers and the cross referencing to section numbers. This cannot be described in html. I did this manually in the "proof-of-concept" document. A program needs to be implemented that performs this automatically. Here is a sketch:

On a file the filenames and the order of the files is defined (e.g. that _preface.xhtml is before chapter_01.xhtml). Every file contains exactly one chapter of the Modelica specification.
A program reads this file and inspects all the mentioned files and collects all headers, and captions of tables and figures in the right order.
The program defines all the ordered section, figure, table numbers and copies these numbers back into html-elements and in cross references, and saves the files.

This functionality could be easily implemented in Javascript by inquiring the information from the DOM and changing the DOM. However, I do not know whether it is possible to store the modified DOM back on file (apparently, this is not possible with Javascript alone). If this is not possible, one has to implement this in another language (but should be also a small task, say 1-2 days effort).

Analysis of the proposal:

1. Good HTML

As demonstrated by the "proof-of-concept" part of the specification, the source is clean xhtml (and the simplest html, I have seen from all experiments, including html generated by Sphinx). The rendered web pages in this demonstration are simple. A professional web designer could easily improve the css definitions to get a nicer rendering.

2. WYSIWYG

There are the following possibilities:

Use any editor and directly edit the source (in case one can write html). Drag the file to a Web browser like Firefox and see the immediate rendering of this text file. Whenever, something is changed in the file, just click in the browser on "reload". This gives a reasonable quick feedback of the rendering.
Use the Atom text editor with the plug-in atom-html-preview (https://atom.io/packages/atom-html-preview). In this case, there is an online rendering of html5 with Javascript (so one sees nearly in real-time the rendered change when the xhtml text is modified)
Use the free, GNU open source web editor BlueGriffon (https://atom.io/packages/atom-html-preview). It is available for Windows, Linux, Mac. For Windows, only a zip-file needs to be downloaded (no administrator rights needed). The editor allows a Word-like input, or a pure textual input (in the same way as a text-editor with html syntax highlighting), and it is possible to switch between the two. I tried it quickly and it looks good for me (
,

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 7 Jul 2015 06:55 UTC Martin, could you fill in the support of the different features in https://docs.google.com/spreadsheets/d/1UUqRJaaby6WobynahdGBXgYcQwPo3lxFKnQdtSTEXsU/edit#gid=807447325. Feel free to add some that you found and others can check the support in the other formats.

modelica-trac-importer commented 6 years ago

Comment by otter on 7 Jul 2015 06:59 UTC Due to a last minute change there was a small formatting error in _cover.xthml. Corrected and uploaded the file ModelicaSpecification_3.3+rev.1_html_draft.zip again.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 7 Jul 2015 08:33 UTC Replying to [comment:65 otter]:

Since TeX is a programming language, the conversion to html alone cannot be complete. It might be possible to convert all TeX programs/macros to html5+Javascript. However, for the LaTeX commands that we would like to use, the conversion should be simple (e.g. \section is transformed to
...
; so macro calls with their arguments are mapped to html constructs). However, all this is just theory and does not help much, if no reasonable tools are available to transform LaTeX to html. As I understand the previous discussion, such a tool is not yet known. I have also searched for it, but didn't find anything reasonable. If someone can recommend such a conversion tool (LaTeX -> html), please provide this information.

As far I understand LaTeXML http://dlmf.nist.gov/LaTeXML/ does a fairly good job of that (technically it converts to XML and then XML->HTML). The HTML contains a lot of div/class to be able to configure the layout, so the output isn't "clean" - but that's usual for modern html. Other possibilities would be hevea and htlatex.

I do understand that there will be a bit of work configuring LaTeXML - and, of course, the LaTeX-layout. Once that is done the text can be added easily.

The remaining issue do not seem that severe to me:

Something for the listings cause colors to be hard-coded (but it seemed possibly to change that configuration in some way).

The presentation MathML does not work in Chrome, but LaTeXML also has SVG, PNG which should work (and experimental OpenMath and "Content MathML" support - which I don't know how it differs from the default "Presentation MathML").

I'm not saying that LaTeX is ideal, just that it is clearly defined and avoids many of the other problems - in particular I don't want us to be locked into specific tools for the specification, since we want the document to live for many years.

I understand that DocOnce might thus be a solutions (even if a single tool) due to the divorce-clause on http://hplgit.github.io/doconce/doc/pub/slides/scientific_writing-1.html (assuming it is actually true).

Added here since I don't view this as important for the entire design group: Replying to [comment:64 choeger]:

Replying to [comment:59 hansolsson]:

TeX is a programmable typesetting tool. I agree it is not suitable for our purposes.

LaTeX is about the semantics of the text - traditionally implemented on top of TeX - but we shouldn't get stuck on that implementation detail. As far as I understand several tools for processing LaTeX handle it completely differently, e.g. LaTeXML.

It seems to me that you have a wrong mental image about TeX/LaTeX.

Could we please try to be constructive?

I have a different mental image than you - but the same reality can be viewed in different ways. It's the same with models - there are multiple ways of modeling e.g. a car, and those different models have different uses.

TeX is not a "programmeable typesetting tool" - it is a Turing-complete language for typesetting.

TeX can refer both to the program (or tool, app, or what-ever name people use today) and to the input (i.e. the TeX-language). I only care that the language allows you to program in order to configure the result - not about the complexity of the language.

And you previously called TeX a "typesetting tool".

LaTeX is not "traditionally implemented on top of TeX", it is a TeX (macro-) library. Please keep in mind that there is no declarative semantics of LaTeX - the implementation is the semantics. Hence, while there is a certain meaning behind commands like \section, there is no way to obtain the semantic structure of a document (short of rendering this document).

That mental image only works well if \section always have one unique definition giving one result - regardless of latex-version, journal-specific settings, and whether you are writing a book, report, article, or ...

I don't know if that is true, but I do know that considering \section as a section (with the text giving the heading) and ignoring the details of the implementation is a mental image promoted by the designers of LaTeX.

Note an important difference compared to some other languages: \section defines a section - with a text giving the heading. In e.g. HTML (and in Word) the corresponding command defines the heading, and the section-structure is implied. I find the LaTeX command gives a better understanding of the intended structure of the document.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 7 Jul 2015 08:51 UTC Replying to [comment:70 hansolsson]:

* The presentation MathML does not work in Chrome, but LaTeXML also has SVG

It actually only does in the very latest release. Even Ubuntu 15.04 latexml did not have SVG support. I have now installed the latest one and tried the svg export, which is sub-par. The bounding box of the svg clips some of the equations. I will attach a few examples of what it outputs.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 7 Jul 2015 08:59 UTC I.e.,

x²:

dx/dt:

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 7 Jul 2015 09:20 UTC Replying to [comment:71 sjoelund.se]:

Replying to [comment:70 hansolsson]:

* The presentation MathML does not work in Chrome, but LaTeXML also has SVG

It actually only does in the very latest release. Even Ubuntu 15.04 latexml did not have SVG support. I have now installed the latest one and tried the svg export, which is sub-par. The bounding box of the svg clips some of the equations. I will attach a few examples of what it outputs.

Ok, then we might have to use PNG for the time being.

Note that the SVG-output from Sphinx also has clipping issues (at least with Chrome and IE) - depending on zoom factor etc for the top and/or bottom of one of the \partial of the equation in "Equations are converted to nice vector graphics:". I already reported that in comment 19.

Zoom matters, e.g. https://trac.modelica.org/Modelica/attachment/ticket/1730/dxdt.svg looks perfectly ok if zoomed to 345% in IE (but clips at 355%).

I'm not suggesting that as a work-around, but I believe the underlying reason is that the graphics and bounding box are rounded separately to integer coordinates - and thus lines at the edges are sometimes lost. I remember similar issues in the past.

It seems obvious that the solution is for tools to add some padding in the svg-image (I searched for such an option in LaTeXML in vain); and hopefully the tools will later add that - or possibly the problem is in how web-browsers handle svg (don't round bounding box to nearest - round "up").

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 7 Jul 2015 09:38 UTC Sphinx can be tweaked to pad the bounding box:

With padding: Without:

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 7 Jul 2015 09:52 UTC Replying to [comment:74 sjoelund.se]:

Sphinx can be tweaked to pad the bounding box:

With padding: Without:

The first is good, and considering that LaTeXML also use dvisvgm 1.8.1 (assuming I read the comments in the generated file correctly) I assume they will add a similar option soon; or just have it as default. (LaTeXML already corrected a similar issue for too tight bounding box for Pictures in the latest release.)

I thus don't see this as very relevant for this decision. Obviously we have to consider it when generating html (or generate multiple variants).

However, these images also sometimes lack the fraction-sign in IE (above the second one lacks it - in the preview below the first one). I guess that is just a browser problem.

Update: Just got mail that LaTeXML will add the padding.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 7 Jul 2015 12:38 UTC Replying to [comment:75 hansolsson]:

However, these images also sometimes lack the fraction-sign in IE (above the second one lacks it - in the preview below the first one). I guess that is just a browser problem.

Yes, I would guess it is a browser problem. The SVG's lack fonts (only use paths), and I would have assumed the fraction sign probably was not drawn using a font glyph anyway.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 10 Aug 2015 15:24 UTC _{The following message was held back by the spam gate:}

Matthis Thorade wrote:

I know I come late to this discussion and people already have strong opinions, but I would still like to add my 2 cents (and links):

Wikipedia has a list of lightweight markup languages: https://en.wikipedia.org/wiki/Lightweight_markup_language

In my opinion the source used for generating the docs should be written in one the listed markup languages, i.e. plain text files and not binary files like doc or docx or odt. Of course that text files have to be under version control (my obvious preference: git+github). Then use some tool that produces HTML, PDF, RTF and ePub output from the source.

From what I have heard/seen the following combinations have the largest user base:

rst plus sphinx (possibly in combination with readthedocs)

markdown (the pandoc dialect) plus pandoc

asciidoc plus asciidoctor

commonmark plus e.g. pandoc

commonmark.org has some very strong supporters, including github, stackoverflow and the main pandoc developer jgm. As I understand, commonmark gives a strict specification of the most common elements, pandoc markdown knows a lot more elemts, and rst has the most features. The commonmark spec does e.g. not include tables or academic style citations.

Online markdown editors: http://dillinger.io/ https://github.com/yoavram/markx

Atom.io preview packages: https://atom.io/packages/markdown-preview https://atom.io/packages/markdown-preview-pandoc https://atom.io/packages/rst-preview-pandoc

Other possibly interesting pages: https://en.wikipedia.org/wiki/Comparison_of_documentation_generators https://www.gitbook.com/ http://scholarlymarkdown.com/

modelica-trac-importer commented 6 years ago

Comment by Matthis Thorade on 19 Aug 2015 13:51 UTC Some more minor comments: github handles image diffs in a close to perfect manner: https://help.github.com/articles/rendering-and-diffing-images/ Here is an example diff: https://github.com/thorade/Spoon-Knife/commit/29f958ede10221d87202c30db58410dd76b43399

Atom preview package for asciidoc: https://atom.io/packages/asciidoc-preview https://atom.io/packages/asciidoctor-preview

modelica-trac-importer commented 6 years ago

Comment by otter on 14 Sep 2015 21:15 UTC In my summer vacation I wanted to learn the Go language from Google and used as a concrete project the proposal sketch of comment:67 to get a "feeling" for it. It turned out that this was easier as I anticipated (due to package goquery) and at the end of the vacation it was nearly complete.

You find the Go program makeWebBook here.

You find a Web book that describes all details and serves as a "proof-of-concept" here (note, I concentrated only on the numbering facility and how to handle equations, and just used a very simple layout). Note, all section, figure, table, equation numbers and cross references in this book, as well as all navigation bars and the table-of-contents have been automatically constructed by makeWebBook.

Here is a short sketch what the program does:

It is expected that the book is defined by a set of HTML files and in a configuration.json file the order of these files is defined. The makeWebBook program adds then the usual elements found in scientific reports. In particular, if an element starts with the text "Chapter" or "Appendix" the following action is performed (otherwise the section is not modified; this is useful for a preface or a reference chapter):

,
,
,
elements are updated with section numbers.

elements are updated with table numbers.

elements are updated with figure numbers.

\$\$ ... \$\$
are updated with equation numbers.

.. elements are updated with file name, link text and tool tip if the link points to a position in the book.

If a number is not present, it is introduced (with exception of
element, where a number is only introduced if the text starts with "Chapter" or with "Appendix"). If it is present and correct, nothing is changed. Otherwise, the number is updated.

A navigation bar is introduced in all files with links to the "table of contents" file, the previous, and the next file.

The "table of contents" file is updated with the actual document structure.

Go programs are extremly portable. I only generated an executable for windows that you can directly use (download and run; no installation needed). There are some good ideas in Go that we could also use in Modelica.

modelica-trac-importer commented 6 years ago

Comment by otter on 20 Sep 2015 18:01 UTC Here are some interesting links:

The journal "Simulation Modelling Practice and Theory" has an open access part where articles are present as web-articles and optionally can be downloaded as pdf.

Here is an example that shows that the web-articles have section, figure, table, equation numbering and cross references and that it looks good. For equations they use an interesting concept: By default they are displayed with bitmaps. With a button it is possible to switch to a MathJax representation.

modelica-trac-importer commented 6 years ago

Modified by dietmarw on 2 Dec 2015 10:12 UTC

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 13 Mar 2016 17:27 UTC Dietmar has a simple fix that adds support for writing section number references in Sphinx. (Assuming the spec does not contain any book parts; only chapters)

http://modelica.readthedocs.org/en/numbered/operators.html#expressions shows that the link now shows up as "Section 4.7 Built-in Intrinsic Operators with Function Syntax" instead of "Section Built-in Intrinsic Operators with Function Syntax".

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 13 Jun 2016 10:25 UTC Replying to [comment:80 otter]:

Here are some interesting links:

* The journal "Simulation Modelling Practice and Theory" has an open access part where articles are present as web-articles and optionally can be downloaded as pdf. * Here is an example that shows that the web-articles have section, figure, table, equation numbering and cross references and that it looks good. For equations they use an interesting concept: By default they are displayed with bitmaps. With a button it is possible to switch to a MathJax representation.

The question is how they do it; clearly the source is LaTeX as can be seen from the author instructions: https://www.elsevier.com/journals/simulation-modelling-practice-and-theory/1569-190X/guide-for-authors#25000 https://www.elsevier.com/authors/author-schemas/latex-instructions

But looking at the HTML-source does not reveal their tool-chain (except for optimizer which is irrelevant). They might be using open-source tools, proprietary - or in-house ones (Elsevier is many things, but not poor.) -- I also tried to convert parts to LaTeX and generate pdf and html (attached); while keeping it as close to the current form as possible (as previously indicated there are many things we should change regardless of this - the idea of text-rich tables is not good, and we could gradually switch to using \nameref instead of \ref).

PDF is straightforward. The best option for HTML seemed to be LaTeXML - I noticed that earlier in this thread others had made images for math work, so I didn't bother with those instructions, and thus the math for spatialDistribution is broken.

Note that the goal was to have as little LaTeX-code as possible in the document - and especially not adapt the document to some specific format.

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 22 Jun 2016 17:33 UTC Based on our discussion today, I had a look at if rtfd.org supports latexml, and it seems it does not. It does have the ability to use pip to install python packages and it does have texlive-full installed, but latexml is not part of texlive.

rtfd.org (readthedocs.org) supports integration into git/svn (and plugs into github seamlessly). It also by default uses themes that look rather nice on phones and has a nice way to change specification version (to compare one spec to another), download zip'ed versions of the documentation, pdf version. And it would also allow to link spec.modelica.org to be hosted on their servers. All for free without major effort.

The latexml did look pretty OK, but is of course missing a nice theme and would need some effort to setup all the things you get for free with rtfd.

I also know that Sphinx can be customized rather easily (today's change to make @otter happy was one line; some of the features we felt were lacking since last year are now in the default Sphinx, like SVG images for equations). I am a bit unfamiliar with latexml customizations (and Perl). If it uses xsltproc for its XML to HTML I guess it would be really annoying to make changes and if it is Perl even worse ;)

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 13 Sep 2016 14:27 UTC Examine documents: The Sphinx variant looks good, but somewhat non-intuitive to read&edit source.

One topic discussed was images: One idea is to create the images in Modelica format (they are usually diagrams) - we can then export as svg/png etc. Many of them were created in that way - but the Modelica model is not always available. Having a repository for them regardless of which document format we use.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 19 May 2017 12:54 UTC Have document showing the new proposed documentation formats ready a week before the next design meeting.

Who will do it?

The ones proposing the formats will do it?

Actual conversion of entire document by contract.

Other comments:

Possible formats include: markdown, latex, html.

Want standardized format.

Wysiwyg editors - without extra stuff.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 7 Jun 2017 14:13 UTC What happened to the rst format which is the original proposal? Which including a demo (http://modelica.readthedocs.io) and based on proven and robust technology. The MCP can also be found here: http://modelica.readthedocs.io/en/latest/MCP.html

In addition a poll showed that the majority of participants would be in favour of rst already 2 years ago after which the whole process came to a grinding halt.

modelica-trac-importer commented 6 years ago

Comment by otter on 2 Oct 2017 16:37 UTC Replying to [comment:4 Dietmar Winkler]:

We've now updated the batch files for the IT(department)-challenged people. If you clone/pull https://github.com/sjoelund/MLS-rst-spec there are now two files that you can simply click: * installSphinx.bat will automatically install locally python 2.7.10 (we included the install binary in the repo to make it fail proof for now) and all the necessary sphinx dependencies. * make-html.bat is simply a wrapper for make.bat html which lets you generate the HTML with a simple double-click.

Please let us know if there are still problems. This has now been tested on Windows 7 and Windows XP.

I have made a quick test of Sphinx again (under Windows 7): When using the Anaconda distribution of Python, Sphinx is already in the distribution. The further (simple) steps are described in First Steps with Sphinx:

Open a dos/cmd shell and type "sphinx-quickstart". After answering questions, a basic document structure is generated, including a make.bat.

"make html" generates the html version in the build-directory.

To summarize, if the Anaconda distribution is installed, it seems to be easy to technically use Sphinx on Windows.

modelica-trac-importer commented 6 years ago

Comment by beutlich on 10 Oct 2017 19:28 UTC

More complete LaTeX-test with generated HTML and pdf

I uploaded the pages of attachment:TestLatex.zip here as a quick test. (Needed to fix the img src tags.)

modelica-trac-importer commented 6 years ago

Comment by otter on 17 Oct 2017 21:53 UTC The generated HTML of the LaTeX-test does not look so good. I made some small modifications to the css files. With these changes the book looks much better, see

https://martinotter.github.io/ModelicaSpecificationTestWithLatexML/

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 18 Oct 2017 16:44 UTC Design meeting:

Latex with html as main output: Favor: 2+9 Against: 0 Abstain:1 Assuming links can be changed to stable ones.

Restructured text with html: Favor: 2 Against: 1 Abstain: 9

Henrik: List requirements and give date when it shall be fixed. Martin S: Check if minted works with latexml (not in released version)

(To do for document: wider page and tables. Skip left-css for mobile. No indentation of first line of paragraph.)

Requirements:

Use stable links as links (not chapter numbers)

Sub-sub-sub section numbering

Searchable in a good way - can we add keyword entries for Google search? The index should be about where to find something related to e.g. "input" not everywhere "input" is mentioned

Date: End of November?

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 22 Nov 2017 09:05 UTC Just a side note. It is possible to get rid of the MathJax dependency by running a single script mjpage which will replace all MathJax code with CSS and SVG equivalents. This is successfully implemented in the "Modelica by Example" HTML generator based on Sphinx.

One other aspect and great benefit of Sphinx which seems to have been forgotten is the possibility to

Have an automatic generated rendered version available under https://readthedocs.org/ (e.g., Martin Sjölund's prototype is available under https://modelica.readthedocs.io)

Have additional output-formats available: ebooks, PDFs

If people are reluctant to use restructured text, I wonder if somebody looked at http://leebyron.com/spec-md as suggested by Michael Tiller. The benefit of MarkDown (and reST) over LaTeX is also that these can be rendered directly in GitHub.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 5 Dec 2017 12:33 UTC Replying to [comment:91 Hans Olsson]:

Requirements: * Use stable links as links (not chapter numbers)

There is an open ticket: https://github.com/brucemiller/LaTeXML/issues/895 containing code to solve it. All that is missing is changing it to an UI-option, and some things we don't currently need.

* Sub-sub-sub section numbering

https://github.com/brucemiller/LaTeXML/pull/897 (Note: It was trivial to hack this earlier, and I assume this solution is trivial to people who know perl.)

* Searchable in a good way - can we add keyword entries for Google search? The index should be about where to find something related to e.g. "input" not everywhere "input" is mentioned

Using \index{...} adds meta name=keywords content="..." in HTML (in the correct file), which is allegedly used by Google, https://www.metatags.org/meta_name_keywords

modelica-trac-importer commented 6 years ago

Comment by henrikt on 8 Dec 2017 12:58 UTC I hope that we will also get a nice printed index that allows the reader of the specification to go through all the indexed occurrences of a keyword.

modelica-trac-importer commented 6 years ago

Comment by dietmarw on 13 Feb 2018 16:45 UTC For those who are not feeling comfortable with a side-by-side preview of MarkDown editors, there now exists a very powerful, cross-plattform WYSIWYG editor:

https://typora.io/

modelica-trac-importer commented 6 years ago

Comment by henrikt on 28 Feb 2018 10:28 UTC With less than a month remaining until the next design meeting, what happened with "End of November"?

modelica-trac-importer commented 6 years ago

Comment by sjoelund.se on 28 Feb 2018 10:47 UTC Replying to [comment:91 Hans Olsson]:

Martin S: Check if minted works with latexml (not in released version)

Neither minted (running pygments directly from LaTeX via shell-escape) nor pygmentize works with latexml (pygmentize uses a package called pygtex to simply colour and list files, but latexml does not support it, like so many other LaTeX packages). I tested this with the latest github sources of latexml since the Ubuntu package for latexml is broken since many years back...

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 28 Feb 2018 10:52 UTC Replying to [comment:96 Henrik Tidefelt]:

With less than a month remaining until the next design meeting, what happened with "End of November"?

I will have a bit more time to look at this the following weeks.

However, https://trac.modelica.org/Modelica/ticket/1730?replyto=96#comment:93 contained attempted solutions for the listed issues at start of December (minted was unclear). Following the links from that comment it seems that 2 of the problems are fully solved, but the stable links solution is not yet complete - even if it seemed to work.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 20 Mar 2018 11:15 UTC Plan:

Hans convert Modelica 3.4 before next meeting.

Place on Github: as "modelica/ModelicaLanguage" and possibly rename "modelica/Modelica" to "modelica/ModelicaStandardLibrary"

Check that ok

Continuous integration (webmaster, M. Sjölund)

Convert trac-tickets (T. Beutlich): Leo suggested to link to stable urls on modelica.org that forward to github-tickets.

modelica-trac-importer commented 6 years ago

Comment by hansolsson on 21 Jun 2018 15:43 UTC Source for the above in https://github.com/HansOlsson/ModelicaSpecification

I'm fully aware of some overfull hboxes etc, and some other issues - and I might have missed some more.

Generation of pdf was with pdflatex Generation of HTML was using (and css from Martin Otter): latexml MSL.tex --dest MSL.xml latexmlpost MSL.xml -format html -pmml --splitat=chapter --javascript=LatexML-maybeMathJax.js --navigationtoc=context --css=LaTeXML-navbar-left.css --dest MSL.html

Using LaTeXML from https://github.com/HansOlsson/LaTeXML/tree/UseLabel

The layout should be improved - currently I stuck with the original even if far from good, etc; and there are overfull/underfull hboxes to fix.

The only changes that were deliberate was switching from table in synchronous part, removing section (preparing for smarter one), placement of logo, and #2253. (LaTeXML didn't handle the original broken math.)

Previous Next

© Githubissues.

Githubissues is a development platform for aggregating issues.