Closed by bitsgalore 10 years ago
Note: getting tables right is tricky this way. See the HTML below, which is what I got after exporting from MS Word and cleaning up with Tidy:
<table class="MsoNormalTable c7" border="1" cellspacing="0" cellpadding="0">
<tr>
<td width="229" valign="top" class='c1'>
<p class="Tablecellheading"><span lang="EN-GB" xml:lang="EN-GB">Test
name</span></p>
</td>
<td width="310" valign="top" class='c2'>
<p class="Tablecellheading"><span lang="EN-GB" xml:lang="EN-GB">True
if</span></p>
</td>
</tr>
<tr>
<td width="229" valign="top" class='c3'>
<p class="Tablecell"><span lang="EN-GB" xml:lang=
"EN-GB">boxLengthIsValid</span></p>
</td>
<td width="310" valign="top" class='c4'>
<p class="Tablecell"><span lang="EN-GB" xml:lang="EN-GB">Size of box contents
equals 4 bytes</span></p>
</td>
</tr>
<tr>
<td width="229" valign="top" class='c5'>
<p class="Tablecell"><span lang="EN-GB" xml:lang=
"EN-GB">signatureIsValid</span></p>
</td>
<td width="310" valign="top" class='c6'>
<p class="Tablecell"><span lang="EN-GB" xml:lang="EN-GB">Signature equals
0x0d0a870a</span></p>
</td>
</tr>
</table>
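Pandoc does not convert this to a nicely formatted Markdown table. After some experimentation I got the above example to work by restructuring the table by hand: the header row goes into a thead element with th cells, the remaining rows are wrapped in tbody, and the p and span wrappers inside each cell are removed.
This produces something like this: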
<table class="MsoNormalTable c7" border="1" cellspacing="0" cellpadding="0">
<thead>
<tr>
<th width="229" valign="top" class='c1'>Test name</th>
<th width="310" valign="top" class='c2'>True if</th>
</tr>
</thead>
<tbody>
<tr>
<td width="229" valign="top" class='c3'>boxLengthIsValid</td>
<td width="310" valign="top" class='c4'>Size of box contents equals 4 bytes</td>
</tr>
<tr>
<td width="229" valign="top" class='c5'>signatureIsValid</td>
<td width="310" valign="top" class='c6'>Signature equals 0x0d0a870a</td>
</tr>
</tbody>
</table>
Throwing this at Pandoc produces:
|Test name|True if|
|:--------|:------|
|boxLengthIsValid|Size of box contents equals 4 bytes|
|signatureIsValid|Signature equals 0x0d0a870a|
Which will render as:
| Test name | True if |
|---|---|
| boxLengthIsValid | Size of box contents equals 4 bytes |
| signatureIsValid | Signature equals 0x0d0a870a |
So the trick here will be to automate the above changes throughout the document.
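One way to automate these table changes is sketched below, using only Python's standard library. This is a minimal sketch and not necessarily the approach that ended up in the jpylyzer repository: the function name `simplify_table` is hypothetical, and it assumes that the Tidy output is well-formed XML, that rows are direct children of the table element, and that the first row of every table is the header row. It moves that row into thead with th cells, wraps the remaining rows in tbody, and flattens the markup inside each cell to plain text.

```python
import xml.etree.ElementTree as ET


def simplify_table(table):
    """Restructure one <table> element in place (hypothetical helper).

    Assumes the first <tr> is the header row and that rows are direct
    children of <table>, as in the Word/Tidy output shown above.
    """
    rows = table.findall("tr")
    if not rows:
        return table

    for row in rows:
        table.remove(row)
        for cell in row:
            # Collapse nested <p>/<span> markup and line breaks to plain text.
            text = " ".join("".join(cell.itertext()).split())
            for child in list(cell):
                cell.remove(child)
            cell.text = text

    # Header row: <td> becomes <th>, wrapped in <thead>.
    for cell in rows[0]:
        cell.tag = "th"
    thead = ET.SubElement(table, "thead")
    thead.append(rows[0])

    # All remaining rows go into <tbody>.
    tbody = ET.SubElement(table, "tbody")
    tbody.extend(rows[1:])
    return table


# Small stand-in for the Word/Tidy export (attributes trimmed for brevity).
sample = """<table border="1">
<tr><td width="229"><p class="Tablecellheading"><span lang="EN-GB">Test
name</span></p></td><td><p><span>True
if</span></p></td></tr>
<tr><td><p class="Tablecell"><span>signatureIsValid</span></p></td>
<td><p><span>Signature equals
0x0d0a870a</span></p></td></tr>
</table>"""

table = simplify_table(ET.fromstring(sample))
print(ET.tostring(table, encoding="unicode"))
```

To process a whole document, the same function could be applied to every element returned by `root.iter("table")`. If Tidy writes an XHTML namespace declaration, the tag names carry that namespace and the lookups would need adjusting; HTML entities such as `&nbsp;` would also have to be handled before the XML parser accepts the file.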
All done, see:
https://github.com/openplanets/jpylyzer/tree/master/doc
This is now used to produce an online version of the documentation:
http://openplanets.github.io/jpylyzer/userManual.html
Export to delivery formats other than HTML needs more work ...
The source document of the User Manual is currently an MS Word document, which is awkward to edit and maintain. It also creates a dependency on proprietary software (LibreOffice / OpenOffice will mess up the layout). Some of the figures were originally created in MS PowerPoint, which has similar problems.
An alternative would be to migrate the User Manual to Markdown Extra (which includes table support), and to provide the figures as SVG. This would lower the barrier to contributing to the User Manual, and it would simplify things as well. In combination with a tool like Pandoc it would also enable us to generate versions of the User Manual in pretty much any desired delivery format (HTML, PDF, EPUB, etc.).
How to do this?
Possible workflow:
Convert HTML to Markdown. In Pandoc:
pandoc -f html -t markdown_phpextra umanual.html > umanual.md
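If the conversion has to be repeated, the Pandoc call above could be wrapped in a small script. The following is only a sketch: the helper names `pandoc_command` and `convert` are hypothetical, and it assumes that `pandoc` is available on the PATH. It uses Pandoc's `-o` option to name the output file instead of shell redirection.

```python
import subprocess


def pandoc_command(source, target, from_format, to_format):
    """Build the argument list for one Pandoc conversion (hypothetical helper)."""
    return ["pandoc", "-f", from_format, "-t", to_format, "-o", target, source]


def convert(source, target, from_format, to_format):
    """Run Pandoc; raises CalledProcessError if the conversion fails."""
    subprocess.run(pandoc_command(source, target, from_format, to_format), check=True)


# Same conversion as the command above; pass these arguments to convert() to run it.
cmd = pandoc_command("umanual.html", "umanual.md", "html", "markdown_phpextra")
print(" ".join(cmd))
```

Building the argument list separately from running it makes it easy to inspect the command first, and the same helper could later be pointed at other output formats.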