jgm / pandoc

Universal markup converter
https://pandoc.org

OpenDocument.hs: When using well-known inline styles, it should emit well-known style names (and NOT automatic styles) #7336

Open kjambunathan opened 3 years ago

kjambunathan commented 3 years ago


I am using org-citeproc [1], which uses writeOpenDocument in standalone mode to produce its output.

The problem is that in standalone mode only the ODT XML body is emitted, not the styles. Omitting the automatic style definitions is OK, as they aren't part of the body, but with NO access to the style information, a consumer who wants to use that XML fragment cannot tell what style names like T1 stand for. It would be better if, instead of 'T1', the OpenDocument writer emitted well-known style names like Emphasis and Strong, or rather PandocEmphasis, PandocStrong, etc. [2]

A cursory look at OpenDocument.hs shows that even the meanings of T1, T2, T3 are NOT stable: T1 can mean an emphasis style when exporting one document but a strong style when exporting another. The numbering depends on the order in which span styles are first seen during the export.
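To make the instability concrete, here is a minimal toy model in Python. It is not pandoc's code: the function names, the `emph`/`strong` property sets, and the `WELL_KNOWN` table are all hypothetical, and the well-known names simply follow the PandocEmphasis / PandocStrong suggestion above. It only assumes that automatic names are handed out in first-seen order, as described.

```python
from typing import Dict, FrozenSet, List

# Hypothetical well-known names (an assumption for illustration only;
# these are not names that pandoc actually emits).
WELL_KNOWN: Dict[FrozenSet[str], str] = {
    frozenset({"emph"}): "PandocEmphasis",
    frozenset({"strong"}): "PandocStrong",
}


def automatic_names(spans: List[FrozenSet[str]]) -> List[str]:
    """Toy model of first-seen numbering: T1, T2, ... in order of appearance."""
    seen: Dict[FrozenSet[str], str] = {}
    names = []
    for props in spans:
        if props not in seen:
            seen[props] = f"T{len(seen) + 1}"
        names.append(seen[props])
    return names


def well_known_names(spans: List[FrozenSet[str]]) -> List[str]:
    """Names depend only on the formatting, never on document order."""
    return [WELL_KNOWN[props] for props in spans]


EMPH = frozenset({"emph"})
STRONG = frozenset({"strong"})

doc_a = [EMPH, STRONG, EMPH]  # emphasis is seen first
doc_b = [STRONG, EMPH]        # strong is seen first

print(automatic_names(doc_a))   # T1 stands for emphasis in this document
print(automatic_names(doc_b))   # T1 stands for strong in this document
print(well_known_names(doc_a))  # same formatting always gets the same name
print(well_known_names(doc_b))
```

With fixed names, a consumer of the body fragment can interpret a span without ever seeing the style definitions.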

<text:p text:style-name="Text_20_body">Brandom, Robert. 1994.
<text:span text:style-name="T1">Making It Explicit
</text:span>. Harvard University Press.
</text:p>
<text:p text:style-name="Text_20_body">Hofweber, Thomas. 2007. “Innocent Statements and Their Metaphysically Loaded Counterparts.”
<text:span text:style-name="T1">Philosophers’ Imprint
</text:span> 7 (1).
</text:p>
<text:p text:style-name="Text_20_body">Russell, Bertrand. 2001. “Descriptions.” In
<text:span text:style-name="T1">The Philosophy of Language
</text:span>, edited by A. P. Martinich, Fourth, 221–27. Oxford University Press.
</text:p>

Note the use of T1.
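From the consumer's side, all that can be recovered from such a fragment is the bare style name. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `span_style_names` and the wrapping `root` element are assumptions made for illustration, needed because the fragment itself carries no namespace declaration for the `text:` prefix.

```python
import xml.etree.ElementTree as ET

# OpenDocument "text" namespace URI.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def span_style_names(fragment: str) -> set:
    """Collect the style names referenced by text:span elements."""
    # Hypothetical wrapper element: it only exists to bind the text: prefix,
    # which the emitted body fragment does not declare itself.
    wrapped = f'<root xmlns:text="{TEXT_NS}">{fragment}</root>'
    root = ET.fromstring(wrapped)
    return {
        span.get(f"{{{TEXT_NS}}}style-name")
        for span in root.iter(f"{{{TEXT_NS}}}span")
    }


fragment = """<text:p text:style-name="Text_20_body">Brandom, Robert. 1994.
<text:span text:style-name="T1">Making It Explicit
</text:span>. Harvard University Press.
</text:p>"""

# Only the name comes back; nothing in the fragment says what it means.
print(span_style_names(fragment))
```

The name T1 is all the fragment offers; whether it denotes emphasis, strong, or something else has to come from style definitions that are not part of the output.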

Here is a sample output from the HTML writer.

<div id="refs" class="references">
  <div id="ref-Brandom1994">
    <p>Brandom, Robert. 1994.
      <em>Making It Explicit
      </em>. Harvard University Press.
    </p>
  </div>
  <div id="ref-Hofweber2007">
    <p>Hofweber, Thomas. 2007. “Innocent Statements and Their Metaphysically Loaded Counterparts.”
      <em>Philosophers’ Imprint
      </em> 7 (1).
    </p>
  </div>
  <div id="ref-Russell1919">
    <p>Russell, Bertrand. 2001. “Descriptions.” In
      <em>The Philosophy of Language
      </em>, edited by A. P. Martinich, Fourth, 221–27. Oxford University Press.
    </p>
  </div>
</div>

[1] https://github.com/kjambunathan/org-citeproc.

This is a very old org-citeproc by Richard Lawrence, which I have modified to compile against recent monadic changes.

[2] Emacs' Org mode ODT exporter, of which I am the sole author, uses well-known styles instead of automatic styles for bold, italic, etc.

jgm commented 3 years ago

The OpenDocument writer is a very old part of pandoc. I've never really understood why automatic styles were used (though there probably is some explanation), but the original author of that module isn't in touch any more. Unfortunately I don't know a lot about the OpenDocument format, so I'm not inclined to touch the writer myself, but this would be a great project for someone who does know OpenDocument.