jgm / pandoc

Universal markup converter
https://pandoc.org

"First paragraph" style in ICML export #3457

Open p3palazzo opened 7 years ago

p3palazzo commented 7 years ago

The first paragraph style seems to have gone missing in ICML export. It renders fine in docx/odf export but is not applied when exporting to ICML, and is a very convenient thing to have for nice typography.

mb21 commented 7 years ago

I see what you mean:

$ echo -e '#foo\n\nfoo' | pandoc -t opendocument
<text:h text:style-name="Heading_20_1" text:outline-level="1">foo</text:h>
<text:p text:style-name="First_20_paragraph">foo</text:p>

$ echo -e '#foo\n\nfoo' | pandoc -t icml
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Header1">
  <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
    <Content>foo</Content>
  </CharacterStyleRange>
</ParagraphStyleRange>
<Br />
<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Paragraph">
  <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
    <Content>foo</Content>
  </CharacterStyleRange>
</ParagraphStyleRange>

Each writer handles such things individually. The ODT/OpenDocument writer, for example, has a dedicated setFirstPara function. We should probably do something similar in the ICML writer.

Then again, things are a bit more complicated in ICML, since ideally we would have many more first and last styles, and not only on paragraphs; see https://github.com/jgm/pandoc/issues/2315

rafkot commented 3 years ago

A relatively straightforward workaround I am using to add the first paragraph style to ICML output is:

1) Create a temporary copy of my markdown source (so the markdown source which I edit by hand remains readable).

2) Run a regexp on it (I am using a Perl one-liner) to add a custom style, using the fenced div syntax, after headings (and wherever else it's needed), so:

Content of the first paragraph

becomes:

::: {custom-style="FirstParagraph"}
Content of the first paragraph
:::

3) Use Pandoc to convert that new temporary file to ICML.

This produces the "FirstParagraph" style when imported to InDesign. I do all of this in a bash script which also applies other ICML-specific transformations, as that is conceptually simpler for me than writing a Pandoc filter.
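A minimal sketch of steps 1 and 2 above, written in Python rather than Perl (the original one-liner is not shown in this thread). The function name `wrap_first_paragraphs` and both patterns are illustrative assumptions, not rafkot's actual script. It assumes that blocks are separated by blank lines and that ATX headings have a space after the `#`; it does not handle setext headings or fenced code blocks that contain blank lines.

```python
import re

# An ATX heading: one to six '#' characters, whitespace, then text.
HEADING = re.compile(r"#{1,6}\s+\S")
# Blocks starting like this are not treated as plain paragraphs:
# headings, fenced divs, code fences, block quotes, tables,
# bullet or numbered lists, and indented code.
NOT_PARAGRAPH = re.compile(r"(#|:::|```|~~~|>|\||[-+*]\s|\d+[.)]\s|\s)")


def wrap_first_paragraphs(markdown, style="FirstParagraph"):
    """Wrap the paragraph that directly follows each heading in a fenced div."""
    blocks = re.split(r"\n\s*\n", markdown.strip("\n"))
    result = []
    previous_was_heading = False
    for block in blocks:
        if previous_was_heading and not NOT_PARAGRAPH.match(block):
            block = '::: {custom-style="%s"}\n%s\n:::' % (style, block)
        # A wrapped block starts with ':::', so it never counts as a heading.
        previous_was_heading = HEADING.match(block) is not None
        result.append(block)
    return "\n\n".join(result) + "\n"


if __name__ == "__main__":
    sample = "# Title\n\nFirst para\n\nSecond para\n"
    print(wrap_first_paragraphs(sample), end="")
```

The result can be written to a temporary file and passed to `pandoc -t icml`, as in step 3.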

mb21 commented 3 years ago

@rafkot thanks for posting! You should be able to do the same transformation with a pandoc filter as well...
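For readers who want to try the filter route, here is a minimal sketch of a JSON filter using only the Python standard library. It works on pandoc's document AST before the ICML writer runs, wrapping each top-level paragraph that directly follows a heading in a div with a `custom-style` attribute. The function names and the `FirstParagraph` default are assumptions made for illustration; the sketch does not descend into nested blocks such as block quotes or existing divs, and it has not been checked against every pandoc version.

```python
import json
import sys


def first_paragraph_div(para, style="FirstParagraph"):
    """Wrap a Para block in a Div carrying a custom-style attribute."""
    # Div content is [attr, blocks]; attr is [identifier, classes, key-value pairs].
    return {"t": "Div", "c": [["", [], [["custom-style", style]]], [para]]}


def mark_first_paragraphs(blocks, style="FirstParagraph"):
    """Return a new block list with post-heading paragraphs wrapped."""
    out = []
    previous_was_header = False
    for block in blocks:
        if previous_was_header and block.get("t") == "Para":
            out.append(first_paragraph_div(block, style))
        else:
            out.append(block)
        previous_was_header = block.get("t") == "Header"
    return out


def main():
    raw = sys.stdin.read()
    if not raw.strip():
        return  # no document was piped in
    doc = json.loads(raw)
    doc["blocks"] = mark_first_paragraphs(doc["blocks"])
    json.dump(doc, sys.stdout)


if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Saved under a hypothetical name such as `first_para.py`, it could be passed to pandoc with the `--filter` option when converting to ICML.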

rafkot commented 3 years ago

@mb21 I realize this could be achieved with a filter, but I haven't gotten my head around how to do that (or how to create any writer filter at all). Running a regular expression on the markdown file seemed like the easiest way, and one I can actually understand. So I shared it here in case someone needs to do a simple transformation like this without getting into the complexity of filters.

Also, a writer filter might be challenging to write right now, as I discovered that the ICML writer's output is more complex than it needs to be: it creates a redundant "CharacterStyleRange" wherever there are quotes around a bit of text (the same happens for single and double quotes). That is, the generated output has an extra CharacterStyleRange, which changes nothing, around the opening quote, the closing quote and the text between them. I was meaning to open an issue about this, and if/when this is changed, any filter depending on the 'old writer' might simply fail to perform any transformation at all.

rafkot commented 3 years ago

Also, I noticed that a CustomStyle > FirstParagraph style is created in ICML for the first paragraph in a "div" when a custom style is applied with the fenced div syntax around several paragraphs, like:

::: {custom-style="SomeCustomStyle"}

Content of the first paragraph

Content of the second paragraph

Last one

:::

which creates SomeCustomStyle > FirstParagraph in the ICML file.