jgm / pandoc

Universal markup converter
https://pandoc.org

ICML paragraph styles ending with "> Paragraph" #8333

Open ptram opened 2 years ago

ptram commented 2 years ago

Hi,

I see that paragraph styles are converted with a "> Paragraph" extension added to the original paragraph style name.

Is there a reason for this redundant information? I find it a bit of an annoyance, since it will break the stylesheet, and make reading the name in the style palette more difficult.

Is there a way to prevent this extension from being added, maybe with an option passed to Pandoc together with the other arguments?

Paolo

iandol commented 2 years ago

Here is an example:

▶︎ pandoc -t icml
::: {custom-style="Poetry"}
| A Bird came down the Walk---
| He did not know I saw---
:::

<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Poetry &gt; Paragraph">
  <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
    <Content>A Bird came down the Walk—</Content>
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
    <Content>&#x2028;</Content>
  </CharacterStyleRange>
  <CharacterStyleRange AppliedCharacterStyle="$ID/NormalCharacterStyle">
    <Content>He did not know I saw—</Content>
  </CharacterStyleRange>
</ParagraphStyleRange>

ptram commented 2 years ago

Here is an example:

Exactly. Is there a way to prevent the &gt; Paragraph part from being automatically added to the name of the paragraph style?

Paolo

iandol commented 2 years ago

Not that I know of, I think the line that does this is this one:

https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Writers/ICML.hs#L527

But I don't know any Haskell and can't grok why it appends > Paragraph.

As a quick workaround, you can use a text editor or a script to remove this fragment before importing into InDesign.
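
The scripted workaround could look roughly like the sketch below. It is only an illustration, not part of Pandoc: the function name `strip_paragraph_suffix` is made up here, and it assumes the suffix always appears XML-escaped as ` &gt; Paragraph` immediately before the closing quote of an attribute value, as in the output above.

```python
import re

# Assumption: the unwanted suffix is written as " &gt; Paragraph" and sits
# directly before the closing double quote of an attribute value.
SUFFIX = re.compile(r' &gt; Paragraph(?=")')


def strip_paragraph_suffix(icml: str) -> str:
    """Return the ICML text with every trailing ' > Paragraph' removed from quoted names."""
    return SUFFIX.sub("", icml)


sample = '<ParagraphStyleRange AppliedParagraphStyle="ParagraphStyle/Poetry &gt; Paragraph">'
cleaned = strip_paragraph_suffix(sample)
print(cleaned)
```

Run over the whole file rather than single lines, this keeps any style declarations and the ranges that reference them consistent with each other. Names produced by deeper nesting are longer, and this sketch would only drop their final segment.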

jgm commented 2 years ago

I don't know a thing about ICML. @mb21 wrote the ICML writer, if I recall, and might be able to explain why this is here.

mb21 commented 2 years ago

If you want to differentiate between normal paragraphs and paragraphs inside your Poetry block, you need two different style names. That's why we generate the name "Poetry > Paragraph". Try nesting things more deeply (e.g. inside lists or blockquotes) and you'll see how it works.

See also https://github.com/jgm/pandoc/wiki/Importing-Markdown-in-InDesign

ptram commented 2 years ago

If you want to differentiate between normal paragraphs and paragraphs inside your Poetry block, you need two different style names.

I understand. My objection is that this is not always needed (to be honest, it is very rarely needed), and it can be done at the source. Automatically changing the original style names scrambles the stylesheet, and makes the use of Pandoc as an interchange format more complicated.

Take for example this case: You are conforming to a company or publishing house's style. You write/edit with those styles. When converting to ICML, you get the paragraph style names changed.

At this point, you have to manually edit all the converted ICML files and all the styles, to avoid breaking compatibility with the house style.

Paolo

ptram commented 2 years ago

By the way, sorry if this should be clear to most, but I'm new to the conversion filters and I'm not a coder. Is there a way, on my side, to use a customized version of the "Text.Pandoc.Writers.ICML" module and try some changes? Are there instructions on how to install it on my Mac (Mojave)?

Paolo

jgm commented 2 years ago

https://pandoc.org/installing.html#compiling-from-source

ptram commented 2 years ago

https://pandoc.org/installing.html#compiling-from-source

Thank you, John. Let's see if I can dirty my fingers with the compilers, without cutting them!

Paolo

mb21 commented 2 years ago

Automatically changing the original style names scrambles the stylesheet

Not sure I understand your workflow. Your input format is markdown? Can you post an example?

Your problem is specific to the custom-style attribute?

Try the following example and you'll see why we need to generate style names. You'll have to adjust your company's template anyway if you want your input to be markdown:

## title

my para

> my blockquote

::: {custom-style="Aside"}
## Title in the aside

my para in the aside

> my blockquote in the aside
:::

ptram commented 2 years ago

Try the following example

I get this result:

[image]

What is not working, here, for me, is that the system decides the style names, taking the place of the author. A stylesheet is always pre-existing, inherited either from the publishing house or previous projects from which you are reusing some parts.

A stylesheet always includes style names. If the converter insists on creating different ones, you end up with something that is in practice impossible to edit, due to the high number of styles to be manually renamed in the various files of a project.

This is a snippet from a document made in Scrivener, from an original in InDesign, and converted to Pandoc markdown:

::: {custom-style="Heading 3"}
Heading 3 paragraph
:::

::: {custom-style="Image"}
{image.tiff}
:::

In the original document, I have the "Heading 3" and "Image" paragraph styles applied to these paragraphs.

The Pandoc-->icml converter changes the names into "Heading 3 > Paragraph" and "Image > Paragraph". They no longer match the stylesheet. Everything has to be applied manually line by line, without being able to let InDesign match the incoming text with the template's styles.

Paolo

mb21 commented 2 years ago

What is not working, here, for me, is that the system decides the style names

I see. Then you need to either post-process the output (with something like sed), or use a different tool.

Wrapping something in divs like you did in your example (the ::: syntax) adds an additional element to the pandoc document AST (you can see this better with -t html or -t native); it's not renaming the style of the element it's wrapping. That's because it's intended for things like aside boxes or similar, which can contain arbitrary content.
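
To make the nesting concrete, here is a toy model in Python. It is not the ICML writer's actual code: `style_names` is a hypothetical helper, and the input is a hand-written fragment in the shape of pandoc's JSON AST (`-t json`), with one plain paragraph and one paragraph wrapped in a div carrying `custom-style="Poetry"`.

```python
# Hand-written fragment shaped like pandoc's JSON AST: a top-level Para, and a
# Div whose attributes are [id, classes, key-value pairs] wrapping another Para.
doc_blocks = [
    {"t": "Para", "c": [{"t": "Str", "c": "plain"}]},
    {"t": "Div", "c": [
        ["", [], [["custom-style", "Poetry"]]],
        [{"t": "Para", "c": [{"t": "Str", "c": "verse"}]}],
    ]},
]


def style_names(blocks, parents=()):
    """Toy model: one style name per paragraph, prefixed by enclosing div styles."""
    names = []
    for block in blocks:
        if block["t"] == "Div":
            attr, children = block["c"]
            custom = dict(attr[2]).get("custom-style")
            # The div adds a level of context; the paragraph inside keeps its own type.
            nested = parents + (custom,) if custom else parents
            names.extend(style_names(children, nested))
        elif block["t"] == "Para":
            names.append(" > ".join(parents + ("Paragraph",)))
    return names


print(style_names(doc_blocks))
```

In this model the div never replaces the paragraph; it only contributes a prefix, which is why the two paragraphs end up with different names.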

ptram commented 2 years ago

Wrapping something in divs like you did in your example (the ::: syntax) adds an additional element to the pandoc document AST (you can see this better with -t html or -t native); it's not renaming the style of the element it's wrapping. That's because it's intended for things like aside boxes or similar, which can contain arbitrary content.

From what I can read in the other thread you linked above (containing a detailed explanation of how creating custom styles works), it seems to me that fencing is mandatory for creating new paragraph styles.

Is there an alternative way to create them, without having them treated as contained in boxes or asides?

Post-processing the generated ICML file is a possibility, but it is also one more step, which would slow down the whole process and open up a possibility for errors. It would be great if it weren't needed, and the converter could simply generate code with the styles named as they are in the Pandoc file.

Paolo

mb21 commented 2 years ago

fencing is mandatory for creating new paragraph styles.

That's correct, because currently not all pandoc AST elements support custom attributes.