jgm / pandoc

Universal markup converter
https://pandoc.org

Lists in Word conversions should use conventional styles and indents #7280

Open arencambre opened 3 years ago

arencambre commented 3 years ago

This relates to #7275 because I would like lists to match the native styles of Word and of HTML. I am opening this as a separate issue because its resolution could differ from that of #7275.

Suppose you have this Markdown unordered list:

* Is this indented?
    * Proper vertical space here?
    * How about here?
* Back to first level.
    * Vertical spacing again. Issue?
    * How does this look?

When pandoc converts this to a Word document, I see these deviations from the conventions of both Word and HTML (as viewed in browsers):

These are just the differences I see in what I have recently done. A more exhaustive review may find more deltas.

The additional vertical spacing corresponds to a configuration difference between a pandoc-generated Word document and native Word list styling. When I select a list item in its entirety in Word and then click the arrow at the bottom right of the Paragraph section of the Home ribbon, I see that the "Don't add space between paragraphs of the same style" box is unchecked in the pandoc-generated document. If you open a new, blank document in Word 365 and create an identical list by hand, that checkbox is checked. Checking that box in the pandoc-generated document eliminates the undesired extra vertical spacing.
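For anyone who wants to check this setting without clicking through Word: in WordprocessingML, that checkbox is stored as a `w:contextualSpacing` element inside a paragraph's or style's `w:pPr`. The following is a minimal sketch, not anything pandoc ships, that lists the styles in a `styles.xml` part with the flag switched on. The function names and the sample XML (including its two style IDs) are illustrative assumptions.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def styles_with_contextual_spacing(styles_xml):
    """Return the styleIds whose w:pPr switches on w:contextualSpacing."""
    root = ET.fromstring(styles_xml)
    found = []
    for style in root.findall("w:style", NS):
        flag = style.find("w:pPr/w:contextualSpacing", NS)
        if flag is None:
            continue
        # A missing w:val means "on"; 0/false/off explicitly disable it.
        val = flag.get(f"{{{W}}}val", "true")
        if val not in ("0", "false", "off"):
            found.append(style.get(f"{{{W}}}styleId"))
    return found


def read_part(docx_path, part="word/styles.xml"):
    """Read one XML part out of a .docx file (which is a zip archive)."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read(part)


# Hand-written sample for illustration only, not copied from a real document.
SAMPLE = """<w:styles xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:style w:type="paragraph" w:styleId="ListParagraph">
    <w:pPr><w:contextualSpacing/></w:pPr>
  </w:style>
  <w:style w:type="paragraph" w:styleId="Compact">
    <w:pPr><w:spacing w:before="36" w:after="36"/></w:pPr>
  </w:style>
</w:styles>"""

print(styles_with_contextual_spacing(SAMPLE))  # ['ListParagraph']
```

On a real file, `styles_with_contextual_spacing(read_part("output.docx"))` would run the same check, where `output.docx` is a placeholder path. The flag can also be set directly on individual paragraphs in `word/document.xml`, which this sketch does not inspect.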

Some solutions (meant as ideas, not as an exhaustive list):

Pindar777 commented 2 years ago

@arencambre I would like to add the following observation: for pandoc > 2.10.1 (since pandoc-citeproc.exe is no longer used), changing the layout of the default list style in a referenced DOCX template has no effect on the generated DOCX document (at least when knitted from an Rmd file in GNU R). That is why I keep using the version from 15.09.2020.

Hint: The change in the Word template needs to be done as "a manual parameter input" after "right click on the list level". Just using the ruler or changing the list style does not work.

asknet commented 8 months ago

Is there any known workaround to customize list/bullet styles? Thanks!

jgm commented 8 months ago

This is an old issue, so I wanted to check to see if it's still valid. Here is what I see when I use pandoc to convert (changing the font to the one Word now uses by default):

[image: Screenshot 2024-04-03 at 9:13:33 AM]

And here is what I see when I open a blank document in Word and type in the same:

[image]

There's no significant difference in indentation or line spacing that I can see. There is a difference in the choice of bullets, which we could change, but I don't know how much that matters.

asknet commented 7 months ago

Apologies, my earlier comment wasn't clear. I'm trying to figure out whether the bullet styles are customizable via a Word reference template. My lists are rendered with a - bullet, which is not the preferred style in my environment. I couldn't find a way to change the bullet style to something else.

Thanks 🙏🏽

jgm commented 7 months ago

@asknet - not currently. But your question is only tangentially related to this issue. I would still like clarification from @arencambre on the question I ask above.

arencambre commented 7 months ago

Suppose I create this QMD document:

---
title: "Untitled"
format: docx
---

## Test

* Is this indented?
    * Proper vertical space here?
    * How about here?
* Back to first level.
    * Vertical spacing again. Issue?
    * How does this look?

If I render it, I get this Word document: [image]

Compare that to what I get if I open a new Word document, type in the same list, and do not modify any styles: [image]

Microsoft's defaults are sensible. Pandoc's departure from them feels arbitrary.

jgm commented 7 months ago

Currently we create our own abstractNum elements in numbering.xml, 990 for plain lists and 991 for bulleted lists. Instead, perhaps we could use abstractNum 0 and 1 for these and just take those from the numbering.xml in the reference.docx. That would allow them to be set to Word defaults and customized. It could be that there is some problem in principle that prevented me from doing it this way in the first place, but I will experiment.

jgm commented 7 months ago

OK, I see that there isn't much consistency to the numbers Word uses for abstractNum. If I create a document with a bullet list and a plain list, abstractNum = 0 is the plain list, but if the document has just a bullet list, it's the bullet list. If the document begins with an ordered list, abstractNum = 0 is the ordered list. So we can't rely on that. But I could just manually reproduce Word's default for our current abstractNum = 991 (which we use for bullet lists), instead of taking this from the reference.docx.
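To see which abstractNum IDs a given document actually uses, one can read `word/numbering.xml` directly. The sketch below is a rough illustration rather than pandoc's own code: it maps each `w:abstractNum` to the `w:numFmt` of its top level (`ilvl` 0). The function names and the sample XML are assumptions made up for the example.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def q(name):
    """Qualify a local name with the WordprocessingML namespace."""
    return f"{{{W}}}{name}"


def summarize_abstract_nums(numbering_xml):
    """Map each abstractNumId to the numFmt value of its top level (ilvl 0)."""
    root = ET.fromstring(numbering_xml)
    summary = {}
    for abstract in root.findall(q("abstractNum")):
        abstract_id = abstract.get(q("abstractNumId"))
        fmt = None
        for lvl in abstract.findall(q("lvl")):
            if lvl.get(q("ilvl")) == "0":
                num_fmt = lvl.find(q("numFmt"))
                if num_fmt is not None:
                    fmt = num_fmt.get(q("val"))
                break
        summary[abstract_id] = fmt
    return summary


def read_numbering(docx_path):
    """Read word/numbering.xml out of a .docx file (a zip archive)."""
    with zipfile.ZipFile(docx_path) as zf:
        return zf.read("word/numbering.xml")


# Hand-written sample for illustration only, not copied from Word or pandoc.
SAMPLE_NUMBERING = """<w:numbering xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:abstractNum w:abstractNumId="0">
    <w:lvl w:ilvl="0"><w:numFmt w:val="bullet"/><w:lvlText w:val="o"/></w:lvl>
  </w:abstractNum>
  <w:abstractNum w:abstractNumId="1">
    <w:lvl w:ilvl="0"><w:numFmt w:val="decimal"/><w:lvlText w:val="%1."/></w:lvl>
  </w:abstractNum>
</w:numbering>"""

print(summarize_abstract_nums(SAMPLE_NUMBERING))  # {'0': 'bullet', '1': 'decimal'}
```

Running `summarize_abstract_nums(read_numbering(path))` on a few Word-authored files (with `path` a placeholder) is one way to observe the inconsistency described above, since the ID that ends up holding the bullet definition depends on what the document contains.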

roberto-sebastiano commented 7 months ago

Upvote here. I'm trying to style a list (either numbered or bulleted) from Markdown to Docx.

Any update?

jgm commented 7 months ago

Well, all the updates there are can be found right above your comments. These changes make the style pandoc emits match Word's defaults. They do not allow you to customize bullets etc.

vstiebe commented 3 months ago

I found this issue because I'm looking to add a prefix to the items of the resulting numbered list in DOCX. The use case is generating draft contract agreements from Markdown.

CLAUSE 1st
CLAUSE 2nd
CLAUSE 3rd
CLAUSE 4th
and so on.