mwilliamson / mammoth.js

Convert Word documents (.docx files) to HTML
BSD 2-Clause "Simplified" License

Render ordered list type attribute #167

Open schneyer opened 6 years ago

schneyer commented 6 years ago

Hello,

First off, excellent library and very well documented, thank you for making this!!

I've been digging around in the repo and issues list and saw #162, which you rejected. I'd like to re-request that with some additional info: ordered list type is not merely an aesthetic style. There's a type attribute built into the HTML spec: https://www.w3schools.com/tags/att_ol_type.asp, and the underlying type is contained in the docx zip's numbering.xml, not in a style rule.

Without rendering this in the HTML itself it's impossible to know what type of list to render because it's not explicitly stated in the docx style, so a style mapping won't help.

It seems you are already reading the numbering.xml in the conversion process, so is there any way you could add this attribute to the rendered output?

Thanks! Evan

mwilliamson commented 6 years ago

I'm afraid I don't understand why the type attribute is necessary -- couldn't you use a class with list-style-type instead?

schneyer commented 6 years ago

Yes, either way would work in terms of presentation but I don't know which list style type to give it because the library strips it and I just receive an <ol>. Context is that the user is uploading a docx, so it's important to maintain integrity of the content -- I can't just arbitrarily assign a style of A. B. C. if the source doc is I. II. III. (for instance).

The source list type (decimal, upper/lower roman/alpha) is contained in the numbering.xml in the docx's zip, so really I'm looking for a way to have that information come through in the HTML. The reason I'm suggesting the <ol> type attribute is more to indicate that this is not merely aesthetic styling info but rather part of the content.
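
To illustrate where that information lives, here is a minimal sketch (not mammoth.js code) that pulls the number format for each list level out of the text of word/numbering.xml and maps it to an <ol> type value. The function name olTypesFromNumberingXml and the lookup table are hypothetical names made up for this example; the w:abstractNum, w:lvl and w:numFmt element names and the decimal / lowerLetter / upperLetter / lowerRoman / upperRoman values come from the WordprocessingML format. It assumes the XML has already been read out of the docx zip as a string, uses regular expressions rather than a real XML parser, and does not resolve the w:num entries that link a paragraph's numId to an abstractNumId.

```javascript
// Hypothetical lookup table: WordprocessingML w:numFmt value -> HTML <ol type> value.
// Formats with no <ol type> equivalent (for example "bullet") are left out on purpose.
const NUM_FMT_TO_OL_TYPE = new Map([
  ["decimal", "1"],
  ["lowerLetter", "a"],
  ["upperLetter", "A"],
  ["lowerRoman", "i"],
  ["upperRoman", "I"],
]);

// Hypothetical helper: takes the text of word/numbering.xml and returns
// { [abstractNumId]: { [ilvl]: olType } } for every level whose format is mappable.
// Regex-based for brevity; a real implementation should use an XML parser.
function olTypesFromNumberingXml(xml) {
  const result = {};
  // \b keeps <w:abstractNum ...> from also matching <w:abstractNumId .../>.
  const abstractNumRe =
    /<w:abstractNum\b[^>]*\bw:abstractNumId="([^"]+)"[^>]*>([\s\S]*?)<\/w:abstractNum>/g;
  // \b keeps <w:lvl ...> from also matching <w:lvlText .../> or <w:lvlJc .../>.
  const lvlRe = /<w:lvl\b[^>]*\bw:ilvl="([^"]+)"[^>]*>([\s\S]*?)<\/w:lvl>/g;

  for (const [, abstractNumId, body] of xml.matchAll(abstractNumRe)) {
    const levels = {};
    for (const [, ilvl, lvlBody] of body.matchAll(lvlRe)) {
      const fmt = /<w:numFmt\b[^>]*\bw:val="([^"]+)"/.exec(lvlBody);
      const olType = fmt ? NUM_FMT_TO_OL_TYPE.get(fmt[1]) : undefined;
      if (olType !== undefined) {
        levels[ilvl] = olType;
      }
    }
    result[abstractNumId] = levels;
  }
  return result;
}
```

For a numbering definition whose first level is upperRoman and second level is lowerLetter, the helper would report "I" for level 0 and "a" for level 1, which is the value that could then be emitted either as the type attribute or as a class carrying the matching list-style-type.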

VinayPanwar06 commented 3 years ago

Hi, I'm facing the same issue: I'm not getting the ordered list type from the Word document.

JoshMayberry commented 3 years ago

I am also very interested in this. Perhaps if I was able to do something like this in the style map, that would work?

"p:ordered-list(1)[number-style='a, b, c, ...'] => ol.alphabetic-list > li:fresh",
"p:ordered-list(2)[number-style='a, b, c, ...'] => ul|ol > li > ol.alphabetic-list > li:fresh",
"p:ordered-list(3)[number-style='a, b, c, ...'] => ul|ol > li > ul|ol > li > ol.alphabetic-list > li:fresh",

This uses the Number Style that is assigned to the list type. [screenshot not preserved]

SamtSaber commented 2 years ago

Yes, adding this feature would really help.

mwfrost commented 4 months ago

I also have a use case where <ol> types in the output HTML should be consistent with the Word styling.