jgm / pandoc

Universal markup converter
https://pandoc.org

Parsing custom styles in docbook #3657

Open marcban opened 7 years ago

marcban commented 7 years ago

DocBook has a way of specifying custom styles with the role attribute on the phrase element, but pandoc doesn't parse them.

So pandoc is able to generate custom styles in docx, odt, html... formats, but only from Markdown or from an HTML <span> element.

From the code of the DocBook parser, I imagine it would take just a line or two to generate a Span element with a custom-style attribute, but I'm not a Haskell developer...

I think this would also solve #1235.

mb21 commented 7 years ago

see also the issue of extending custom styles to ODT/ICML: #2106...

jgm commented 7 years ago

The code changes would indeed be simple. It's a matter of deciding on bigger architectural issues. Do we assume that role attributes should be treated as style names in docx and odt output? What problems might that cause? How should they be handled in other output formats? What attribute keyword should be used? (role, data-role, custom-style?)

Another option would be to treat role as a class. This wouldn't have an automatic interpretation in docx or odt, but at least it would be possible for a filter to do something with it.

marcban commented 7 years ago

Some answers:

Ideally, the parser would also add a custom-style on specific docbook elements like guibutton, filename so that it would be easy to style them in the output...

marcban commented 7 years ago

A clarification: my DocBook files are generated by the AsciiDoc Python tool, which already generates phrases with a role for quoted text, see http://www.methods.co.nz/asciidoc/userguide.html#X51.

thomas-ferchau commented 1 month ago

I also miss this.

jgm commented 1 month ago

The role on phrase is parsed as a class, currently:

% pandoc -f docbook -t native
<simpara role="green">green</simpara>
<simpara><phrase role="green">another green</phrase></simpara>
<simpara><phrase role="red">red</phrase></simpara>
<simpara><emphasis role="marked">marked</emphasis></simpara>
[ Para [ Str "green" ]
, Para
    [ Span
        ( "" , [ "green" ] , [] )
        [ Str "another" , Space , Str "green" ]
    ]
, Para [ Span ( "" , [ "red" ] , [] ) [ Str "red" ] ]
, Para [ Emph [ Str "marked" ] ]
]

You could use a Lua filter that converts classes on Span into custom-style attributes.

function Span(el)
  -- use the first class (the DocBook role) as the custom style name
  if el.classes[1] then
    el.attributes['custom-style'] = el.classes[1]
    return el
  end
end

(untested)

The role on simpara currently doesn't do anything; there is no slot for attributes in a pandoc Para element.
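
For anyone who would rather not write Lua, roughly the same idea can be sketched as a JSON filter in Python using only the standard library. This is a sketch, not something from pandoc itself: it assumes the pandoc JSON encoding of a Span as {"t": "Span", "c": [[id, classes, attributes], inlines]}, and the names add_custom_style and run_filter are made up for illustration.

```python
import json
import sys


def add_custom_style(node):
    """Walk a pandoc JSON AST in place; copy each Span's first class
    into a custom-style attribute unless one is already present."""
    if isinstance(node, list):
        for item in node:
            add_custom_style(item)
    elif isinstance(node, dict):
        if node.get("t") == "Span":
            # Assumed layout: [[id, classes, attributes], inlines]
            (_ident, classes, attributes), _inlines = node["c"]
            keys = [key for key, _value in attributes]
            if classes and "custom-style" not in keys:
                attributes.append(["custom-style", classes[0]])
        # Recurse into children (including the inlines of a Span).
        for value in node.values():
            add_custom_style(value)
    return node


def run_filter(stream_in, stream_out):
    """Hypothetical helper: read a JSON document, transform it, write it back."""
    doc = json.load(stream_in)
    json.dump(add_custom_style(doc), stream_out)
```

To use it with pandoc's --filter option, the script would end with a call to run_filter(sys.stdin, sys.stdout), since pandoc passes the JSON document to a filter on stdin and reads the result from stdout. Like the Lua version, it only touches Span elements, so the role on simpara is still lost.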