jgm / pandoc

Universal markup converter
https://pandoc.org

Support additional PDF engines #6126

Open Crissov opened 4 years ago

Crissov commented 4 years ago

Following up on #3906 and #3909, should other XML/HTML+CSS-to-PDF converters also be supported? I know of Vivliostyle, PDFreactor and Antenna House Formatter, but there may be more out there.

alerque commented 4 years ago

Not quite an identical situation, but in the same genre, I think: SILE also supports XML as an input format. I'm working on native Pandoc handlers for it using its other input syntax (see the discussion in #6087 and the Writer PR in #6088), but it can also be invoked as an XML→PDF rendering engine. As one of the authors of SILE I am undoubtedly heavily biased, but since the CSS in your equation above usually isn't part of the document anyway, I don't see any reason why this should be limited to HTML/XML+CSS→PDF; the same general arrangement could be used for HTML/XML+Lua→PDF. All we'd have to do is set up a class design for it (largely overlapping with my other work on supporting Pandoc-generated documents anyway), in much the same way that somebody has to inject a CSS file for the other converters.

Crissov commented 4 years ago

Sure, XML+XSL(T) processors also fall into this category, but I know next to nothing about them.

jgm commented 4 years ago

In principle we could do this but I'd want to see a compelling reason: what can you do with these that you can't do with the engines we already support?

Crissov commented 4 years ago

My thinking was that existing users of these products could then easily add Pandoc to their tool chain.

brainchild0 commented 4 years ago

@Crissov: Can you offer some details of a use case? Ultimately integration of any HTML to PDF converter into Pandoc seems like a convenience. Any existing user of such a product seems to be someone who has a source for HTML documents, which could be Pandoc, but ultimately the distinction is irrelevant. The value of Pandoc would be to create an HTML representation of a document, if doing so is part of a user's pipeline.
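To make the pipeline described above concrete, here is a minimal sketch in Python of the two-step arrangement: Pandoc writes a standalone HTML file, and a separate converter turns that file into a PDF. The converter name `my-html2pdf` and its `<input.html> <output.pdf>` argument order are placeholders assumed for illustration, not the interface of any particular product; the helper names are likewise hypothetical.

```python
import subprocess
from pathlib import Path


def pandoc_html_command(source: str, html_out: str) -> list[str]:
    """Build the pandoc call that writes a standalone HTML file."""
    return ["pandoc", source, "--standalone", "--to", "html", "--output", html_out]


def converter_command(converter: list[str], html_in: str, pdf_out: str) -> list[str]:
    """Append the input and output paths to the converter's base command.

    Assumption: the converter accepts `<input.html> <output.pdf>` as its
    last two arguments. Check the tool's own documentation before use.
    """
    return [*converter, html_in, pdf_out]


def build_pipeline(source: str, pdf_out: str, converter: list[str]) -> list[list[str]]:
    """Return the two commands, in order, without running them."""
    # The intermediate HTML file sits next to the requested PDF.
    html_out = str(Path(pdf_out).with_suffix(".html"))
    return [
        pandoc_html_command(source, html_out),
        converter_command(converter, html_out, pdf_out),
    ]


def run_pipeline(steps: list[list[str]]) -> None:
    """Run each command in turn, stopping at the first failure."""
    for step in steps:
        subprocess.run(step, check=True)


# Example (hypothetical converter name):
# run_pipeline(build_pipeline("paper.md", "paper.pdf", ["my-html2pdf"]))
```

Seen this way, built-in engine support would mostly save the user the second command and the intermediate file.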

tarleb commented 3 years ago

I'd also like to mention paged.js, which is fully open source and looks quite promising.

There's also https://print-css.rocks/tools, which has a (highly opinionated) overview listing HTML-to-PDF engines.

tarleb commented 3 years ago

As pointed out by @Delanii in #6540, there is also arara. It's an alternative to latexmk.