sile-typesetter / sile

The SILE Typesetter — Simon’s Improved Layout Engine
https://sile-typesetter.org
MIT License

pandoc writer for SILE #413

Closed severak closed 7 months ago

severak commented 7 years ago

Hi, is there a Pandoc custom writer for SILE?

I have a workflow where the source document is Markdown and it's converted via Pandoc to EPUB, HTML and TeX (for PDF output).

However, TeX is a somewhat complicated system with too many traps along the way, so I am investigating a better way to generate PDF.

simoncozens commented 7 years ago

@alerque has one!

alerque commented 7 years ago

Hey @severak, yes I have something like that. I've been picking away at it as a branch to the Pandoc source that I hope to one day submit upstream. What's holding it back right now is that it's not really feature-complete. I started with the LaTeX writer as a template and started swapping out bits so that it produced SILE code. I've covered a lot of the basic data types (paragraphs, headings, lists, inline styles, blockquotes, footnotes, etc.), but there are still some I haven't tackled simply because my own book publishing operation hasn't run into them yet. When it hits a table, an image, or a few other data types and certain kinds of nesting, it still chokes (or outputs LaTeX code that makes SILE choke). Maybe having somebody else using it would be the push I need to go through the rest of the data types.

What does your input data look like? What platform are you running? Have you ever compiled Pandoc or are you using a prepackaged build? If the latter, what build system? I'll look into getting you set up.
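For readers unfamiliar with what a "writer" does, the approach described above (map each Pandoc data type to SILE markup, one type at a time) can be sketched in a few lines. This is only an illustration in Python working on Pandoc's JSON AST (`pandoc -t json`); it is not the Haskell writer from the branch mentioned here. The function names are hypothetical, and the choice of `\em`, `\strong`, and the book-class sectioning commands as targets is an assumption about the SILE class in use.

```python
import json

# Characters with special meaning in SILE's TeX-flavour input,
# each escaped here with a leading backslash.
SILE_SPECIALS = {"\\": "\\\\", "{": "\\{", "}": "\\}", "%": "\\%"}

# Assumed mapping from heading level to book-class commands.
HEADINGS = {1: "chapter", 2: "section", 3: "subsection"}


def escape(text):
    """Escape characters that SILE would otherwise interpret as markup."""
    return "".join(SILE_SPECIALS.get(ch, ch) for ch in text)


def inlines_to_sile(inlines):
    """Render a list of Pandoc inline nodes as SILE markup."""
    out = []
    for node in inlines:
        kind = node["t"]
        if kind == "Str":
            out.append(escape(node["c"]))
        elif kind in ("Space", "SoftBreak"):
            out.append(" ")
        elif kind == "Emph":
            out.append("\\em{" + inlines_to_sile(node["c"]) + "}")
        elif kind == "Strong":
            out.append("\\strong{" + inlines_to_sile(node["c"]) + "}")
        else:
            # Unhandled types fail loudly instead of emitting broken markup.
            raise NotImplementedError("inline type: " + kind)
    return "".join(out)


def blocks_to_sile(blocks):
    """Render a list of Pandoc block nodes, separated by blank lines."""
    out = []
    for node in blocks:
        kind = node["t"]
        if kind == "Para":
            out.append(inlines_to_sile(node["c"]))
        elif kind == "Header":
            level, _attr, inlines = node["c"]
            if level not in HEADINGS:
                raise NotImplementedError("heading level: " + str(level))
            out.append("\\" + HEADINGS[level] + "{" + inlines_to_sile(inlines) + "}")
        else:
            raise NotImplementedError("block type: " + kind)
    return "\n\n".join(out)


def document_to_sile(doc):
    """Wrap the rendered blocks in a SILE document environment."""
    body = blocks_to_sile(doc["blocks"])
    return "\\begin[class=book]{document}\n" + body + "\n\\end{document}\n"


def convert_json(text):
    """Convert the JSON text produced by `pandoc -t json` to SILE markup."""
    return document_to_sile(json.loads(text))
```

Tables, images, lists, and footnotes are deliberately left out, which mirrors the situation described above: any type without a mapping raises an error rather than falling through to output the target cannot read.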

severak commented 7 years ago

Hi @alerque

I have never compiled Pandoc from source; I have always used a prepackaged one. I am currently running on a Raspberry Pi (Raspbian distro).

I tried to write a Pandoc writer in Lua (it is possible to do so), but I never used it in the real world.

You can freely test with the text from my publishing project because it's open source. However, all the text is in Czech. See https://github.com/svita-cz/knihy

simoncozens commented 7 years ago

You could also try getting SILE to read in Markdown natively. Update to the latest HEAD, and try this:

\begin[papersize=a4,class=markdown]{document}
\begin{obeylines}
\include[src=pro-kukacku.md]
\end{obeylines}
\end{document}

simoncozens commented 7 years ago

(Obviously the output is not going to be perfect yet. File bugs and I'll fix them!)

alerque commented 7 years ago

@severak, besides the pull request I sent you a minute ago, as can be seen in this experimental branch I also started trying to use @simoncozens's way of rendering Markdown with SILE. Unfortunately something seems to be wrong with the Markdown parser. I think something is cross-wired with the assorted Lua UTF-8 libraries, but I'm not exactly sure.

Simon, any idea why I would be getting this?

Error detected: /usr/local/share/sile/lua-libraries/lunamark/reader/markdown.lua:463: attempt to call a nil value (field 'lower')

P.S. I'm also tinkering with the Pandoc → SILE → PDF route in this branch. I see I need to fix a couple of things in the Pandoc Writer though.

simoncozens commented 7 years ago

No idea! I don't have a reference to "lower" anywhere near that line in my copy of it. (Did you git pull? I just rebased the markdown processor with upstream.) In my fork, I've declared the utf8 module to be the One True Way of handling UTF8, and the upstream version now tries several different modules; if something is pulling in a different library it may be causing trouble.

alerque commented 7 years ago

I just rebased the markdown processor with upstream.

Oh hello, missed that. I was using a week-old SILE build.

alerque commented 7 years ago

@severak I got your document rendering in SILE using both methods: SILE's built-in Markdown parser and my Pandoc SILE writer. Here is a sample:

  1. Pandoc → SILE
  2. SILE
  3. Pandoc → ConTeXt

[image: selection_308]

Both ways are adjustable. The SILE built-in Markdown reader doesn't handle the line breaking right (I need to file an issue), but that's likely not a hard fix. I did more work on styling the Pandoc route, but that was just a matter of the time I had; the exact same thing could be done to the other one to get rid of the section numbering, etc. The Pandoc one also has a working TOC, but again that could be added to the other route.

In all honesty the Pandoc route is more powerful, but it's also a lot more complicated (at least until my Writer is contributed upstream), and if simple is what you're after, fixing up the built-in Markdown reader is probably the way to go.

I didn't build the title page yet, but Pandoc does pass variables that can be used to build it from the metadata. If going the built-in route, SILE can read the YAML file to get the metadata itself (I have code for that if you want it, but it's pretty simple).
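To give a rough idea of how little is involved in pulling title-page fields out of a metadata file, here is a minimal sketch. It is written in Python purely for illustration and is not the code referred to above, which is not shown in this thread. It assumes a flat file of `key: value` pairs; the function name and the sample keys are hypothetical, and anything beyond flat pairs (lists, nested maps, multi-line strings) needs a real YAML parser.

```python
def read_flat_metadata(text):
    """Parse flat `key: value` lines from a YAML-style metadata file.

    Document markers, comments, blank lines, and indented (nested or
    continuation) lines are skipped; this is not a general YAML parser.
    """
    meta = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#") or stripped in ("---", "..."):
            continue
        if line[0].isspace() or ":" not in stripped:
            continue  # nested structures are out of scope for this sketch
        key, _, value = stripped.partition(":")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        meta[key.strip()] = value
    return meta
```

The resulting dictionary could then feed whatever builds the title page, for example by looking up `title` and `author` and falling back to empty strings when a key is missing.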

severak commented 7 years ago

Nice to see my poems used as a sort of Lorem ipsum for SILE. I also have prose text in my other book (but that is a work in (really slow) progress).

Omikhleia commented 2 years ago

Just asking: has this path been considered further? https://pandoc.org/custom-writers.html @severak mentioned it above (it seems, but the Pandoc doc has changed), but we went in other directions (our own buggy markdown class with lunamark, and some experimental branch in Pandoc itself, if I understand correctly).

I just discovered it and it seems to me it would have some advantages:

Not sure yet what the actual disadvantages would be and why this hasn't been pushed forward.

alerque commented 2 years ago

@Omikhleia The tooling for bidirectional support via Lua did not exist as an option when I started down the Haskell writer path. There are several advantages/disadvantages to each approach. I ran into some reason I wanted to keep going with the native Haskell writer variant at least for my own use, but I've honestly forgotten what the issue was now.

By the way, the "own buggy markdown class with lunamark" is completely unrelated to the Pandoc fork that has a native SILE writer. The latter uses the pandoc package but not the markdown class. I do not use the Markdown class for anything at all as it is too buggy/incomplete. The Pandoc writer in my fork writes directly to SILE format and the pandoc package provides the extra functions for its particular output choices (especially making things like arbitrary inline and block classes usable).

It's possible the Pandoc Lua writer is worth exploring again, but personally I think having a native reader/writer would be more robust, especially as a pathway for direct Markdown→PDF conversion in the Pandoc that everybody has as opposed to needing extra tooling.

severak commented 2 years ago

Got notified so I would like to describe my current setup:

That short stories book got its own repo. It's still generated from Markdown source using Pandoc, but its output goes to ODT (LibreOffice Writer's native format), which is later manually tweaked in Writer (mainly images are added; I cannot figure out how to do it automatically) and converted to PDF.

I am not proud of it, but the result is usable and it's way more comfortable for me to edit text this way.

Omikhleia commented 2 years ago

It's possible the Pandoc Lua writer is worth exploring again, but personally I think having a native reader/writer would be more robust.

I agree on the conclusion, although the question is "when". In less than 15-20 min trying the Pandoc Lua writer, in an admittedly real quick-and-dirty way, I got almost everything I needed except tables, definition lists and math. It's true that I now had all the needed packages behind me (enumerations, bullets, quotations, figures with captions, and even styles to hook directly onto the custom-styles extensions, etc.). Math, I don't care about for now. Definition lists are easy to add later (and not that much needed for my book interests, but anyway, there's no real complexity here). Tables, I was just lazy in that quick attempt, but I have them, so it's doable (albeit in a bit more than 15-20 min). And those 15 min or so were more satisfying than my attempt at #1336:

[image]

Could be a short term solution, but it works.

Could give us ideas how to do it properly (e.g. mimic Pandoc's Lua API, so the code made for it would be usable directly when we get our own native processor).

Omikhleia commented 2 years ago

BTW Thanks @severak for your feedback above on your final setup. So if I got it right, you gave up on the SILE path and used LibreOffice eventually. I can understand. I do hope you'll have a chance to try SILE again one day, though ;)

Omikhleia commented 2 years ago

Following the promising "Pandoc custom writer in Lua" route I had started experimenting with a few months ago, here is at last a document which shows the point I was able to reach...

... basically, supporting nearly everything in Markdown (the good stuff starts at §3; skip my ramblings before that at your convenience...)

IMHO, it's a decent PoC. It is in a dedicated branch in my personal repo for now (It might be rebased and merged at some later point, but I'd have to clean up several quick "hacks" first).

severak commented 2 years ago

@Omikhleia I have to say that output looks great.

Omikhleia commented 7 months ago

Since #1616 was rejected almost 2 years ago...

Let's go back to the initial issue:

However, TeX is a somewhat complicated system with too many traps along the way, so I am investigating a better way to generate PDF.

This is exactly what markdown.sile achieves, without the need for a Pandoc writer (whether "custom" or not).

It doesn't mean that some people shouldn't push for some Pandoc writer to exist[^1], but that's a separate project outside of SILE core. So it's a "Not My Bug", in many ways, all the more so as it has gone unaddressed in any visible way since 2016.

[^1]: Virtually anyone can work on Pandoc itself, or implement a Pandoc "custom" writer (possibly using the more recent filter-like API from Pandoc rather than the older, now-deprecated API discussed above).