asciidoc-py / asciidoc-py2

Deprecated python2 implementation of AsciiDoc.py. See asciidoc-py/asciidoc-py for current work.
https://asciidoc.org/
GNU General Public License v2.0

Keep-together #108

pepa65 closed this issue 7 years ago

pepa65 commented 7 years ago

Feature request: some way to keep certain elements/paragraphs on the same page when generating a PDF. It could be by supporting the unbreakable option on open blocks, or some other way. This would be awesome!

elextr commented 7 years ago

http://asciidoc.org/userguide.html#X74

pepa65 commented 7 years ago

Does that table say that unbreakable is supported on all block elements for the docbook backend?? (If so, a comma is missing after block in column 3. But I would be really happy!)

elextr commented 7 years ago

Sorry, it's not missing a comma, and that's not a limitation of AsciiDoc: according to DocBook XSL: The Complete Guide, those are the only elements keep-together applies to.

pepa65 commented 7 years ago

Hmm... Maybe I should explain what I'm trying to achieve. I'm trying to make a pdf of songs, but I'd like to keep every song on the same page, even if there are more verses. What I have now:

=== 2 ข้ายินดีเมื่อเขากล่าวแก่ข้าว่า
....
[1] ข้ายินดี เมื่อเขากล่าวแก่ข้าว่า [x3]
ให้เราไป ยังนิเวศ พระเจ้าเถิด
[2] ร้องบทเพลง สาธุการแด่พระเจ้า  [x3]
ให้เราไป ยังนิเวศ พระเจ้าเถิด
....

=== 3 ฮาเลลูยาสรรเสริญ
....
ฮาเลลูยาสรรเสริญ ฮาเลลูยาถวายสาธุการ
ขอถวายเกียรติแด่พระเจ้า ผู้ทรงอัศจรรย์
และบริสุทธิ์ ทรงประทานพระบุตรองค์เดียว
มาเป็นผู้ไถ่ ฮาเลลูยาสรรเสริญ ฮาเลลู
ถวายสาธุการ ฮาเลลูยาขอยกย่องพระนาม
พระสิริ คำสรรเสริญ แด่จอมราชา
....

=== 4 มานี่เป็นเวลานมัสการ
....
มา นี่เป็นเวลานมัสการ
มา นี่เป็นเวลาถวายดวงใจ
มา นมัสการดั่งที่เป็น
มา ต่อหน้าพระองค์ดั่งที่เราเป็น มา

สักวันทุกลิ้นจะยอมรับว่าทรงเป็นพระเจ้า
สักวันทุกคนจะก้มกราบลง
แต่ทรัพย์สมบัติล้ำค่าคงเป็นของบรรดา
ผู้เลือกและรับพระองค์ [x2]

มา นี่เป็นเวลานมัสการ
มา นี่เป็นเวลาถวายดวงใจ
มา นมัสการดั่งที่เป็น
มา ต่อหน้าพระองค์ดั่งที่เราเป็น มา มา มา
....

I'd like to keep the contents of each literal block together on one page when making a PDF.

elextr commented 7 years ago

You are applying [unbreakable] to open blocks, and as noted above, unbreakable is not supported on those elements because the DocBook toolchain simply does not process the resulting <?dbfo keep-together="always"?> processing instruction on them. So unbreakable is ignored on open blocks.

Asciidoc produces Docbook, which is a content description language, not a presentation description language. So Asciidoc is a content description language and has limited presentation control, because Docbook has limited presentation control.

All you really have at your disposal in normal text is hard page breaks, which may help you to start each entry on a new page, but it won't stop overlong entries continuing on the next page.

Instead of pure text you could try abusing tables to control layout more closely; they accept [unbreakable].
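A minimal sketch of that table workaround, assuming asciidoc-py's options="unbreakable" table attribute and the literal (l) cell style. The file name keep-together.asciidoc and the titles "Songbook" and "Songs" are placeholders, the frame and grid settings are only there to hide the table borders, and whether a given PDF backend honours the hint has not been verified here.

```shell
# Sketch: wrap one song in a single-cell table marked unbreakable.
# File name and the two enclosing titles are placeholders.
cat > keep-together.asciidoc <<'EOF'
= Songbook

== Songs

=== 2 ข้ายินดีเมื่อเขากล่าวแก่ข้าว่า

[options="unbreakable",frame="none",grid="none"]
|====
l|[1] ข้ายินดี เมื่อเขากล่าวแก่ข้าว่า [x3]
ให้เราไป ยังนิเวศ พระเจ้าเถิด
[2] ร้องบทเพลง สาธุการแด่พระเจ้า [x3]
ให้เราไป ยังนิเวศ พระเจ้าเถิด
|====
EOF

echo "wrote keep-together.asciidoc"
```

The literal cell style keeps the line breaks, much as the original literal blocks do. The resulting file can then be passed to a2x like any other source file.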

pepa65 commented 7 years ago

Thanks for your help and explanations. I considered using tables. None of the songs is too long for one page, but sometimes more than one fits on a page. I probably should move on to LaTeX or some such, but I prefer less complex solutions.

elextr commented 7 years ago

Yes, LaTeX is a more presentation-oriented markup, and as a result is more complex.

pepa65 commented 7 years ago

If I feed the above fragment to a2x it doesn't even work for some reason. If I replace the Thai with Latin characters it does.

elextr commented 7 years ago

Docbook requires all enclosing levels to exist in the document.
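The fragment posted earlier starts at level 2 (===) with no document title (=) or level-1 section (==) above it. A rough sketch of supplying those enclosing levels follows; fragment.asciidoc, "Songbook" and "Songs" are placeholder names, not anything from the original document.

```shell
# Sketch: give a level-2 fragment the enclosing levels it is missing.
# "fragment.asciidoc", "Songbook" and "Songs" are placeholder names.
cat > fragment.asciidoc <<'EOF'
=== 2 ข้ายินดีเมื่อเขากล่าวแก่ข้าว่า
....
[1] ข้ายินดี เมื่อเขากล่าวแก่ข้าว่า [x3]
ให้เราไป ยังนิเวศ พระเจ้าเถิด
....
EOF

# Prepend a document title (=) and a level-1 section (==) so that the
# level-2 sections (===) have parents to nest under.
{
  printf '= Songbook\n\n== Songs\n\n'
  cat fragment.asciidoc
} > example.asciidoc

echo "wrote example.asciidoc"
```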

elextr commented 7 years ago

PS which PDF toolchain are you using?

pepa65 commented 7 years ago

I tried it without unbreakable. It seems DocBook doesn't even support Unicode. I get an openjade "non SGML character" error just into the second song (of 637...).

elextr commented 7 years ago

Again which docbook toolchain are you using?

pepa65 commented 7 years ago

Standard Ubuntu 16.04 installation.

elextr commented 7 years ago

No, I mean which DocBook toolchain. DocBook is a markup standard, it's not an implementation.

There are a number of implementations that convert DocBook to PDF. The a2x script supports dblatex, which does the conversion via LaTeX, and FOP, which uses XSL-FO to add formatting and then converts the result to PDF.

Neither of those uses Openjade to my knowledge, and they do support Unicode.
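As a sketch of the two a2x routes just described: a2x uses dblatex for PDF output unless --fop is given. The source file name is the one used elsewhere in this thread, and the renamed output files are an assumption added here so the second run does not overwrite the first.

```shell
# Sketch: pick the DocBook-to-PDF backend explicitly when calling a2x.
src=example.asciidoc
pdf="${src%.asciidoc}.pdf"   # a2x writes the PDF next to the source file

if command -v a2x >/dev/null 2>&1 && [ -f "$src" ]; then
  # Default PDF route: DocBook -> dblatex -> LaTeX -> PDF.
  a2x -f pdf "$src" && mv "$pdf" "dblatex-$pdf" || echo "dblatex route failed" >&2
  # Alternative route: DocBook -> XSL-FO -> FOP -> PDF.
  a2x -f pdf --fop "$src" && mv "$pdf" "fop-$pdf" || echo "FOP route failed" >&2
else
  echo "a2x or $src not available; nothing converted" >&2
fi
```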

pepa65 commented 7 years ago

I did:

asciidoc -b docbook example.asciidoc
docbook2pdf example.xml

I got errors when trying a2x (a2x example.asciidoc). While a2x is executing

"dblatex" -t pdf -p "/etc/asciidoc/dblatex/asciidoc-dblatex.xsl" -s "/etc/asciidoc/dblatex/asciidoc-dblatex.sty" -V "/home/pp/example.xml"

I first get oodles of "Missing character" messages. Then a number of the xsltproc steps seem to go OK, and then pdflatex fails with "Missing \endcsname inserted", followed by "Package inputenc Error: Keyboard character used is undefined" and more missing characters inserted, then "Unexpected error occurred", and a2x exits with status 1.

I guess it's not really an asciidoc issue, more what happens after... But a2x definitely doesn't work on Thai text.

Oh, and a2x --fop example.xml doesn't give any errors, but all Thai script is replaced by # symbols in the resulting pdf.

elextr commented 7 years ago

Well, that's a fault in the particular converter docbook2pdf, not in DocBook itself. DocBook is XML, so it's Unicode. You could try one of the other ones I mentioned.

See http://www.sagehill.net/docbookxsl/FOprocessors.html

pepa65 commented 7 years ago

You mentioned a2x, I tried that with and without --fop, see above.

elextr commented 7 years ago

I took the text of your example, added the missing level sections and ran it through a2x -f pdf --fop with no problems. Of course I don't have the requisite fonts for your characters, and it gives errors about that and substitutes # as you found, but it still renders the page with no complaints about the characters (and shows them in the error messages). For how to configure fonts see the sagehill book I linked above.
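Before configuring fonts for the converter, it may help to check whether any installed font covers Thai at all. This is a rough sketch that assumes fontconfig's fc-list is available; it only reports what is installed system-wide, and FOP may still need its own font configuration as described in the linked guide.

```shell
# Sketch: list installed fonts that advertise Thai coverage (assumes fontconfig).
if command -v fc-list >/dev/null 2>&1; then
  thai_fonts=$(fc-list :lang=th family | sort -u)
  if [ -n "$thai_fonts" ]; then
    status=found
    printf 'Fonts with Thai coverage:\n%s\n' "$thai_fonts"
  else
    status=none
    echo "No installed font advertises Thai coverage" >&2
  fi
else
  status=unknown
  echo "fc-list not found; cannot query fontconfig" >&2
fi
```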

elextr commented 7 years ago

AsciiDoc is somewhat like DocBook: it's a standard for a markup language, but it also has an original implementation. However, if you are starting out using AsciiDoc markup you might want to look at the newer implementation, Asciidoctor, which has its own PDF converter in development.

Unsurprisingly, the same thing happens with the asciidoctor-pdf toolchain: your characters are replaced by spaces since I don't have the fonts. Its documentation says: "Asciidoctor has no challenge working with Unicode. In fact, it prefers Unicode and considers the whole range. However, once you convert to PDF, you have to meet the font requirements of PDF in order to preserve Unicode characters. There’s nothing Asciidoctor can do to convince PDF to work without the right fonts in play." That sentiment, I think, applies to all converters.

At least the asciidoctor-pdf theming guide shows how to add fonts.

pepa65 commented 7 years ago

I guess I was facing a font issue then... Thanks a lot for the help. I'm currently looking into SILE as an "easier LaTeX".

elextr commented 7 years ago

You will probably need the fonts for LaTeX too :)

pepa65 commented 7 years ago

In SILE one can just specify any installed font.