wkhtmltopdf / wkhtmltopdf

Convert HTML to PDF using Webkit (QtWebKit)
https://wkhtmltopdf.org
GNU Lesser General Public License v3.0

Any tab (\t) will be in the PDF file as a space #4542

Open flavianstef opened 4 years ago

flavianstef commented 4 years ago

Can you help with wkhtmltopdf? I'm trying to generate a PDF from an HTML template that contains TAB characters (\t).

My HTML had some tabs that I wanted to preserve, so I tried wrapping everything in <pre> and using "white-space: pre-wrap", but all tabs became spaces in the PDF file.

I tried to change them to to see if that would change anything, but it didn't.

PhilterPaper commented 4 years ago

How are these tabs treated in a browser (HTML display)? What's supposed to happen in a browser is that, outside of any <pre> area, every run of one or more whitespace characters (space, tab, newline, etc.) is collapsed to a single space (U+0020). Within a <pre>, results will vary, as there is no overall definition for where tab stops should be. In other words, a tab could become any number of spaces. wkHTMLtoPDF uses the WebKit engine to do most of its formatting, so I would expect it to treat tabs like a typical browser would.
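To make the collapsing rule above concrete, here is a minimal Python sketch of how text outside <pre> is normalized. The function name `collapse_whitespace` is hypothetical, and the regex only approximates what a browser does (real engines also trim whitespace at line edges and handle inline boundaries).

```python
import re

# HTML's ASCII whitespace characters: space, tab, LF, FF, CR.
_WS_RUN = re.compile(r"[ \t\n\f\r]+")


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace with one space (hypothetical helper).

    This approximates normal (non-<pre>) text handling only.
    """
    return _WS_RUN.sub(" ", text)


print(repr(collapse_whitespace("a\t\tb \n c")))  # 'a b c'
```

With this rule a tab outside <pre> can never survive as more than one space, which is why the question only becomes interesting inside <pre> or with a "white-space" value that preserves whitespace.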

flavianstef commented 4 years ago

Thank you for the answer, but wkHTMLtoPDF doesn't treat tabs like a typical browser would.

Here is the HTML output in Chrome (test HTML source code: https://codepen.io/flavians/pen/JjjgKRm): image

Here is the wkHTMLtoPDF output (PDF) content: image

PhilterPaper commented 4 years ago

I can see that Firefox, Chrome, and Edge all appear to expand tabs to 8-column tab stops when using <pre>, and a single space (or possibly nothing) without <pre>. It would appear that each tab is replaced by a single space in wkHTMLtoPDF (with <pre>), but I can't tell you if something is done to process the input stream before WebKit gets it, or if WebKit is guilty of deciding to treat tabs this way. Someone more familiar with the innards of wkHTMLtoPDF will have to speak to this issue.
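The difference between the two behaviors described above can be sketched in Python. This is only an illustration of the observed output, not wkhtmltopdf's or WebKit's actual code; both function names are hypothetical, and the 8-column default simply mirrors what the browsers appear to do.

```python
def expand_to_tab_stops(text: str, tab_size: int = 8) -> str:
    """Expand each tab to the next multiple of tab_size columns,
    as the browsers appear to do inside <pre>."""
    out = []
    col = 0
    for ch in text:
        if ch == "\t":
            pad = tab_size - (col % tab_size)  # spaces needed to reach the next stop
            out.append(" " * pad)
            col += pad
        elif ch == "\n":
            out.append(ch)
            col = 0  # a new line restarts the column count
        else:
            out.append(ch)
            col += 1
    return "".join(out)


def tabs_to_single_space(text: str) -> str:
    """Replace each tab with exactly one space, as the PDF output appears to do."""
    return text.replace("\t", " ")


line = "ab\tc"
print(repr(expand_to_tab_stops(line)))   # 'ab      c'
print(repr(tabs_to_single_space(line)))  # 'ab c'
```

As an untested assumption, pre-expanding tabs this way before handing the HTML to the converter might serve as a workaround, though it can only line up columns when the text is set in a monospaced font.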

I seem to recall another discussion within the last 5 or 6 months regarding possibly odd behavior of <pre> in wkHTMLtoPDF, in how strings of whitespace are collapsed.

What are you trying to accomplish when using tabs? If it's to align columns of data, perhaps you'd be better off going to table cells (with no rules between cells)? Tabs are really a leftover from the days of monospaced typewriters, and are an ugly wart in good typesetting.

ashkulz commented 4 years ago

I think #2728 was the earliest issue that I could find; it was due to the CSS using monospace with no explicit font being specified. Can you try specifying an explicit font?
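One way to try that suggestion is to generate a small test page that names a concrete font ahead of the generic monospace family. The Python sketch below is an assumption-laden illustration: `build_test_page` is a hypothetical helper, "Courier New" is just an example font that may not be installed on every system, and nothing here confirms that the change fixes the tab rendering.

```python
from html import escape


def build_test_page(body_text: str, font_name: str = "Courier New") -> str:
    """Return an HTML page whose <pre> uses an explicit font (hypothetical helper).

    font_name is an example value; substitute a font that is installed locally.
    """
    css = (
        "pre {"
        f' font-family: "{font_name}", monospace;'
        " white-space: pre-wrap;"
        " }"
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<style>{css}</style></head>"
        # escape() protects <, > and & but leaves tab characters untouched.
        f"<body><pre>{escape(body_text)}</pre></body></html>\n"
    )


page = build_test_page("name\tvalue\nfoo\t42")
print(page)
```

Saving the result to a file such as test.html and running `wkhtmltopdf test.html test.pdf` would then show whether the tabs are expanded once an explicit font is given.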