py-pdf / fpdf2

Simple PDF generation for Python
https://py-pdf.github.io/fpdf2/
GNU Lesser General Public License v3.0
1.08k stars 247 forks source link

Solution needed: Typographical Margins, Line Height, Leading #1219

Open gmischler opened 3 months ago

gmischler commented 3 months ago

Background

In typography, any vertical distances relating to running text are normally given as a factor relative to the font size. I don't remember ever encountering any layout software that didn't handle it that way. This quite obviously makes sense, since the font size can vary at any time, and the line height should just adapt to that.

Unfortunately, the authors of the original PHP FPDF weren't familiar with typographical conventions. That is why we still have the h parameter to cell() and multi_cell(), which expects an explicit height in document units. Since it is not possible to change the font size within those, that didn't really become apparent as a problem until now.

Note that the same requirement doesn't necessarily apply to horizontal distances, such as left and right margins or indents. Those are largely independent from the font size (except in extreme cases).

Current situation

When creating the text regions, I just did the natural thing and defined line_height as a factor relative to the font size (default: 1). This is exactly what HTML does, so there is no conversion needed from there.

Unfortunately, when tables were created, their line_height was defined in absolute document units. I didn't notice the significance of that at the time either. This will change in #1200, and adopt the text region semantics. It has to change - breaking backwards compatibility - because there is no other way to adapt the line height to the font size, since the user can't know in advance which size text will end up on which line.

Some time ago, TitleStyle() was created. This has since been changed into TextStyle() in #1211 to make it more generally useful. But since this class was originally used only in combination with multi_cell(), it has inherited the bad habit of using absolute values for t_margin and b_margin. And now that we're trying to use it in a broader context, that becomes a serious problem.

In the html.py module, various vertical margins used to be defined relative to the font size. Since both HTML and the text regions work that way, this is really the only thing that makes sense there. But now we tried to start using TextStyle() there in #1217, we ended up with a clash of the different concepts. The previous adaptable vertical margins have now turned into fixed sizes. The desired flexibility in formatting has thus pretty much turned into the opposite. Whenever a user changes the font size in a style, they must also change the vertical margins, or they'll get unexpected results. This can already be observed in some of the test files.

Unfortunately, TextStyle() also does not allow to leave the margins undefined (eg. None), which would mean "use system settings". But _new_paragraph() in html.py will add a top margin of the font size whenever the top margin given is "not true", because that is the default for HTML block elements. This makes it impossible to use a t_margin of 0 with write_html(). Another clash of concepts that isn't trivial to resolve.

And lastly, my own code uses document units for the top and bottom margins of Paragraph()s. This simplifies the code a bit, because we can apply the margins without looking at the text lines, but in context of this discussion maybe it wasn't the optimal decision either.

Conclusion

I don't think we should publish a next release while the system is in this confusing state. Since TitleStyle() has some legacy, we should probably re-establish it as it was, but TextStyle() will need some changes to really become as universally useful as it should be. Its vertical margins need to be defined relative to the font size, and there must be a way to set them to "undefined" (= "use system settings").

Other than that, I also don't have a complete concept ready to solve all the aspects mentioned here. The general approach should be to use relative vertical sizes wherever possible in a typographical context. How that manifests itself in various situations, and how we manage any necessary transitions is up for discussion.

P.S.: I have tried to announce and explain any plans of my own for significant changes to the internal workings in advance, as a precaution if I had overlooked some unintended consequences. Since the library has grown into quite some complexity, maybe we should make this an informal policy.

Lucas-C commented 3 months ago

Unfortunately, when tables were created, their line_height was defined in absolute document units. [...] It has to change - breaking backwards compatibility - because there is no other way to adapt the line height to the font size, since the user can't know in advance which size text will end up on which line.

I do not fully understand why breaking backward compatibility is mandatory there...

While I understand and approve the "rule of thumb" that vertical spacing must be proportional to font size, breaking backward compatibility is kind of a big deal and IMHO requires thorough consideration.

Your argument is that "the user can't know in advance which size text will end up on which line". So far I never had an issue with that. Could you provide some code snippet that demonstrate the problem at hand @gmischler, please?

In the html.py module, various vertical margins used to be defined relative to the font size. [...] The previous adaptable vertical margins have now turned into fixed sizes. The desired flexibility in formatting has thus pretty much turned into the opposite. Whenever a user changes the font size in a style, they must also change the vertical margins, or they'll get unexpected results.

Regarding the fpdf.html module, I think we should take care to follow to the HTML specification and current rendering in browsers as close as possible.

Considering the following code snippet:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width,initial-scale=1">
    <title>Testing margins with headings - fpdf2 issue 1219</title>
    <style>
    h2 { border: 1px solid red; }
    p  { border: 1px solid blue; }
    </style>
  </head>
  <body>
    <!-- Default h2 font-size is 1.5em = 24px -->
    <h2>Heading 1</h2>
    <p>Content paragraph 1</p>
    <h2 style="font-size: 2em">Heading 2</h2>
    <p>Content paragraph 2</p>
  </body>
</html>

Viewing this quick test page in modern browser reveals that:

_new_paragraph() in html.py will add a top margin of the font size whenever the top margin given is "not true", because that is the default for HTML block elements. This makes it impossible to use a t_margin of 0 with write_html(). Another clash of concepts that isn't trivial to resolve.

Unless I'm mistaken, this is due to the if not top_margin and not self.follows_heading: condition in _new_paragraph(), that I wanted to get rid of in PR https://github.com/py-pdf/fpdf2/pull/1217.

I agree that it would be a good idea to differentiate between t_margin set to None in tag_styles and a t_margin set to 0. Making this difference could be a first, short PR toward improving the situation.

I have tried to announce and explain any plans of my own for significant changes to the internal workings in advance, as a precaution if I had overlooked some unintended consequences. Since the library has grown into quite some complexity, maybe we should make this an informal policy.

I think you are right, and I will try to follow this policy as much as I can in the future.

I used lightweight architecture decision records at work in the past, and they are a good way to keep track of the reasons of the decisions made over years. It's basically the same as RFCs: Request for Comments.

But the format you adopted in https://github.com/py-pdf/fpdf2/discussions/339 @gmischler is perfectly fine to me.

Lucas-C commented 1 month ago

I don't think we should publish a next release while the system is in this confusing state. Since TitleStyle() has some legacy, we should probably re-establish it as it was, but TextStyle() will need some changes to really become as universally useful as it should be. Its vertical margins need to be defined relative to the font size, and there must be a way to set them to "undefined" (= "use system settings").

Do you still think that this issue should prevent a new release @gmischler?

I'm willing to take some time re-establish TitleStyle and try to have relative margins in TextStyle, if you can answer my questions above & are willing to review the final PR 🙂