PhilterPaper / Perl-PDF-Builder

Extended version of the popular PDF::API2 Perl-based PDF library for creating, reading, and modifying PDF documents
https://www.catskilltech.com/FreeSW/product/PDF%2DBuilder/title/PDF%3A%3ABuilder/freeSW_full

Underlining issues #192

Open PhilterPaper opened 1 year ago

PhilterPaper commented 1 year ago

Two things to be done regarding underlining with column() (markup):

  1. See underline.pl (attached).

The leading (baseline-to-baseline distance) seems to vary slightly from line to line, depending on how much text there is in a line. The font vertical extents query should be for the overall font, and not for any particular words, so I don't think it has to do with descenders. A puzzling anomaly.

  2. See underline.html (attached).

All the browsers I tried do a good job of interrupting an underline so that it does not overwrite a descender (i.e., they leave sufficient horizontal gap(s) to clear the text). I don't see any way to do this automatically in PDF, but maybe I just haven't found it yet. Currently both underlines (in text() and in column()) simply draw a line from one side of the word to the other, colliding with descenders. The line thickness and spacing below the baseline are proportional to the font size.

I suppose I could include information in Font Manager for all characters with descenders (including Q), recording where the underline needs to be interrupted. It would be nice to do this automatically, but I don't know where ink has been applied when writing a glyph (any swashes, etc. in the font would also need handling). The only thing I can think of at the moment is to outline the glyph in white (background color) before writing the glyph itself, although there is the danger of interfering with adjacent characters.
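As a rough illustration of the per-glyph table idea, here is a minimal Python sketch (library-neutral, not PDF::Builder code). The `DESCENDER_ZONES` table, its numbers, the `advances` mapping, and the `pad` parameter are all hypothetical; real values would have to come from each font.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]

# HYPOTHETICAL per-glyph data: horizontal extent (in 1/1000 em, measured from
# the glyph origin) where ink crosses the underline. Real values are per font.
DESCENDER_ZONES: Dict[str, Interval] = {
    "g": (40, 480),
    "j": (-40, 200),
    "p": (60, 180),
    "q": (380, 500),
    "y": (20, 300),
    "Q": (350, 720),
}


def underline_segments(
    text: str,
    advances: Dict[str, float],
    font_size: float,
    pad: float = 0.5,
) -> List[Interval]:
    """Return the (x_start, x_end) pieces of an underline, with gaps left
    around glyphs listed in DESCENDER_ZONES. x is relative to the text start;
    advances are glyph widths in 1/1000 em; pad is extra clearance in points."""
    scale = font_size / 1000.0
    segments: List[Interval] = []
    seg_start = 0.0  # where the current unbroken piece of underline begins
    pen_x = 0.0      # origin of the glyph being examined
    for ch in text:
        zone = DESCENDER_ZONES.get(ch)
        if zone is not None:
            gap_lo = pen_x + zone[0] * scale - pad
            gap_hi = pen_x + zone[1] * scale + pad
            if gap_lo > seg_start:
                segments.append((seg_start, gap_lo))
            seg_start = max(seg_start, gap_hi)
        pen_x += advances[ch] * scale
    if pen_x > seg_start:
        segments.append((seg_start, pen_x))
    return segments
```

Each returned interval would be stroked as a separate short line. Kerning and shaping adjustments are ignored here, so the pen positions are only approximate.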

Strike-throughs (line-throughs) and possibly overlines would not need to worry about this.

Attachments: underline.pl.txt, underline.html.txt

PhilterPaper commented 1 year ago

A simpler approach would be to skip the underline entirely under any character that is known to have a descender (from a fixed list). The biggest problem with that is there are more than just g, j, p, q, y, and Q -- there are lots of accented letters in the Latin alphabet, as well as various ligatures. That's not even thinking about non-Latin alphabets! Any given font could include other glyphs that don't necessarily have Unicode points (but show up due to HarfBuzz::Shaper use). Perhaps HarfBuzz includes some sort of information about descenders in a font?
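For the accented Latin letters at least, a fixed list need not enumerate every variant. The following Python sketch uses Unicode canonical decomposition to reduce a character to its base letter, and also flags combining marks that attach below the base (cedilla, ogonek, dot below). The function name and the base set are assumptions for illustration. It does not help with ligatures, non-Latin scripts, or glyphs without code points.

```python
import unicodedata

# Base letters assumed to have descenders in a typical Latin text font.
BASE_DESCENDERS = set("gjpqyQ")

# Canonical combining classes for marks placed below the base letter:
# 202 = attached below (cedilla, ogonek), 220 = below (dot below, etc.).
BELOW_CLASSES = {202, 220}


def may_have_descender(ch: str) -> bool:
    """Guess whether a single character reaches below the baseline.

    This is a heuristic on Unicode data only; the actual glyph shape in a
    given font may differ (e.g., an italic 'f' or a swash form).
    """
    decomposed = unicodedata.normalize("NFD", ch)
    if decomposed[0] in BASE_DESCENDERS:
        return True
    return any(unicodedata.combining(c) in BELOW_CLASSES for c in decomposed[1:])
```

For example, "ý" decomposes to "y" plus a combining acute and is flagged through its base letter, while "ç" is flagged through the cedilla.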

Anyway, much thought needs to go into this, as every font will have a different set of glyphs, and one font design will vary from another (e.g., where the descender is). Unfortunately, PDF::Builder has no access to exactly where on the paper ink has been placed -- the Reader's rendering engine handles that (presumably also used for browsers). Also, some Readers may support "Rich Text" with HTML-like markup, but apparently not all Readers do. Anyone with knowledge in this area (availability of descender information, or built-in underlining capability) is welcome to chime in.

PhilterPaper commented 1 year ago

One possibility would be to first draw the underline, then draw the text in the background color (in both the fill and a thick outline to blot out part of the underline), and finally draw the text in the normal foreground color. This ought to leave gaps around descenders, with a minor risk of interfering with other characters or images. I'm not sure, though, that this will work with a transparent background (the default), such as over an image.
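The three passes described above could be expressed in raw PDF content-stream operators roughly as follows. This Python sketch only builds the operator strings; it is not how PDF::Builder emits them. The font resource name `F1`, the thickness, offset, and halo ratios, and the function names are all assumptions. Text rendering mode 2 (`2 Tr`) fills and then strokes the glyphs, which is what produces the thickened background-colored halo.

```python
from typing import List, Tuple

RGB = Tuple[float, float, float]


def pdf_string(text: str) -> str:
    """Wrap text as a PDF literal string, escaping backslash and parentheses."""
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    return f"({escaped})"


def knockout_underline_ops(
    text: str, x: float, y: float, width: float, font_size: float,
    fg: RGB = (0, 0, 0), bg: RGB = (1, 1, 1), font_name: str = "F1",
) -> List[str]:
    """Operators for: underline, then text with a background halo, then text."""
    thickness = 0.06 * font_size  # ASSUMED ratio, not PDF::Builder's value
    offset = 0.12 * font_size     # ASSUMED distance below the baseline
    halo = 0.10 * font_size       # ASSUMED stroke width of the knockout pass
    fg_s = " ".join(f"{c:g}" for c in fg)
    bg_s = " ".join(f"{c:g}" for c in bg)
    show = f"{pdf_string(text)} Tj"
    return [
        "q",
        # Pass 1: the underline itself, in the foreground color.
        f"{fg_s} RG", f"{thickness:g} w",
        f"{x:g} {y - offset:g} m", f"{x + width:g} {y - offset:g} l", "S",
        # Pass 2: glyphs filled and thickly stroked in the background color.
        "BT", f"/{font_name} {font_size:g} Tf", f"{x:g} {y:g} Td",
        f"{bg_s} rg", f"{bg_s} RG", f"{halo:g} w", "1 j", "2 Tr", show, "ET",
        # Pass 3: glyphs drawn normally (fill only) in the foreground color.
        "BT", f"/{font_name} {font_size:g} Tf", f"{x:g} {y:g} Td",
        f"{fg_s} rg", "0 Tr", show, "ET",
        "Q",
    ]
```

Joining the list with newlines gives a fragment that could be placed in a page content stream whose resources define the font. As noted above, this only makes sense over a known solid background color.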

Is there any way to draw the glyph outlines for the text (to be underlined) and capture the strokes to find possible intersections with an underline? Then it would be reasonably trivial to put gaps in the underline itself. It's possible that filled areas (of glyph strokes) will be missed, but since the underline is normally the same color as the text, that may not be a problem, especially if the underline stroke is drawn first. I don't see a problem with an underline showing up within a closed bowl (e.g., the bottom loop of a "g" in some fonts). Exceptionally thick underline strokes may need some beveling to avoid collisions with slanted stroke outlines.
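If glyph outlines were available as flattened polygons (an assumption; curves would first have to be converted to line segments), finding the intersections with the underline is straightforward interval arithmetic. The Python sketch below clips each outline edge to the horizontal band occupied by the underline and merges the resulting x-ranges into gaps. All names are hypothetical. Only outline edges are examined, so the filled interior of a wide stroke is not covered, matching the caveat above.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]
Interval = Tuple[float, float]


def edge_x_range_in_band(
    p0: Point, p1: Point, y_lo: float, y_hi: float
) -> Optional[Interval]:
    """x-extent of the part of edge p0-p1 lying within y_lo <= y <= y_hi."""
    (x0, y0), (x1, y1) = p0, p1
    if max(y0, y1) < y_lo or min(y0, y1) > y_hi:
        return None  # edge lies entirely above or below the band
    if y0 == y1:
        return (min(x0, x1), max(x0, x1))  # horizontal edge inside the band
    # Parameters (0..1 along the edge) where it crosses the band limits.
    t_a = (y_lo - y0) / (y1 - y0)
    t_b = (y_hi - y0) / (y1 - y0)
    t_min = max(0.0, min(t_a, t_b))
    t_max = min(1.0, max(t_a, t_b))
    xa = x0 + t_min * (x1 - x0)
    xb = x0 + t_max * (x1 - x0)
    return (min(xa, xb), max(xa, xb))


def underline_gaps(
    contours: Sequence[Sequence[Point]], y_lo: float, y_hi: float,
    pad: float = 0.0,
) -> List[Interval]:
    """Merged x-intervals where closed outline contours cross the band."""
    ranges: List[Interval] = []
    for contour in contours:
        n = len(contour)
        for i in range(n):
            r = edge_x_range_in_band(contour[i], contour[(i + 1) % n], y_lo, y_hi)
            if r is not None:
                ranges.append((r[0] - pad, r[1] + pad))
    ranges.sort()
    merged: List[Interval] = []
    for lo, hi in ranges:
        if merged and lo <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged
```

A generous `pad` makes the two edges of a single stem merge into one gap, which is probably the desired result for a descender.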

An overline might be handled in a similar manner (avoid intersecting with a character stroke), but there's probably no reason to worry about a line-through (strikethrough) -- the whole idea is to collide with the glyphs.

PhilterPaper commented 1 year ago

Another approach might be to use the glyph outline as a clipping path and somehow "thicken" it to prevent a collision between the glyph descender and the underline (clip away the underline stroke, drawn after the text). That might not be any improvement over drawing thickened strokes in the background color and then drawing the glyph normally (with the underline drawn first). However, the clipping approach would not risk one glyph encroaching on another, or give poor quality if the background is anything but a solid color (actually, it's usually transparent, which can't be drawn over an already-drawn underline).

PhilterPaper commented 9 months ago

Investigate

  1. Annotation = Underline
  2. rich text CSS text-decoration: underline
  3. TextDecorationType = Underline (also Overline, LineThrough) in Inline Structure Element (Tagged Content for Accessibility)

These would be alternatives to trying to draw a graphical underline over/under the text. Of course, we may have to settle for fixed line weight, same color as the text, and same spacing.

PhilterPaper commented 6 months ago

Another thing to consider would be to first draw the underline (either in graphics or escaped text object) in a somewhat lighter shade than the text color (add some white to the text color), and then just write the normal text over it. It should then still be visible as an underline without merging with the glyphs and making them hard to read. Similar things could be done if it's not a white (transparent) background, just to get some contrast. Might need to thicken the underline a little to make it about as "gray" as the text, while still seeing the text as "over" the underline.

Regarding line-throughs (strike-throughs), the idea is to obliterate the text so that it is not easily readable. Nevertheless, consideration could be made to make a line-through just slightly lighter, to be more consistent with an underline. Same thing for overline, which may intercept a few letter ascenders and accent marks, which you don't want obliterated (so probably make the same color as an underline). Perhaps all three lines could be the same color and thickness?

For non-white/transparent backgrounds, perhaps the color could be 10 to 20% "of the way" from background to foreground color. For a transparent background, calculate using white. At the moment, column() doesn't give background color selection, but eventually it will. An alternative would be to add CSS to set the underline color*, possibly as an override to the default selection. For other non-column calls, you can select the underline color.
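The "of the way from background to foreground" calculation is a linear interpolation per color channel. A minimal Python sketch, assuming RGB components in the range 0 to 1 and treating a transparent background (`None`) as white, might look like this. The function name and the default fraction of 0.15 (the middle of the 10 to 20% range) are illustrative choices.

```python
from typing import Optional, Tuple

RGB = Tuple[float, float, float]

WHITE: RGB = (1.0, 1.0, 1.0)


def underline_color(fg: RGB, bg: Optional[RGB] = None, t: float = 0.15) -> RGB:
    """Color a fraction t of the way from background bg to foreground fg.

    t = 0 gives the background color, t = 1 the text color. A transparent
    background (bg is None) is calculated as if it were white.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must be between 0 and 1")
    if bg is None:
        bg = WHITE
    r, g, b = (round(b_c + t * (f_c - b_c), 4) for b_c, f_c in zip(bg, fg))
    return (r, g, b)
```

With black text on a transparent background and the default fraction, each channel comes out at 0.85. Whether that is dark enough to read as an underline would need to be judged on real output.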

* see several posts in #215 concerning existing CSS to set underline properties. This includes squiggly lines and color. Note that annotation styles (see previous post) do NOT avoid descender collision -- see examples/040_annotation output. Therefore it is likely that underlines/overlines will have to be drawn as graphics anyway.