mate-desktop / atril

A document viewer for MATE
http://www.mate-desktop.org
GNU General Public License v2.0

Some text in PDFs cut off at the top #610

Closed mirabilos closed 2 months ago

mirabilos commented 2 months ago

Sorry for leaving the template, but this is more a request for help for debugging, as I’m slightly convinced that this might be a bug in Mu͒seScore and/or Qt5 (possibly in addition to one in atril and printers or PDF to printer language converters which may use the same underlying components).

Beginning situation: I have a musical score file, in which the top of the treble clefs 𝄞 is cut off. As the last person to touch the font used to draw these notation signs, I got a bugreport about this. I had no idea, as it worked for me… in mupdf and okular.

At the same time, I noticed that, when my gf prints sheet music for me, some of the copyright and hyperlink text got cut off as well. I blamed the paper feeder for this initially.

Recently, someone found that the treble clef gets cut off only in some software, atril among them. So I installed it and, voilà, I can reproduce the effect locally with both the clef and the font I use for the copyright info and hyperlinks, but not with the normal text/info and lyrics font, and only in atril, and not when I use e.g. LibreOffice with that font to make a PDF:

screenshot of test PDF in `atril`

In this example, we see the clef cut off (left side, about middle) as well as the top of two text blocks cut off. The upper one is the font I normally used for this, OTF/CFF, unhinted. I put a small border with no padding around it to debug even further. The lower one is the font converted to TTF and autohinted or unhinted (doesn’t matter), and with metrics slightly adjusted following this howto as the reporter said they could work around this by playing with the metrics in the notational font. (Changing the text font to debug this is easier as the notational font is baked into the software.)

Funnily enough, even though I slightly increased line spacing in the lower font, Okular shows Mu͒seScore shrank the box surrounding the text ever so slightly (which is why I suspect a bug or two there):

Screenshot in Okular

I’ve looked at this with qpdf in QDF format and with pdf.js’ debugging modes, and cannot find whiteouts or anything that explain where in the PDF the issue is (hoping I could backtrack from there), so it probably is an issue with the way atril renders the embedded subset fonts and others don’t (which is where we get to the printer thing, though my gf needs yet to print the test page for me as the printer is in the office). PDF attached (in PKZIP container)

Help to track down the cutting-off issue welcome, as I’m at the end of not just the road but even the dirt track I paved following it further here. Meanwhile (at a more normal time of day) I’ll try to hunt down the rendering code Mu͒seScore uses here, though I’m fairly certain it’s just a QPainter or something.

MATE general version

Not running MATE at all, just evilwm.

Package version

ii  atril          1.24.0-1     amd64        MATE document viewer

Linux Distribution

Debian bullseye/amd64

Link to bugreport of your Distribution (requirement)

None, since this is a debugging request more than a bugreport, so a request for upstream to help. (Maybe there is also a bug in atril or the libraries it uses, maybe not.)

cwendling commented 2 months ago

I don't know much of anything about this but:

More interestingly: try opening it in Inkscape choosing the Poppler import. You'll see the same clipping issue, but you'll also be able to see that the G clef shape is inside a clip area. If you extract the clef from the clip, it renders fine.

Steps in Inkscape:

So… I have no idea what is the root cause, is it the PDF actually generated with incorrect clipping (and only some rendering libraries honor it properly), or is Poppler/Cairo miscalculating the clip area? Or does the font have incorrect extents or clipping or something that Poppler/Cairo try to enforce while some other just let it "bleed out"? I guess you'll have to figure this part out :slightly_smiling_face:

HTH

mirabilos commented 2 months ago

Thanks, this helps a lot!

Checking whether this is clipped somehow was the reason I used qpdf to look at it in QDF form, but my PDF-fu is not good enough to detect it there. I hadn’t seen that Inkscape also shows the clipping.

And indeed. Right-click, “Release Clip”, and painting the clip red shows:

Inkscape screenshot part

Or does the font have incorrect extents or clipping or something

Definitely not, after looking at it in exhaustive detail in FontForge, the FontForge SFD file format and a ttx dump of the TTF.

This confirms my rough guess that Mu͒seScore and/or Qt5 have a bug somewhere there that clips the texts drawn. Now I’ll “just” have to find that… (to both fix that and, maybe, find a way to hack the fonts to work around it, for older versions) wish me luck ;-)

Now at least I have a tool with which I can inspect and measure the clipping.

I’m closing the issue here, as we’ve as good as confirmed that the bug is exclusively in the PDF producer.

cwendling commented 2 months ago

This confirms my rough guess that Mu͒seScore and/or Qt5 have a bug somewhere there that clips the texts drawn. Now I’ll “just” have to find that… (to both fix that and, maybe, find a way to hack the fonts to work around it, for older versions) wish me luck ;-)

"just" indeed :slightly_smiling_face: Good luck :wink:

I’m closing the issue here, as we’ve as good as confirmed that the bug is exclusively in the PDF producer.

This, or the PDF renderer is another possible option (albeit less likely, I'd think). At least the fact that not all renderers give the same result suggests the clipping is either not handled by some, or mishandled by others.

mirabilos commented 1 month ago

It’s a Qt bug.

I reported it to the Debian packagers as https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1070406 because I don’t have the means to deal with Qt upstream and hope they’ll forward it.

mirabilos commented 1 month ago

I could actually use some PDF-side debugging help with this tiny reproducer PDF from the C++ MWE in that article.

The entire page content (expanded by qpdf) is:

stream
/GSa gs /CSp cs /CSp CS
0.059999999 0 0 -0.059999999 10.0199999 831.980000 cm
q q
Q
Q q
q
q
1 0 0 1 0 0 cm
/CSp cs 0 0 0 scn
/GSa gs
BT
/F7 1200 Tf 1 0 0 -1 0 0 Tm
100 -1000 Td <0001> Tj
600 0 Td <0002> Tj
600 0 Td <0003> Tj
600 0 Td <0004> Tj
600 0 Td <0005> Tj
600 0 Td <0006> Tj
600 0 Td <0007> Tj
600 0 Td <0008> Tj
600 0 Td <0004> Tj
600 0 Td <0009> Tj
600 0 Td <000a> Tj
600 0 Td <000b> Tj
ET
Q
Q Q
endstream

Where in that is the clipping?
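For reference, an explicit clip in a PDF content stream is set with the `W` or `W*` operators (normally followed by a path-painting operator such as `n`). Below is a minimal Python sketch that scans a qpdf-expanded stream for them. The helper name `find_clip_operators` is made up for illustration, and the tokenising is deliberately naive (it only skips hex strings such as `<0001>`), so treat it as a quick check, not a PDF parser.

```python
import re

# Clipping path operators defined by the PDF content stream syntax.
CLIP_OPERATORS = {"W", "W*"}

def find_clip_operators(content):
    """Return (line number, operator) pairs for every clip operator found."""
    hits = []
    for lineno, line in enumerate(content.splitlines(), start=1):
        # Drop hex strings like <0001> so their contents are not read as tokens.
        line = re.sub(r"<[0-9A-Fa-f\s]*>", " ", line)
        for token in line.split():
            if token in CLIP_OPERATORS:
                hits.append((lineno, token))
    return hits

# Excerpt of the page content quoted above.
sample = """q
/CSp cs 0 0 0 scn
BT
/F7 1200 Tf 1 0 0 -1 0 0 Tm
100 -1000 Td <0001> Tj
ET
Q"""

print(find_clip_operators(sample))              # excerpt without any clip operator
print(find_clip_operators("0 0 10 10 re W n"))  # a stream that does set a clip
```

The stream quoted above contains neither operator, which suggests that whatever crops the glyphs is not an explicit clip path in the page content itself.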

I have now found that this is generated by src/gui/painting/qpdf.cpp in Qt5, in the QPdfEnginePrivate::drawTextItem method, which basically spits out PDF magic directly:

    *currentPage << "BT\n"
                 << "/F" << font->object_id << size << "Tf "
                 << stretch << (synthesized & QFontEngine::SynthesizedItalic
                                ? "0 .3 -1 0 0 Tm\n"
                                : "0 0 -1 0 0 Tm\n");
[…]
        *currentPage << x - last_x << last_y - y << "Td <"
                     << QPdf::toHex((ushort)g, buf) << "> Tj\n";
[…]

I don’t know the PDF format well enough to proceed from here without burning another couple of days. (I do know the general command format, the subsetting and the addressing of its glyphs by number, <0004> is space here for example, and 600 is probably the advance width, it’s a monospaced font.)
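The numbers in the stream can be sanity-checked with a little arithmetic. The sketch below assumes that the outer `cm` scale of 0.059999999 maps a 1200-units-per-inch coordinate space onto PDF points (72 / 1200 = 0.06). That is an interpretation of the stream, not something confirmed by the Qt excerpt above.

```python
# Values copied from the content stream above.
CM_SCALE = 0.059999999   # scale factor of the outer "cm" matrix
FONT_SIZE = 1200         # operand of "Tf", in units of the scaled space
TD_STEP = 600            # horizontal operand of each repeated "Td"

def to_points(units, scale=CM_SCALE):
    """Convert a length in the scaled coordinate space to PDF points."""
    return units * scale

font_size_pt = to_points(FONT_SIZE)
advance_pt = to_points(TD_STEP)
advance_em = TD_STEP / FONT_SIZE   # advance as a fraction of the font size

print(f"font size: {font_size_pt:.2f} pt")
print(f"advance:   {advance_pt:.2f} pt = {advance_em:.2f} em")
```

Under that assumption the text is set at about 72 pt and each glyph is stepped by half an em, which fits the guess that 600 corresponds to the advance width of a monospaced font.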

mirabilos commented 1 month ago

OK, now this is interesting:

15 0 obj
<<
  /Ascent 835
  /CapHeight 835
  /Descent -177
  /Flags 4
  /FontBBox [
    0
    -177
    509
    835
  ]
  /FontFile2 16 0 R
  /FontName /QMAAAA+Inconsolatazi4varl_qu-Regular
  /ItalicAngle 0
  /StemV 50
  /Type /FontDescriptor
>>
endobj

Opening object 16 (the embedded subset font) in FontForge, I get totally different metrics than the original font!

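| Metric     | OTF  | embedded | scaled    |
|------------|------|----------|-----------|
| Em Size    | 1000 | 2048     | 2048      |
| Ascent     | 800  | 1638     | 1638.4    |
| Descent    | 200  | 410      | 409.6     |
| Win Asc    | 835  | same     | 1710      |
| Win Desc   | 177  | same     | 362.5     |
| Typo Asc   | 800  | same     | see above |
| Typo Desc  | -200 | same     | see above |
| Typo Gap   | 90   | same     | 184.3     |
| HHead Asc  | 835  | same     | see above |
| HHead Desc | -177 | same     | see above |
| HHead Gap  | 90   | same     | see above |
| Cap Height | 623  | same     | 1275.9    |
| X Height   | 457  | same     | 935.9     |
| t top X    | 267  | 547      | 546.8     |
| t top Y    | 592  | 1212     | 1212.4    |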
The entire font has been scaled from 1000 units per em (the PostScript and OTF/CFF default) to 2048 units per em (suitable for TrueType/t42). While that is questionable (we’re outputting to PDF, which is PostScript on drugs, after all), it is not incompetently done. However, the OS/2 metrics have not been scaled, and the /FontBBox is also based on the metrics of the original font file.

And the cutoff I’m seeing is roughly at the height of 1000 (or 1024?) above the baseline.
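The “scaled” column in the comparison above is simply each original value multiplied by 2048/1000. Here is a small Python sketch of that rescaling, to show what consistent metrics would look like. The dictionary keys are the labels used in the comparison, not real font table field names, and `rescale` is a name chosen for illustration.

```python
ORIGINAL_UPM = 1000   # em size of the OTF/CFF source font
EMBEDDED_UPM = 2048   # em size of the subset embedded in the PDF

# Metrics that were left unscaled in the embedded font (values from the comparison).
original = {
    "Win Asc": 835,
    "Win Desc": 177,
    "Typo Gap": 90,
    "Cap Height": 623,
    "X Height": 457,
}

def rescale(value, old_upm=ORIGINAL_UPM, new_upm=EMBEDDED_UPM):
    """Scale a metric from one em size to another."""
    return value * new_upm / old_upm

for name, value in original.items():
    print(f"{name:<10} {value:>5} -> {rescale(value):7.1f}")

# Left unscaled, an ascent of 835 covers a much smaller share of the em.
print(f"intended: {835 / ORIGINAL_UPM:.3f} em, embedded: {835 / EMBEDDED_UPM:.3f} em")
```

The last line shows the mismatch in relative terms: the same number describes a much lower ascent once the em size has more than doubled.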

mirabilos commented 1 month ago

According to https://bugreports.qt.io/browse/QTBUG-586 Qt “cannot” embed OTF/CFF fonts. It apparently does so, but not correctly.

The (missing half the metrics) scaling to 2048 is done for TTF fonts as well, though.

And when I quickly scale the TTF version of the font to 2048, it works in Atril.

sigh…

But okay, so I don’t think there’s a bug on the viewer side, just an inconsistency in the face of bad input (GIGO).

mirabilos commented 1 month ago

And to finish the Atril part of the discovery, it’s the entry in the hhea table that Atril uses.
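For anyone wanting to check an embedded font for this mismatch, here is a hedged Python sketch that reads `unitsPerEm` from the `head` table and the ascender, descender and line gap from the `hhea` table of raw sfnt data (for example an extracted /FontFile2 stream). The function names are made up, the byte offsets follow the OpenType table layouts, and the demo font is a fake two-table blob built only so the example runs on its own. It does not handle TrueType Collections or validate anything.

```python
import struct

def read_vertical_metrics(font):
    """Read unitsPerEm (head) and ascender/descender/lineGap (hhea) from sfnt bytes."""
    num_tables = struct.unpack_from(">H", font, 4)[0]
    tables = {}
    for i in range(num_tables):
        # Each table record is 16 bytes: tag, checksum, offset, length.
        tag, _checksum, offset, _length = struct.unpack_from(">4sIII", font, 12 + 16 * i)
        tables[tag.decode("latin-1")] = offset
    # unitsPerEm sits 18 bytes into head; hhea metrics start 4 bytes in.
    units_per_em = struct.unpack_from(">H", font, tables["head"] + 18)[0]
    ascender, descender, line_gap = struct.unpack_from(">hhh", font, tables["hhea"] + 4)
    return {
        "units_per_em": units_per_em,
        "ascender": ascender,
        "descender": descender,
        "line_gap": line_gap,
    }

def make_demo_font(units_per_em, ascender, descender, line_gap):
    """Build a fake two-table sfnt blob for demonstration (not a usable font)."""
    head = bytearray(54)
    struct.pack_into(">H", head, 18, units_per_em)
    hhea = bytearray(36)
    struct.pack_into(">hhh", hhea, 4, ascender, descender, line_gap)
    header = struct.pack(">IHHHH", 0x00010000, 2, 0, 0, 0)
    head_offset = 12 + 2 * 16
    hhea_offset = head_offset + len(head)
    records = struct.pack(">4sIII", b"head", 0, head_offset, len(head))
    records += struct.pack(">4sIII", b"hhea", 0, hhea_offset, len(hhea))
    return header + records + bytes(head) + bytes(hhea)

# The combination described in this thread: em size 2048, hhea left at the old values.
metrics = read_vertical_metrics(make_demo_font(2048, 835, -177, 90))
ratio = metrics["ascender"] / metrics["units_per_em"]
print(metrics, f"ascender = {ratio:.3f} em")
```

An ascender well below what the glyph outlines need, relative to `units_per_em`, would be the warning sign here.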