johnfactotum / foliate

Read e-books in style
https://johnfactotum.github.io/foliate/
GNU General Public License v3.0

Foliate should not separate quotation marks from text across line breaks when rendering EPUB files #1330

Closed: ahelwer closed this issue 1 month ago

ahelwer commented 1 month ago

Describe the bug
Given an EPUB file with text containing quotation marks (e.g., ', ‘, ’, ", “, ”), Foliate should not separate these quotation marks from the text they enclose when inserting line breaks. This should also hold when the enclosed text is a <math>...</math> element.

To Reproduce
Steps to reproduce the behavior:

  1. Add the following line to the body of a .xhtml content file in a minimal EPUB3 archive:
    <html xmlns=... xmlns:m="http://www.w3.org/1998/Math/MathML">
    ...
    <body>
    ...
    When we are confronted with ‘<m:math><m:mi>p</m:mi><m:mtext>&#160;or&#160;</m:mtext><m:mi>q</m:mi></m:math>’ by itself, we do not in general know which interpretation to assign to it.
    Often the choice is immaterial, in that either sense would serve equally.
    For example, consider the expression ‘<m:math><m:mi>x</m:mi><m:mo>≦</m:mo><m:mi>y</m:mi></m:math>’, i.e., ‘<m:math><m:mi>x</m:mi><m:mo>&#60;</m:mo><m:mi>y</m:mi><m:mtext>&#160;or&#160;</m:mtext><m:mi>x</m:mi><m:mo>=</m:mo><m:mi>y</m:mi></m:math>’.
    ...
    </body>
    </html>
  2. Open the EPUB file with Foliate
  3. Play around with zoom and such until one of the single quotes surrounding an equation is split onto a separate line from the equation itself, with the quote as the last character on that line.

Expected behavior
I expect Foliate to keep quotes together with the text they enclose and not split them across line breaks.

Screenshots IMG_3485

Version:

ahelwer commented 1 month ago

I will attempt to reproduce this on the latest version of Foliate.

ahelwer commented 1 month ago

I got version 3.1.1 from the Nix unstable channel and confirmed that this issue still occurs on it.

johnfactotum commented 1 month ago

This doesn't seem to be our bug, and it isn't specific to Foliate. I can reproduce this in a plain HTML file in Firefox, WebKitGTK, and Chromium.

ahelwer commented 1 month ago

Well, shoot. Seems so far upstream that it will be years before it is fixed. Thanks for checking!

ahelwer commented 1 month ago

Anyway, solved by placing the quotes inside a <mtext> element inside the <math> element.
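
A sketch of that workaround, applied to the second expression from the reproduction steps. This is a fragment, not a full file: it assumes the same xmlns:m declaration as the reproduction, and the exact placement of the <m:mtext> elements is an assumption, since the comment does not show the final markup.

```xml
<!-- Before: the quotes are plain text outside the math element,
     so a line break can fall between a quote and the equation. -->
‘<m:math><m:mi>x</m:mi><m:mo>≦</m:mo><m:mi>y</m:mi></m:math>’

<!-- After (assumed form): each quote sits in its own m:mtext
     inside the math element, alongside the equation itself. -->
<m:math><m:mtext>‘</m:mtext><m:mi>x</m:mi><m:mo>≦</m:mo><m:mi>y</m:mi><m:mtext>’</m:mtext></m:math>
```

Only the quotes that directly wrap a <m:math> element would need to move; quotes around ordinary text are left as they are.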

ahelwer commented 1 month ago

@johnfactotum if I did want to get the ball rolling on upstream fixes, what would be the right bug database to use?

johnfactotum commented 1 month ago

Not sure. You can try searching for the issue, or reporting it, in the bug trackers of the individual browser engines: https://bugzilla.mozilla.org/, https://bugs.webkit.org/, and https://issues.chromium.org/issues. It's also possible that this is a spec issue, in which case you could discuss it in W3C's issue trackers for MathML.

In particular, MathML 3/4 has the following:

an inline math element should be treated as inline (typically exactly as if it were a sequence of words in normal text). In particular, this applies to spacing and linebreaking: for instance, there should not be spaces or line breaks inserted between inline math and any immediately following punctuation.

This is from https://www.w3.org/TR/MathML3/chapter2.html and https://w3c.github.io/mathml/. But I'm not sure whether it is part of MathML Core, which seems to be what browsers are implementing today.