vincentdoerig / latex-css

LaTeX.css is a CSS library that makes your website look like a LaTeX document
https://latex.vercel.app
MIT License

Add printable design using Paged.js #62

Open vihuna opened 10 months ago

vihuna commented 10 months ago

Add "printable" design using Paged.js (WIP)

vihuna commented 10 months ago

Guidelines (I'm following)

Some drawbacks

Conflicts/Errors

(in the process of loading a document using Paged.js.)

It's very easy, if you are not careful, to get random rendering issues when you load/reload the paginated document. To work properly, Paged.js must be the last script to load among those that manipulate the DOM and CSSOM.

So you must make sure that all your scripts that modify the DOM and CSSOM have finished before Paged.js starts to process the document. From the Paged.js documentation:

"As soon as your browser has loaded everything your HTML needs to be shown on screen (including images, font files, etc.), the script will start paginating the content and pages will appear on your screen."

So the solution for scripts is to load them before Paged.js; but there are also some other problematic elements: those that can be lazy loaded.

In conclusion, I think it's advisable not to use web fonts and not to lazy-load images.
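If web fonts or lazy-loaded images cannot be avoided, one option is to hold Paged.js back until they are ready. The following is only a sketch: it relies on the Paged.js polyfill reading a global `PagedConfig` object and awaiting its `before` callback before paginating, and the helper names `waitForImage` and `makePagedConfig` are hypothetical (they are not part of LaTeX.css or Paged.js).

```javascript
// Sketch only: hold back Paged.js until fonts and images have finished loading.
// `waitForImage` and `makePagedConfig` are hypothetical helper names.

// Resolve once an <img> has either loaded or failed
// (a broken image must not block pagination forever).
function waitForImage(img) {
  if (img.complete) return Promise.resolve();
  return new Promise((resolve) => {
    img.addEventListener("load", resolve, { once: true });
    img.addEventListener("error", resolve, { once: true });
  });
}

// Build a config object for the Paged.js polyfill.
function makePagedConfig(doc) {
  return {
    auto: true,
    before: async () => {
      // Undo lazy loading so off-screen images start downloading now.
      for (const img of doc.images) img.loading = "eager";
      await Promise.all([doc.fonts.ready, ...Array.from(doc.images, waitForImage)]);
    },
  };
}

// In the browser, this must run before the Paged.js polyfill script executes.
if (typeof window !== "undefined") {
  window.PagedConfig = makePagedConfig(document);
}
```

This does not help with scripts that keep modifying the DOM later; those still have to finish before Paged.js starts.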

Some tasks to consider

Other considerations

PR Tasks

(updated: wrong PDF attachment)

vihuna commented 10 months ago

About line-height calculations (to fix the error)

I made a mistake with the value of the line height (I forgot to convert the \linespread value to HTML).

To explain the upcoming change (so that there is no doubt), I'm going to explain how line height is determined both in LaTeX and HTML (as far as I understand).

In LaTeX, the line height is mainly determined by \baselineskip (among other variables; it's not so simple). E.g., by default in a 10pt LaTeX article, \baselineskip = 12pt.

Approximately, the [defined-by-me] ratio \baselineskipfactor := (\baselineskip) / (\fontsize) is always around 1.2.

So, "approximately^2", the way line height is determined in LaTeX is:

LineHeight = (\baselineskipfactor) * (\linespread) * (\fontsize)

So,

(I usually use 1.3 and 1.6 resp)

And the more natural way line height is determined in HTML is (assuming line-height is given as a unitless value):

LineHeight = line-height * font-size

So, assuming that the same units are used, equating both expressions for LineHeight, the following relation can be established:

(\baselineskipfactor) * (\linespread) = line-height
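The relation above can be sketched as a small helper. This is only an illustration: the function name `cssLineHeight` is hypothetical, and the default factor of 1.2 is the approximate \baselineskip / \fontsize ratio mentioned earlier, not an exact LaTeX value.

```javascript
// Sketch of the relation above; the function name is hypothetical.
// baselineskipFactor approximates \baselineskip / \fontsize (about 1.2).
function cssLineHeight(linespread, baselineskipFactor = 1.2) {
  return baselineskipFactor * linespread;
}

// Print unitless line-height values for a few \linespread settings.
for (const linespread of [1, 1.3, 1.6]) {
  const value = cssLineHeight(linespread).toFixed(2);
  console.log(`\\linespread{${linespread}} -> line-height: ${value}`);
}
```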

So, obviously (it's actually the same as above):

vihuna commented 10 months ago

Finally, this is my proposal for the ToC (with page numbers). This is how it looks (left:Firefox, right:Brave, line-height still unfixed): ToC-firefox-brave

The ToC must be written manually, as in LaTeX.css without pagination.

I have tried several options (the nested lists have been a headache), but I think this is the best compromise between "LaTeX resemblance" and "javascript usage".

It has been "necessary" to add (using javascript) a container for the dotted-line, but the rest of the original ToC HTML code remains unchanged.

Main disadvantages:

ToC layout differences

9.9  |Toc Entry: Lorem ipsum dolor

     |9.9.1   |ToC Entry: Lorem ipsum dolor
      [...]
     |9.9.9   |ToC Entry: Lorem ipsum dolor
     |9.9.10  |ToC Entry: Lorem ipsum dolor sit amet, consectetur
              |adipiscing elit

9.9  |Toc Entry: Lorem ipsum dolor

    |9.9.1  |ToC Entry: Lorem ipsum dolor
     [...]
    |9.9.9  |ToC Entry: Lorem ipsum dolor
    |9.9.10  |ToC Entry: Lorem ipsum dolor sit amet, consectetur
    |adipiscing elit

9.9  |Toc Entry: Lorem ipsum dolor

    |9.9.1  |ToC Entry: Lorem ipsum dolor
     [...]
    |9.9.9  |ToC Entry: Lorem ipsum dolor
    |9.9.10  |ToC Entry: Lorem ipsum dolor sit amet, consectetur
             |adipiscing elit

I think the only way to automatically mimic LaTeX style (if desired) is by using more javascript for size calculations (maybe it could be manually achieved with a set of custom CSS classes).

vihuna commented 9 months ago

List issues

As mentioned before, for the ToC design I have tried to keep a balance between "javascript code used" and "LaTeX features achieved".

I would like to show some errors that may happen to lists while the document is being split (most of them in Chrome-based browsers). I thought this behavior was expected, and that you were supposed to use break-* CSS rules, but there is an open issue in the Paged.js repo with a more detailed explanation.

More situations like these can be added: it not only occurs with markers, but may also happen with before or after pseudo-elements, and with figures and tables too. Some images (Firefox on the left, Brave on the right): firefox-brave ToC-error-1

firefox-brave-ToC-error-2

table-caption-break

I will later add some classes for the break-* CSS rules, not only for break-inside: avoid like in the linked issue, but also for break-before and break-after, to deal with these breaks. Focusing on the ToC layout, it's clear we also need the last two properties, because of the nested lists. These classes must be added manually when one of these breaks occurs. I have tested them and they seem to work properly; the Paged.js documentation also says these three properties are supported.

Another possibility would have been to use javascript to process all lists and transform them into one-level lists, with the nesting level represented through CSS classes. This gives a simpler list that is easier to customize with CSS (I suppose). It would also be easier to apply break-inside automatically and to avoid break-before/break-after.
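To illustrate that alternative (which is not what this PR does), here is a minimal sketch of the flattening step on plain data. The `{ text, children }` entry shape and the `toc-level-N` class names are assumptions made up for the example.

```javascript
// Sketch only: flatten a nested ToC into a single-level list and keep the
// nesting depth as a CSS class name. The entry shape and the class names
// are hypothetical.
function flattenToc(entries, level = 1) {
  const flat = [];
  for (const entry of entries) {
    flat.push({ text: entry.text, className: `toc-level-${level}` });
    flat.push(...flattenToc(entry.children ?? [], level + 1));
  }
  return flat;
}

const toc = [
  { text: "1 Introduction", children: [{ text: "1.1 Motivation" }, { text: "1.2 Scope" }] },
  { text: "2 Usage" },
];
console.log(flattenToc(toc).map((e) => `${e.className}: ${e.text}`).join("\n"));
```

Each flat entry could then become one `li` with its class, so break-inside: avoid applies to single entries instead of whole nested sublists.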

One more challenging error while splitting occurs with default-marker lists, and only in Firefox: it doesn't respect the list's start attribute if the counter-reset property (automatically added by Paged.js) is used at the same time.

firefox-brave-lists-error

For the moment, this particular Firefox error must also be manually fixed by the user: you must add value="<counter>" [at least] to the first li element after the page break, and <counter> must be the position of the element inside the list.
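The manual fix could in principle be scripted. The sketch below is an assumption about how that might look, not code from this PR: the name `pinListValues` is hypothetical, it ignores reversed lists and existing value attributes, and it is assumed to run before Paged.js starts paginating.

```javascript
// Sketch only: give every direct <li> child an explicit value attribute, so
// the numbering no longer depends on the list's start attribute after a split.
// `pinListValues` is a hypothetical name; nested lists need their own call.
function pinListValues(ol) {
  // HTMLOListElement.start is 1 when the attribute is absent.
  let counter = Number.isInteger(ol.start) ? ol.start : 1;
  for (const li of ol.children) {
    if (li.tagName !== "LI") continue;
    li.setAttribute("value", String(counter));
    counter += 1;
  }
}

// Browser usage (assumed to run before Paged.js starts paginating).
if (typeof document !== "undefined") {
  document.querySelectorAll("ol").forEach(pinListValues);
}
```

Setting the value on every item, rather than only on the first one after the break, avoids having to know in advance where Paged.js will split the list.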

vincentdoerig commented 9 months ago

Thank you so much for this PR @vihuna!

I've had a look at your code and comments and I'm impressed with the work you've done here. Here are my thoughts:

Going forward, I think that adjusting overflows would be the most important thing to do. Especially the scrolling code blocks require some text-wrapping. I would also force light mode by removing the latex-dark and latex-dark-auto classes if present. US letter page size would be nice to have but fine to leave out for now. Links in the PDF file seem to work in Chrome but indeed not in Firefox, do you think that this can be fixed? If not, we might have to consider supporting the printing feature in "Chrome only".

More document font sizes would be nice, but I think that adding a "wider" version that fits more text on a page would also be quite useful (side notes will pose a challenge though -> maybe remove them entirely?).

I would agree that removing the margin-top from the first paragraph (or heading) of each page would be good to keep the page layout consistent.

I think that there is a lot that could be done (footnotes on their associated page, headers, etc.). I'm currently not in the position to do much work on this myself but I'm happy to review your work, comment on it and merge it once it's ready. It just might take a while until I get to it. Again, thank you so much for your work on this!

vihuna commented 9 months ago

Since the printable version is our main objective, I would force the background color of the document to true white and the main text to black to create the most contrast possible.

I agree.

I was able to reproduce the section anchor reload issue. My fix for this would be to add ...

I think it is a very good solution; I have tested it and it works. Thanks for this great improvement.

For the fonts loading it is fine if it "only" works with font-face. We can add a note to the docs that the @import method is not supported.

It's also needed for the ToC links: if not removed, the pages after the image are not rendered and some of the ToC links will not work (Paged.js does not finish rendering the document until the image is loaded).

Adding the break-* classes would be good! It's not optimal that these have to be added manually but I see that doing it automatically would be quite difficult.

I agree, specifically with complex lists.

The Firefox marker issue is a bit annoying but unless you have a better idea, I think it's fine to leave it as is with a note in the docs

It should be possible to automate with javascript the manual solution I described previously. I would like to do it only for Firefox (and without user-agent identification), so a specific "firefox" option must be provided (in some way ...).

Going forward, I think that adjusting overflows would be the most important thing to do. Especially the scrolling code blocks require some text-wrapping

Me too. I want to finish two issues I was working on (I will comment on them later). I understand that you are referring to "automatic text-wrapping" with CSS (tell me if I'm wrong; I also prefer this way). I just want to note that LaTeX does not wrap text inside a verbatim environment; you must use the listings package with the appropriate option for this.

US letter page size would be nice to have but fine to leave out for now

Well, it's almost finished. Different page designs will be implemented through CSS named pages. My idea was to change the default @page margin dimensions to more reasonable values, and add six named pages corresponding to the default {a4paper, letterpaper} x {10pt, 11pt, 12pt}. I wrote a Python script (latex-page-sizes.py.txt) to calculate the dimensions. It's still not revised, but I'm uploading it so you will know where these values come from.

I paused this because I was undecided about merging all print-*.css files; the default ANSI letter would be another named page.

Links in the PDF file seem to work in Chrome but indeed not in Firefox, do you think that this can be fixed?

Unfortunately I think this is not possible. In PDF format, "internal hyperlinks" are not a special type of hyperlinks (special URL): internal links are made using "Link Annotation" objects, and external links using "URI Action" objects (screenshot with unpacked PDF, left Firefox, right Brave):

pdf-firefox-brave

It seems Firefox-pdf always uses the "URI Action" for all types of links (to avoid confusion, these links can be removed before printing to PDF).

Firefox external links work correctly. Also, WebKitGTK removes all links. I have barely tested WebKit (Epiphany on Linux); it should be equivalent to Safari (I will try to test Safari printing on Windows).

If not, we might have to consider supporting the printing feature in "Chrome only"

I understand the advantages, and I see no problem for personal local usage (this user has already written the document using LaTeX.css). But it seems a more delicate question for a document on a public web host that you want to allow printing (perhaps users don't know about LaTeX.css, they are not warned about it before arriving, they use their favourite web browsers ...).

Without any doubt, Chrome-based browsers are the recommended ones, and we must highlight it this way in the documentation (just as Paged.js does).

This point should be discussed further.

More document font sizes would be nice, but I think that adding a "wider" version that fits more text on a page would also be quite useful (side notes will pose a challenge though -> maybe remove them entirely?).

I'm beginning to understand that you would like [if possible] something that works "out of the box" (i.e., almost without user supervision). I think this is far from possible at the moment, but some steps in this direction can be taken. The problem, and the reason why I chose the "all manual" way: I think it's simpler for users to see the need for a page break at a certain line/position in the document than to deal with a page break (introduced previously by LaTeX.css, not working properly for some reason, and users probably don't know how LaTeX.css inserts these breaks) that they must override with their own custom one. Also, if you oversaturate the document with forced page breaks and page-break avoids, Paged.js may not work properly.

Even LaTeX, by default, relies on the user's judgment to verify that text in a verbatim environment and sidenotes fit on the page.

About the sidenotes, it's a feature I really like; and unfortunately, there are also other problematic elements. Let's give the sidenotes a chance for the moment, and we can decide about them later.

I need to think more about all this.

I'm currently not in the position to do much work on this myself but I'm happy to review your work ...

I understand, thanks, I appreciate your support and your work on LaTeX.css.

vihuna commented 9 months ago

I have been working lately on three points:

(EDIT: it seems more effective to use display: none; instead of height: 0;)

vihuna commented 9 months ago

More Pagedjs-Firefox issues (this doesn't affect the document rendering)

Firefox users can get some errors with Pagedjs if they also use MathJax:

Pagedjs-MathJax-errors (Sorry for the language, it says: "Not valid markup: wrong number of children in <msub/> tag")

This is due to the "assistive MathML" feature used by MathJax. You can disable it in MathJax options:

MathJax = {
  options: {
    menuOptions: {
      settings: {
        assistiveMml: false,
      }
    }
  }
}

vihuna commented 8 months ago

Before changes in a98dd41, Paged.js settings provided by users were overwritten by LatexCss. And Paged.js configuration had to be done through LatexCss; so there were more limitations, if you want to keep it simple.

After the changes, user settings for Pagedjs are respected, and LatexCss.js provides a startup Promise (which includes MathJax and Prism promises, if they are used, among others), so responsibility lies with the users, if they want to change default Pagedjs + LatexCss settings.

In 965364d the CSS "paged" files have been merged. The CSS and JS files have been renamed, and the docs and code comments have been updated to reflect these changes in the file names. Sorry for these quite disruptive changes.

vihuna commented 8 months ago

I have been playing to [try to] understand how media styles work with Paged.js, so different styles can be applied to different @media types (using CSS layers): branch paged-tests-darkmode.

vihuna commented 6 months ago

I have already made some decisions and I have a more specific idea about the tasks in this PR (a task list has been added). Some comments:

vercel[bot] commented 6 months ago

Deployment failed with the following error:

The provided GitHub repository does not contain the requested branch or commit reference. Please ensure the repository is not empty.

vihuna commented 6 months ago

Not much to say about the vite build config: the javascript modules are compiled in library mode, as an IIFE. I have also added a "vite preview" command to the config, in the base path.

None of the Promises has been changed, only the way they are loaded (maybe that change goes unnoticed). Before these changes, the startup promises were evaluated when the old file latexcss-paged.js was loaded. Now, if I'm not wrong, they are evaluated after Pagedjs has started and the DOM has already been read.

This is quite important for MathJax and Prism detection, for example. Before these changes, the MathJax and Prism scripts had to be executed before latexcss-paged.js or they would not be detected (this was not fully detailed in the docs, I think). With these changes in the way the promises are loaded, the only limitation is that Paged.js must start after all the other scripts have been executed. Above all, the configuration provided by latexcss-paged.js must be executed before Paged.js starts (Paged.js reads the configuration before the HTML has been completely read, and evaluates the promises after that):

https://gitlab.coko.foundation/pagedjs/pagedjs/-/blob/main/src/polyfill/polyfill.js?ref_type=heads#L19-L35

That means that a configuration like this will work properly

<script src="latexcss-paged.js"></script>
<script src="https://unpkg.com/pagedjs/dist/paged.polyfill.js"></script>
<script id="MathJax-script" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>

but you must be lucky with these other ones:

<script id="MathJax-script" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>
<script src="https://unpkg.com/pagedjs/dist/paged.polyfill.js"></script>
<script src="latexcss-paged.js"></script>

<script defer src="latexcss-paged.js"></script>
<script defer src="https://unpkg.com/pagedjs/dist/paged.polyfill.js"></script>
<script defer id="MathJax-script" src="https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js"></script>

At this moment, I don't know the best solution to avoid that limitation (load Paged.js dynamically?, import it [the Previewer] as a module? ...), so the loading method will stay like this.
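One possible direction, given only as a sketch and not as what this PR implements: disable the polyfill's automatic start with `PagedConfig.auto = false` and call `window.PagedPolyfill.preview()` explicitly once the other scripts are done. The helper name `paginateWhenReady` is hypothetical, and which promises to wait for (here fonts and MathJax's startup promise) is an assumption.

```javascript
// Sketch only: keep Paged.js from starting by itself, then start it explicitly
// once every other script has reported that it is done.
// `paginateWhenReady` is a hypothetical helper name.
async function paginateWhenReady(win, pending) {
  await Promise.all(pending);
  // The polyfill exposes its Previewer instance as window.PagedPolyfill.
  return win.PagedPolyfill.preview();
}

if (typeof window !== "undefined") {
  // Must be set before the polyfill script executes, because the polyfill
  // reads this object as soon as it is loaded.
  window.PagedConfig = { auto: false };

  window.addEventListener("load", () => {
    const pending = [document.fonts.ready];
    // MathJax 3 exposes a promise that resolves after its initial typesetting.
    if (window.MathJax && window.MathJax.startup) {
      pending.push(window.MathJax.startup.promise);
    }
    paginateWhenReady(window, pending);
  });
}
```

With this approach the order of the script tags matters less, but the snippet itself still has to run before the Paged.js polyfill is loaded.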

vihuna commented 6 months ago

I have never used Vercel, but I suppose that it does some trick with the repo URL for the deployment of PR branches, which fails after you change package.json. Let me know, @vincentdoerig, if there is a way to solve this issue with Vercel, or if it's better to continue in a fresh PR. Edit: or is it possible (because I did not find anything about this Vercel error) that I messed up the git history? Edit2: Looking at the reflog, it seems I did a rebase instead of a merge (?) from a local feature branch. Anyway, it seems Vercel has not complained again and there is no problem now.

vihuna commented 6 months ago

Some comments about CSS Paged Media footnotes implementation:

vincentdoerig commented 6 months ago

I appreciate the continuous effort and dedication you've shown, along with your patience on this PR. While I will not address each point individually, please know that I appreciate all the reasoned comments you've provided.

(Nov 23) I'm beginning to understand that you would like [if possible] something that works "out of the box" (i.e., almost without user supervision). I think this is far from possible at the moment, but some steps in this direction can be taken. The problem, and the reason why I chose the "all manual" way: I think it's simpler for users to see the need for a page break at a certain line/position in the document than to deal with a page break (introduced previously by LaTeX.css, not working properly for some reason, and users probably don't know how LaTeX.css inserts these breaks) that they must override with their own custom one. Also, if you oversaturate the document with forced page breaks and page-break avoids, Paged.js may not work properly.

You are right. I appreciate your input and agree that creating a "plug-n-play" solution would go beyond the scope of this project. "All manual" should indeed be the way to go, and we should mention in the documentation that, while it is almost straightforward, it does require some work from the developer and should only be used on dynamic content with caution.

I was working on a more complex solution (with no success); in the end I prefer to use a trick: I set ~the sidenotes height to 0~ display: none; for sidenotes before the Paged.js chunker processes the pages; and I set ~the height to auto~ display: block; again, before Paged.js renders the pages.

Smart!


(Feb 24) Following the indications (and this makes me sad), only Blink-based browsers will be really supported for the moment. Different fixes (or improvements) for other engines can be added later.

This is indeed unfortunate. Let's hope that these bugs get fixed and we will hopefully be able to support more browser engines in the future!

Also following indications about sidenotes, there will be sidenotes in exactly one of the margins. Managing sidenotes is not an easy task. For example, because sidenotes are floating elements, there will be a forced vertical gap (the height of the first of them) between the left and the right sidenotes. This is not really an issue for the LaTeX.css "continuous layout", but it's important for the "paged layout", where the sidenotes should fit into their own page. Also, by default, sidenotes are not going to be displayed.

Understood. From a print perspective this limitation isn't too bad since it's usually better to keep things consistent on one side (except maybe for two-sided books but don't get me started with that haha). What was your reasoning for sidenotes not being displayed by default? Regarding the naming of sidenotes-{inner, outer}, I know this is because of the page terminology, but don't you think it would be less confusing to use left/right respectively instead if we don't support two-page layouts anyways (or keep both and use as alias)?

To keep compatibility with the default "continuous layout", sidenotes from one of the margins will be transformed to footnotes. Also, more distinct markers should be used, to avoid confusing sidenotes with footnotes.

This is a fair tradeoff but we should not make it too confusing. Creating (regular) sidenotes with a symbol instead of a number is not possible anymore, am I seeing this right?

I think a specific version of Paged.js should be loaded.

I agree that this makes a lot of sense. (Versioning in the project in general is also something that has been on my mind for a while now...)

So, because @media print will continue to be used, Paged.js has limited support for media queries, and because it seems also contrary to Paged.js design (remember the previous quote from docs), I have decided that latex-dark and latex-dark-auto classes are not going to be supported in paged LaTeX.css.

I appreciate you exploring the feasibility. I am totally fine with it "only" working in light mode.

That means that a configuration like this will work properly [...] At this moment, I don't know the best solution to avoid that limitation (load Paged.js dynamically?, import it [the Previewer] as a module? ...), so the loading method will stay like this.

The documentation you wrote seems to make it clear enough, no?

I have never used Vercel, but I suppose that it does some trick with the repo URL for the deployment of PR branches, which fails after you change package.json.

I also don't know what that was about, but as you mentioned it seems to be working again.

To be honest, I don't like this "low-level DOM manipulation" used to process the footnotes (needed to transform footnotes to CSS Paged Media footnotes [...] I suppose a lot of improvements can be done (from regexp's, to footnote mark detection method,...), but this is my suggestion, at this moment.

Seems to work great with the few examples I have tried :).

Sometimes, when a footnote mark is close to the bottom of the page, it seems Paged.js does not correctly calculate the necessary space. This issue should be fixed manually by forcing a page break before the footnote mark.

Since we both agree that this is not supposed to be an automatic solution, I believe it's acceptable to let the developer handle it.


I'm still having some issues with page breaks (??). In your current version (32f1090), I am only seeing 13 out of the 14 total pages on Chrome (120.0.6099.234). I think it has to do with the indentation of parts of the pre block. This is the last page I currently see: [image]

This issue appears to be extremely weird and specific. Interestingly, if I remove a single line (except for the final curly bracket or the last three dots), it reveals all 14 pages. Alternatively, removing the indentation of the @page { block also seems to address the problem. Furthermore, simply adding an extra line appears to resolve it as well... [image]

I came to the conclusion that prism is messing with the code blocks (if I remove the prism script, all 14 pages render correctly). When keeping prism, the offending CSS code seems to be style-paged.css L248, where deleting the overflow: visible; property seems to fix the issue at the cost of putting the entire block (and any other code block spanning more than one page) on the following page: [image]

This, however, completely breaks blocks of code that span more than one full page (they disappear completely, leaving a blank page). My conclusion is that we need to figure out what prism does to the code blocks (I tried a few things and made adjustments to the white space, but it didn't lead to any progress)... Let me know if you have a guess what the root issue could be (or if you even have this error!?).


Regarding the documentation, here are my two cents to what we need:

Feel free to implement things differently or we can discuss them here. I'd be happy to review again (and it shouldn't take this long anymore). Thank you!

vihuna commented 6 months ago

Thanks, I appreciate all your comments and indications. I will read and comment them carefully later.

I'm only going to comment now on the "last page missing" issue. I THINK (because I'm not able to reproduce it; this is what I get with any version of Chromium, more about this later) that what happens is a combination of issues that I was already aware of (I don't know if they are interrelated). Sorry for not commenting on this earlier, but at some point I decided to go ahead so this PR can be finished some day.

The Prism + Paged.js combination results in the loss of some newline characters ("\n") when Paged.js breaks the code block during pagination (with the rest of the document displayed correctly until the last page). I was aware of this issue while working on d2d2336 (this is one of the "individual issues" that took me the longest). In my opinion, Paged.js is responsible for this behaviour, because I verified that all newlines were there after the Prism processing, but I don't know exactly how it happens. I have this link from the Paged.js repo in my notes:

https://gitlab.coko.foundation/pagedjs/pagedjs/-/blob/main/src/modules/filters/whitespace.js?ref_type=heads#L47

Of course, Prism makes the HTML code more complicated, which makes Paged.js more likely to fail. I also noted this link from the Prism repo (about "deleted newlines"):

https://github.com/PrismJS/prism/issues/1764

And I was "playing" with this code to patch this issue (I also noted that Prism removes the <br />):

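    // Protect the newlines with a placeholder before Prism highlights the block.
    const preEl = document.getElementById("pre-id");
    preEl.innerHTML = preEl.innerHTML.replaceAll("\n", "--newline--");

    // After highlighting, turn each placeholder into an explicit line break.
    Prism.hooks.add("after-highlight", function (env) {
      env.element.innerHTML = env.element.innerHTML.replaceAll("--newline--", "<br />");
      env.code = env.element.textContent;
    });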

As I have already said, I decided to leave this issue for later.

Sometimes, Paged.js miscalculates the remaining space in the current page while paginating, so some overflowing content is not displayed. Apparently Paged.js has eaten some lines, but it's all there. I have not given this any thought yet.

And sometimes Paged.js ends the document rendering prematurely, at the bottom of some page. In fact, it's similar to the previous case, except that now Paged.js does not break the pages anymore: the rest of the document is there, but it doesn't fit the page. I have verified that Paged.js fails to calculate the breakToken, so it remains undefined and Paged.js believes it has reached the end of the document. I was thinking about using Paged.js handlers to try to make "another complementary check" and change the breakToken if required, but I didn't get that far.

The only solution at this moment for these errors is a manual break-<before/after>: always; (the "break classes" are still not committed) before the problematic HTML element (or inside it, after splitting it). I will comment further on the breaks in the next few days. I'll also try to reproduce the error you report.

Finally, the point I'm most interested in: why don't I get this error? Yes, I'm working with an older Chromium version (117.0.5938.149), but I have also tested the latest available in Debian (121.0.6167.160-1~deb12u1) with the same result. I think this is because of the "still non-fixed" relative units and also the font used for the preformatted text (I was waiting until you had some time to comment on this). We should provide a monospace font, verify the computed fonts, and show a warning (popup?) if a fallback font ends up being used, because in that case the document will not be rendered as the author designed it. Tell me how we are going to do this (remote web font, or add another font to the repo ... and I just remembered that style.min.css is ~3MB because of fonts).
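As a rough idea of what such a font check could look like (a sketch, not a decided design): the browser's `document.fonts.check()` accepts a CSS font shorthand and reports whether the listed fonts are ready. The helper name `unavailableFonts` and the font name "Example Mono" are placeholders, not names used by LaTeX.css.

```javascript
// Sketch only: report which declared fonts are not ready to use.
// `unavailableFonts` is a hypothetical helper; the font name is a placeholder.
function unavailableFonts(fontSet, specs) {
  // FontFaceSet.check() takes a CSS font shorthand such as '12px "Some Font"'.
  return specs.filter((spec) => !fontSet.check(spec));
}

if (typeof document !== "undefined") {
  document.fonts.ready.then(() => {
    const missing = unavailableFonts(document.fonts, ['12px "Example Mono"']);
    if (missing.length > 0) {
      console.warn("Fallback fonts may be in use; pagination can differ:", missing);
    }
  });
}
```

One caveat: `check()` can also return true when no @font-face rule matches the family at all, so this sketch only catches declared web fonts that have not loaded. Detecting a system fallback would need a different technique.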

And this also reminds me that there are some symbols (end-of-proof, e.g.) not provided by the LaTeX.css fonts. Some time ago I was planning a PR to add MathML support (because since 10-01-2023 Chrome has MathML-Core support), also adding all the needed symbol fonts.

vihuna commented 6 months ago

«Last page missing» issue updates:

I have been testing the page breaks inside pre elements. And the issue you commented can be reproduced easily.

I have finally come to the conclusion that the "problem" does indeed originate from the CSS code overflow: visible;, as you showed in your previous comment. This is probably the trigger that finally makes Paged.js fail, but I don't know how this error is generated internally (as I will comment shortly, I also get the same wrong size with Paged.js handlers (?)). The problem does not happen without this CSS rule, because then Paged.js does not split the code blocks.

It seems the problem for the pre element comes immediately after the layout is applied. It turns out that el.getBoundingClientRect().height and el.offsetHeight give different values for the same element, and the first of them has the same height as the pagedjs_page area (the page without the margins). It makes some sense (not much) that this happens because of the overflow CSS property, and these dimension properties/methods can differ depending on the scroll design, but I can't determine the specific issue: if I take the same HTML code (after being parsed by Prism), paste it into the HTML file, and make the same calculations (outside Paged.js), the values are quite similar. Can this, at this point, be considered a Paged.js bug? But it's also possible that Paged.js finally gets confused because of the many spans that Prism adds inside code fragments, or both (I'm not able to get the same error with raw code fragments).
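The comparison described above can be written as a small diagnostic. This is only a sketch for reproducing the observation; `heightMismatch` is a hypothetical helper and the tolerance of 1px is an arbitrary choice.

```javascript
// Sketch only: flag elements whose two height measurements disagree.
// `heightMismatch` is a hypothetical diagnostic helper, not part of Paged.js.
function heightMismatch(el, tolerance = 1) {
  const rectHeight = el.getBoundingClientRect().height; // fractional value
  const offsetHeight = el.offsetHeight; // rounded integer layout height
  return {
    rectHeight,
    offsetHeight,
    mismatch: Math.abs(rectHeight - offsetHeight) > tolerance,
  };
}

// Browser usage: inspect every code block once the pages have been rendered.
if (typeof document !== "undefined") {
  for (const pre of document.querySelectorAll("pre")) {
    const result = heightMismatch(pre);
    if (result.mismatch) console.warn("height mismatch", pre, result);
  }
}
```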

What appears to happen next is the same process as when some indivisible element does not fit the page: Paged.js would be trying to assign a breakToken element, finalizing gratefully ("using their own words") after several failed attempts (the breakToken remains undefined).

I have also some doubts, while Paged.js is immersed in this process: the overflow hook is triggered several times, but at the same time the actual returned overflow remains undefined. Is this normal? (if there is not breakToken defined => there is no overflow defined?).

And, after all this, I have more questions: why are there no similar problems with the lines processed by MathJax (which also have a lot of span elements)? With respect to CSS, changing the white-space property to normal also avoids the issue with code blocks (but this requires replacing the \n with <br /> elements, so it's very difficult to be sure that these CSS changes cause the fix, because simply replacing one brace with a square bracket also fixes the error, for example). It also works with white-space: pre-line, so we can say the problem arises from the combination Paged.js + complex HTML + CSS white-space handling.

In the end this is not very explanatory, but I think it's enough to "blame" Paged.js for the issue. We have some options to deal with it:

Update (24-03-2024): I have been doing more tests on this issue (and also on the page breaks inside the ToC; it has taken me longer than I expected, these things drain my energy).

I'm now pretty sure Paged.js is causing this error, but I don't know the Paged.js internals well enough (I don't want to comment more on this, but the end result is that Paged.js fails to add the breakToken). I have made some attempts to solve this issue: it's like smacking an old CRT to get it working. Of course, there is always the manual "CSS page break" solution. What worries me most right now are the differences we get when we view the document. I hope this will be fixed once all relative units are replaced.

About unit testing: I'm familiar with jest+jsdom, but it's probably not very useful for testing these page-break issues. Maybe e2e testing would be more useful? Until something is done about tests, some of these "manual examples" could be added to the repo.

vihuna commented 5 months ago

What was your reasoning for sidenotes not being displayed by default?

Because hyphenation was disabled, I increased the sidenote width "a little". So if you are not going to use any sidenotes, you'll probably want to decrease the margin width. The margins without sidenotes are more natural. We can try to find a more balanced solution.

Regarding the naming of sidenotes-{inner, outer}, I know this is because of the page terminology, but don't you think it would be less confusing to use left/right respectively instead if we don't support two-page layouts anyways (or keep both and use as alias)?

OK, I didn't like it either while I was writing it; it's confusing. I just didn't want to close the door on some possible features.

This is a fair tradeoff but we should not make it too confusing. Creating (regular) sidenotes with a symbol instead of a number is not possible anymore, am I seeing this right?

There are two (or three) questions here, I'm not sure if it's clear.

The documentation you wrote seems to make it clear enough, no?

I'm not sure it's clear enough. I must check the loading method again (after the "module" changes), as well as its documentation.

Regarding the documentation:

"Printable design" might not be the best word

Sure. I was already aware of this; I changed it to "paged layout" in commit 965364d2. I like both "LaTeX.css to PDF" and "Print-friendly documents".

We might also consider adding a toggle for selecting between print layout and continuous layout, though this is not too important.

This would be cool, but it could take some work.

Add a banner on top of the paged site for non-Blink browsers that some content may not be displayed correctly (and exporting to pdf will break things like navigation)

I also thought about this. The easiest method is to add an alert after Paged.js has finished. There also shouldn't be any problem if we add a dialog before pagedjs_pages.

vihuna commented 4 months ago

Banner and consistent rendering

To allow consistent rendering across different platforms/environments, I have added the Mono fonts for Latin Modern and Libertinus. I have used the same versions of the fonts: v2.004 (2009) for Latin Modern and v7.020 (2020) for Libertinus. Some notes about Latin Modern (only otf format available):

-> From "otf" to "ttf"

Command:

 fontforge -lang=ff -c 'Open($1); Generate($2); Close();' LMMono-regular.otf LMMono-regular.ttf

Output:

Copyright (c) 2000-2023. See AUTHORS for Contributors.
 License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
 with many parts BSD <http://fontforge.org/license.html>. Please read LICENSE.
 Version: 20230101
 Based on sources from 2023-01-18 18:05 UTC-ML-D-GDK3.
PythonUI_Init()
copyUIMethodsToBaseTable()
Program root: /usr
Warning: Mac and Windows entries in the 'name' table differ for the
  Family string in the language English (US)
  Mac String: Latin Modern Mono
  Windows String: LM Mono 10
Warning: Mac and Windows entries in the 'name' table differ for the
  Styles (SubFamily) string in the language English (US)
  Mac String: 10 Regular
  Windows String: Regular
Warning: Mac and Windows entries in the 'name' table differ for the
  Fullname string in the language English (US)
  Mac String: Latin Modern Mono 10 Regular
  Windows String: LMMono10-Regular
The glyph named Delta is mapped to U+0394.
  But its name indicates it should be mapped to U+2206.
The glyph named Omega is mapped to U+03A9.
  But its name indicates it should be mapped to U+2126.
The glyph named dotlessj is mapped to U+F6BE.
  But its name indicates it should be mapped to U+0237.

-> From "otf" to "woff2"

Command:

woff2_compress LMMono-regular.otf

Output:

Processing LMMono-regular.otf => LMMono-regular.woff2
Compressed 64483 to 39838.

-> From "otf" to "woff"

Command:

sfnt2woff LMMono-regular.otf

Output: [empty]

About the banner: I finally realized (too late and by accident) that adding the banner after Paged.js has finished (as I have done) is not a great idea: some elements can delay the rendering (like lazy-loaded fonts), so the banner only appears after several seconds. Also, some simple web browsers don't support some JavaScript functions used in Paged.js, and you can get a blank page without any warning. Adding the banner before Paged.js runs would be a better option, if possible. Despite this, I decided to commit these changes, although I didn't finish the banner completely (I was going to allow multiple banners in the same modal box, closing them all together), to be able to move forward.
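As a starting point for a banner that does not depend on Paged.js finishing, the check itself can be a small function that runs before the polyfill is loaded. This is only a sketch under my own assumptions: the names isLikelyBlink and bannerMessage and the message text are hypothetical, and the detection is a heuristic (the "Chromium" brand in navigator.userAgentData when available, otherwise a user-agent pattern), not a guarantee about the engine.

```javascript
// Heuristic: is this navigator-like object reporting a Blink-based browser?
function isLikelyBlink(nav) {
  const brands = nav.userAgentData && nav.userAgentData.brands;
  if (Array.isArray(brands)) {
    // Chromium-based browsers list a "Chromium" brand entry.
    return brands.some((entry) => entry.brand === "Chromium");
  }
  // Fallback: Chrome/Chromium/Edge user agents contain "Chrome/<version>".
  const ua = nav.userAgent || "";
  return /\b(?:Chrome|Chromium)\/\d+/.test(ua);
}

// Returns the warning text for non-Blink browsers, or null when none is needed.
function bannerMessage(nav) {
  if (isLikelyBlink(nav)) {
    return null;
  }
  return "Some content may not be displayed correctly in this browser.";
}
```

In the page, bannerMessage(navigator) could be evaluated in an inline script placed before the Paged.js script, so the result is known even if pagination is slow or fails. How the text is shown (alert, dialog, or something else) is left open here.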

vihuna commented 4 months ago

I have been doing some tests for the "consistent rendering" across different platforms/OS/DE, and I must say that it seems a utopia.

I have tried to solve similar issues in EPUB documents with poor results (from the reader's internal font-smoothing options to CSS font-kerning, text-rendering ...).

For the moment, these results suggest limiting the use of Paged.js to local/personal environments rather than using it on a public web server. Let me know what you think.

vihuna commented 4 months ago

Recap of the state of this PR

My initial idea was: if Paged.js is quite solid with a LaTeX.css document, with this new feature you could (kill two birds with one stone):

And, also important, all of this new feature would be implemented "in/using the web browser".

I also wanted to show all the problems that I was finding along the way while exploring the possibilities, and I've tried to be cautious about the expectations (someone will say that I am always very pessimistic).

But Paged.js is honestly too unstable, especially for technical/complex documents like those using LaTeX.css (with quite a few LaTeX+MathJax formulas, code blocks ...); and because of the inconsistency across platforms, you have zero confidence that your document is going to be displayed correctly in a third-party browser. (I was hoping that since it was the same application, Chrome, those font-rendering inconsistencies would be quite unusual, or, better said, would rarely affect pagination.) On the other hand, I don't think Paged.js is going to solve all these problems in the short term.

So, probably, "you are only going to use this feature at a developer/author level, to print the document to PDF". Well, I think there are much better options for this task (via conversion to LaTeX first), taking into account all the Paged.js issues I have found.

So to be honest (and realistic), at this point I'm not convinced by this PR, and the "exploration days" end here: I already know enough about the limitations across platforms and the instability of Paged.js; and things like "font-kerning consistency" are a red line for me (I have suffered similar issues in the past, and I decided: "never again"). I have explored all the possibilities, and honestly, I don't like any of them.

So, @vincentdoerig, if you still want to merge it, I can finish the most essential remaining tasks. But I repeat my point of view: if "you are only going to use this feature to print the document to a PDF file", there are much better options to explore (even if you have to install some package in your OS).

Sorry if this is a disappointment for you, but I need to be honest (regardless of the time spent on this PR); tell me your thoughts, @vincentdoerig.