ggrossetie / asciidoctor-web-pdf

Convert AsciiDoc documents to PDF using web technologies
https://asciidoctor.org
MIT License
450 stars 92 forks source link

WIP: Issue 492 Preamble TOC thoughts #495

Closed danyill closed 6 months ago

danyill commented 3 years ago

This is my thinking so far, it's not really any kind of done thing, I've put some comments on #492 and this is really just supporting evidence.

danyill commented 3 years ago

OK, I think we're on the same "page" now :wink:

I've added the css to book.css to remove the border if the toc is within content as a separate commit:

[data-id="content"] [data-id="toc"] {
  border: none;
}

However it might be better to use "asciidoctor" css than paged.js css if possible, so this also works:

#toc.toc {
  border: none;
}

I'm trying to look into how to remove the page numbering on the preamble. It seems that @page preamble style queries don't work. However it's not clear how to diagnose @page media queries in a browser (see here) but I am continuing to research...

danyill commented 3 years ago

To remove the numbering from the preamble, I hoped that the following would work:

@page preamble {
  @bottom-right {
    content: none;
  }
}

It didn't and I wasn't sure how to diagnose @page queries in a browser.

I think this is because there is a conflict between the page rules and :left and :right have higher priority in paged.js. But they shouldn't...

https://www.pagedjs.org/documentation/08-named-pages/#priority-of-page-rules

The spec if I understand correctly would give greater priority to named pages as well.

https://drafts.csswg.org/css-page-3/#cascading-and-page-context

I found one way of doing it (for a single known page number) was the following:

@page :nth(2) { 
  @bottom-right {
    content: none;
  }
}

I think we're possibly hitting: https://gitlab.pagedmedia.org/tools/pagedjs/issues/291

ggrossetie commented 3 years ago

I've converted the pull request to draft, to make it clear that this is WIP.

@danyill Could you please update the expected PDF for the two failing tests (using DEBUG=true npm run test)? That way I can review them in the pull request without checking out locally your branch.

I think we're possibly hitting: gitlab.pagedmedia.org/tools/pagedjs/issues/291

It seems so... 😞

danyill commented 3 years ago

@danyill Could you please update the expected PDF for the two failing tests (using DEBUG=true npm run test)? That way I can review them in the pull request without checking out locally your branch.

Yes of course, done

danyill commented 3 years ago

I think we're possibly hitting: gitlab.pagedmedia.org/tools/pagedjs/issues/291

It seems so...

I spent a little time looking at this. I'm no longer convinced it is this issue.

The div for the page (id="page-2") doesn't show the class pagedjs_named_page or pagedjs_preamble which I think I was expecting to see. So it may be that somehow the construction of the named pages isn't working correctly (or that the css is damaged or hard to parse somehow).

danyill commented 3 years ago

The good people of paged.js (julien) had a look at this for me:

... there is an interesting bug occurring: if you move the preamble outside of the <div class="content">, the name page works as it should. for now you can make it work by removing the .content div https://gitlab.pagedmedia.org/tools/pagedjs/issues/311 to follow up

I need to study this to see if it presents a workaround or solution which is viable in this context. I think it does...

danyill commented 3 years ago

To solve the pagedjs named page issue I thought we could put a filler div above the preamble. This at least allows the named page css to work correctly which then allows us to remove numbering on the preamble pages.

I would have liked to solve the issue in pagedjs but after some time I gave up (I couldn't install it and run tests and I couldn't find where the named pages were processed in the "polisher" part when debugging in the browser).

After this commit I fail a number of tests:

1) should be able to set background color of title page

This one is bothering me if I've set the element to display:none why is it affecting the page layout? It seems to be moved to the top of the content element on page 1 and appears to take up no space. What is happening is that the preamble appears to start lower down on the reference document on page 2. I'm not sure if I should be concerned about this. But see the third test below.

image

2) should break pages accordingly when the document type is book, has preamble, has section and has TOC preamble

This one has the pages numbered differently. The preamble appears to start lower down on the reference document, similar to the above.

3) should break pages accordingly when the document type is book, has preamble, has section and has TOC auto

This one has an extra page between the table of contents and the preamble which has a bottom left footer and is numbered as page 1.

The cause of this is my new element #preamble-book-filler:

image

Even though it is set to display:none it is, alas, wreaking a bit of havoc on the pagedjs layout and receiving its own dedicated page.

Research indicates that the chunker in pagedjs doesn't ignore elements with display:none, see https://gitlab.pagedmedia.org/tools/pagedjs/issues/293 and that you've also found this in https://gitlab.pagedmedia.org/tools/pagedjs/issues/164

4) should break pages accordingly when the document type is book, has preamble and has section

The preamble isn't numbered but the page numbering now is 3 on the final (3rd) page instead of 2 on the reference document.

This seems fine to me as the reference document first section should be page 1 ideally (and is not in either case).

So where have I ended up? By trying to work around one bug in pagedjs I have so far only succeeded in hitting at least one other :cry: I'm not sure which direction to move in. But I have learned some things :smile_cat:

ggrossetie commented 3 years ago

To solve the pagedjs named page issue I thought we could put a filler div above the preamble. This at least allows the named page css to work correctly which then allows us to remove numbering on the preamble pages.

I would have liked to solve the issue in pagedjs but after some time I gave up (I couldn't install it and run tests and I couldn't find where the named pages were processed in the "polisher" part when debugging in the browser).

Paged.js code base has a steep learning curve. It takes time to understand how it works and, for the record, I'm still not very confident when I need to fix an issue because you can often use different approaches but each has implications.

In short, it's not an easy task but I think this is the only way forward. Adding elements has proven to be "counter-productive" in the sense that Paged.js is more reliable when the HTML is lean.

I will try to take a look at https://gitlab.pagedmedia.org/tools/pagedjs/issues/311

I couldn't install it and run tests

For reference, I'm using Docker to run tests:

docker build -t pagedmedia/pagedjs . && docker run -it --security-opt 'seccomp=seccomp.json' -v $(pwd)/specs:/home/node/pagedjs/specs pagedmedia/pagedjs npm test -- --updateSnapshot

It will also update snapshots/screenshots.

My workflow is:

  1. Run npm run build or npm run watch
  2. Create a simple HTML page to reproduce an issue in specs. For instance:

specs/breaks/child-parent-propagation/break-before-container-propagation.html

<!DOCTYPE html>
<html lang="en">
<head>
    <title>break-before-container-propagation</title>
    <meta charset="UTF-8">
    <script src="../../../dist/paged.polyfill.js"></script>
    <style>
        @page {
            size: A4
        }
        section {
            break-before: page;
        }
    </style>
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
</head>
<body>
<header>
    <h1>Document title</h1>
</header>
<main>
    <section>
        <h2>Section 1</h2>
        <p>This is a paragraph.</p>
    </section>
    <section>
        <h2>Section 2</h2>
        <p>This is a paragraph.</p>
    </section>
</main>
</body>
  1. Open HTML file in your browser using a local HTTP server (IntelliJ IDEA does that automatically for me)
  2. Make changes, refresh page...
ggrossetie commented 3 years ago

Found a fix! https://gitlab.pagedmedia.org/tools/pagedjs/merge_requests/155

danyill commented 3 years ago

Found a fix! https://gitlab.pagedmedia.org/tools/pagedjs/merge_requests/155

Thanks! I shall study this. Much appreciate you look into it :bowing_man: :100:

I thought I would try to integrate the MR version of pagedjs to test and hope to do that in the next week or so.

Would you merge it into https://gitlab.pagedmedia.org/mogztter/pagedjs or would you wait for it to be merged into upstream?

Dependabot seems to be pulling from your fork but I couldn't see the latest dependabot tag (?) PR, 0.1.43-next.1617691164. (in #508) in that fork and so I'm a little unsure of the relationship between paged.js and your fork and wondered if there was a third repo somewhere or whether it was initiated locally.

ggrossetie commented 3 years ago

I thought I would try to integrate the MR version of pagedjs to test and hope to do that in the next week or so. Would you merge it into https://gitlab.pagedmedia.org/mogztter/pagedjs or would you wait for it to be merged into upstream?

I need to rebase my fork on the latest release and then apply a few fixes that are not yet merged upstream.

Dependabot seems to be pulling from your fork but I couldn't see the latest dependabot tag (?) PR, 0.1.43-next.1617691164. (in #508) in that fork [...]

I forgot to push the tag 0.1.43-next.1617691164, it's now fixed. Having said that, this version contains a fix that has side effects. In some situation, it will make things better (i.e., it will split a table nicely even when it contains rowspan) but in some other situation it will make things worse... 😞

[...] and so I'm a little unsure of the relationship between paged.js and your fork and wondered if there was a third repo somewhere or whether it was initiated locally.

I can confirm that the npm package https://www.npmjs.com/package/@ggrossetie/pagedjs is published from my fork: https://gitlab.pagedmedia.org/mogztter/pagedjs

danyill commented 6 months ago

I guess this is obsolete by now or would require major rework. Just tidying up.