Kozea / WeasyPrint

The awesome document factory
https://weasyprint.org
BSD 3-Clause "New" or "Revised" License

Apply counter-increment on page counter when counter-reset is used #2114

Open fsteimke opened 7 months ago

fsteimke commented 7 months ago

Hi, I am not an expert in CSS, especially not in paged media. That's why I am not sure whether this is a bug or a misunderstanding on my side.

This report is motivated by an issue with the DocBook next generation stylesheets, which produce books in HTML. Rendering with WeasyPrint gives a very good result, but page numbers in cross-references and in the TOC are always 0 (zero), while the same file rendered with Antenna House has a correct TOC. This led to issue #2008 for WeasyPrint, which says that the bug is not in WeasyPrint but in the CSS.

So I tried to boil it down to very simple HTML and CSS:

I have to use counter-reset: page twice, once for the TOC flow and once for the main flow, so that they start with i and 1 respectively.

My conclusion is: there is an issue in WeasyPrint with page numbers in combination with counter-reset, and placing this instruction in the @page context does not help.

samples.zip

Greetings, Frank Steimke

liZe commented 6 months ago

Hi!

Thanks for this report.

First of all, I’d like to avoid using other renderers as a source of truth. Antenna House is amazing, and it’s always useful to compare WeasyPrint with other tools, but it’s not enough to be sure that it’s the right thing to do. I’ve tried Prince (another amazing tool) and it gives results different from both WeasyPrint and Antenna House!

So, let’s focus on the specification. 😄

Page-based counters are created in pages. The text and the example show that the "page" counter is in the page context and should be controlled in the page context.

Of course, you can access these counters in the page content context. But counters are self-nested, meaning that calling counter-reset on a counter that’s already defined in a parent creates a new nested counter. For example:

<style>
  body {counter-reset: counter}
  div {counter-increment: counter}
  div::before {content: counter(counter)}
  article {counter-reset: counter}
  section {counter-increment: counter}
  section::before {content: counter(counter)}
</style>
<body>
  <div>a</div> <!-- 1 -->
  <div>b</div> <!-- 2 -->
  <div>c       <!-- 3 -->
    <article>
      <section>d</section> <!-- 1 -->
      <section>e</section> <!-- 2 -->
    </article>
  </div>
  <div>f</div> <!-- 4 -->
</body>

The counter reset in body is not the same as the one reset in article, even if they have the same name. That’s very useful, because we want the count for sections to be different from the count for divs.
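
One way to make the nesting visible is counters(), which joins every counter of the given name that is in scope, outermost first. The following is a minimal sketch based on the markup above; the "." separator and the trailing space are arbitrary choices, and the values in the comments are what the CSS Lists rules predict, not output checked in a particular renderer.

```html
<style>
  body {counter-reset: counter}
  div {counter-increment: counter}
  article {counter-reset: counter}
  section {counter-increment: counter}
  /* counters() prints the outer (body) counter, then the nested (article) one */
  section::before {content: counters(counter, ".") " "}
</style>
<body>
  <div>a</div>
  <div>b</div>
  <div>c
    <article>
      <section>d</section> <!-- 3.1 -->
      <section>e</section> <!-- 3.2 -->
    </article>
  </div>
</body>
```

The first number is the value of the counter created on body, which is untouched by the reset on article; the second is the nested counter that the reset created.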

So, calling counter-reset: page in the page content is just the same as this example: it creates a new "page" counter that’s available in the page content, but it shouldn’t change anything about the "page" counter that lives in the page context. That’s why I disagree with what Antenna House does.

Let’s get back to your example. The main problem I see is:

@page main {
    counter-reset: page;
}

This rule resets the counter on each "main" page. So, it means that your counter won’t be incremented anymore, because it’s reset on each page.

Here’s a solution that works:

@page {
    @bottom-center {
        content: 'Page ' counter(page, arabic) ' of ' counter(pages) ' pages';
    }
}
@page main {
    counter-reset: page 1;
}
main section:first-child h1 {
    page: main;
}

It looks like we can get what you want: test-ws.pdf

But using counter-reset: page 1 instead of counter-reset: page seems to be wrong. Having both counter-reset and counter-increment should reset and then increment the counter:

<style>
  section {counter-reset: counter; counter-increment: counter}
  section::before {content: counter(counter)}
</style>
<section></section> <!-- 1 -->
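
The same ordering can be checked with explicit start values. As far as the CSS Lists specification describes it, the counter properties of one element are applied in a fixed order, whatever the order of the declarations: counter-reset first, then counter-increment, then counter-set. A minimal sketch, with a made-up counter name n, hypothetical class names and arbitrary values:

```html
<style>
  /* the reset creates n at 5, then the increment adds 1 */
  .a {counter-reset: n 5; counter-increment: n}
  /* reset to 5, increment to 6, then counter-set overrides the value with 1 */
  .b {counter-reset: n 5; counter-increment: n; counter-set: n 1}
  p::before {content: counter(n) " "}
</style>
<p class="a">reset + increment</p>       <!-- 6 -->
<p class="b">reset + increment + set</p> <!-- 1 -->
```

Because counter-set is applied last, it is the property that fixes the displayed value regardless of any increment on the same element.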

So, as far as I can tell, there’s a bug in WeasyPrint: the increment should be applied to the page counter even when the counter is reset on the same page.

Do you agree with that?

fsteimke commented 6 months ago

I find your explanation logical and comprehensible. However, I am a beginner in the field of paged media. I cannot judge whether there are other interpretations of the specification that sound just as correct and understandable.

My main goal is to make the DocBook xslTNG stylesheets as easy to use as possible. I would therefore very much welcome it if they could be used smoothly not only with Antenna House or Prince, but also with WeasyPrint. My impression is that this can work well, but so far there is still the CSS problem with wrong page numbers. (And there is one more tool that I'd like to include in the tests: Oxygen Chemistry, which converts HTML+CSS to FOP and ships with the Oxygen Editor product family.)

I have to check if I can integrate your solution into the CSS for xslTNG. I will try, but since I am not an expert, I may need more help.

For now, thank you very much for the detailed analysis and explanation. I will definitely get back to you when I have something new to report from the implementation for the CSS in xslTNG.

In the meantime, can you possibly correct the error with counter-reset()?

Greetings, Frank


liZe commented 6 months ago

In the meantime, can you possibly correct the error with counter-reset()?

OK, let’s focus on this.

fsteimke commented 6 months ago

I am sorry, but I realize that despite your explanations, I am not able to develop a correct solution for the example.

I do understand that I have to avoid unintentionally creating a new counter called page by a counter-reset in the wrong place. I think that I also understand that

@page main {
    counter-reset: page;
}

is a problem, because it resets the counter called page on every page named main. But I do not understand why

@page main {
    counter-reset: page 1;
}

is part of a solution that works? It seems to be exactly the same; the only difference (the initial value) should be irrelevant. I am sure I missed the point where the real difference is, sorry for that.

Maybe you can provide the full code of the correct CSS file which, when applied to the HTML in the sample, will produce the PDF file that you already presented? This would give me a chance to study it in detail.

Thanks in advance, Frank Steimke

liZe commented 6 months ago

I am sure I missed the point where the real difference is, sorry for that.

The difference is:

main section:first-child h1 {
    page: main;
}

The main page is only the first one of the main content. Of course, you can use a better name :).

schneidersoft commented 2 months ago

I have been trying to get a page number that is roman for the first set of sections and decimal for the subsequent sections. While it is possible to set the style of the page number as desired, it seems that it is not possible to reset the counter appropriately.

<style>
@page romanpage {
    @bottom-center {
        content: counter(page, lower-roman);
    }
}

@page decimalpage {
    @bottom-center {
        content: counter(page, decimal);
    }
}
#roman {page: romanpage;}
#decimal {page: decimalpage;}
section {page-break-after: always;}
</style>

<section id="roman">page i</section>
<section id="roman">page ii</section>
<section id="decimal">page 1</section>
<section id="decimal">page 2</section>

schneidersoft commented 2 months ago

Counters are equally useless for creating automatic numbering unless all the DOM elements to be numbered are in the same parent. Otherwise, since counters are scoped, it becomes a fool's errand to try to reset them reliably.

liZe commented 2 months ago

Resetting the page counter works in the @page context:

<style>
  @page romanpage {
    @bottom-center {
      content: counter(page, lower-roman);
    }
  }

  @page decimalpage {
    @bottom-center {
      content: counter(page, decimal);
    }
  }
  @page decimalfirst {
    counter-set: page 1;
    @bottom-center {
      content: counter(page, decimal);
    }
  }
  #roman {page: romanpage;}
  #decimal {page: decimalpage;}
  #decimal section:first-of-type {page: decimalfirst;}
  section {page-break-after: always;}
</style>

<div id="roman">
  <section>page i</section>
  <section>page ii</section>
</div>
<div id="decimal">
  <section>page 1</section>
  <section>page 2</section>
</div>

A simpler way to do this would be to handle @page:nth(1 of decimalpage), as explained in #895.
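
As a rough sketch of what that could look like, assuming a renderer that implements the :nth(An+B of name) page selector described in the CSS Generated Content for Paged Media draft (the sentence above suggests WeasyPrint did not handle the "of" part at the time), the extra decimalfirst page type would no longer be needed:

```html
<style>
  @page romanpage {
    @bottom-center {
      content: counter(page, lower-roman);
    }
  }
  @page decimalpage {
    @bottom-center {
      content: counter(page, decimal);
    }
  }
  /* Assumption: matches only the first page of a "decimalpage" page group */
  @page :nth(1 of decimalpage) {
    counter-set: page 1;
  }
  #roman {page: romanpage;}
  #decimal {page: decimalpage;}
  section {page-break-after: always;}
</style>
```

The HTML would stay the same as in the example above; only the stylesheet changes. How page groups are delimited is defined by the draft, so the exact matching behavior should be checked against it.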

Counters are equally useless for creating automatic numbering unless all the DOM elements to be numbered are in the same parent. Otherwise, since counters are scoped, it becomes a fool's errand to try to reset them reliably.

That’s right, but it’s both unrelated to this issue and a decision of the CSS specification, so there’s not much we can do about it here. Let’s keep the discussion about the originally reported issue 😄.