useblocks / sphinx-simplepdf

A simple PDF builder for Sphinx documentations
https://sphinx-simplepdf.readthedocs.io
MIT License

Table of Contents showing wrong page numbers #60

Closed (gitsandi closed this issue 1 year ago)

gitsandi commented 1 year ago

Table of Contents is showing wrong page numbers in generated PDF file.

I've created a simple documentation structure consisting of index.rst and two chapter files, file1.rst and file2.rst.

index.rst

Welcome to Testing SimplePDF's documentation!
=============================================

.. toctree::
   :maxdepth: 2

   file1
   file2

file1.rst

Chapter 1
*********

Text of chapter 1.

Section 1.1
===========

Text of Section 1.1.

Section 1.2
===========

Text of Section 1.2.

file2.rst

Chapter 2
*********

Text of chapter 2.

Section 2.1
===========

Text of Section 2.1.

Section 2.2
===========

Text of Section 2.2.

Pages in the resulting PDF TestingSimplePDF.pdf are:

However, the TOC is showing: [image: toc]

I know that .. contents:: can be used instead of .. toctree::, but then I would have to include file1.rst and file2.rst in index.rst. That wouldn't work for me with a more complex documentation structure that has many subdirectories and toctrees in child rst files.

danwos commented 1 year ago

It looks like the internal, invisible link target of Chapter 1 and Chapter 2 is part of the page before the content. So the page break happens after this link target, not before it.

As Sphinx creates the HTML for us and WeasyPrint then transforms it to PDF, this will not be easy to fix. Maybe we can manipulate the HTML code after generation, but right now I don't have an idea for a fix.

gitsandi commented 1 year ago

You are right. It seems that when generating singlehtml, Sphinx references the first chapter in sphinxsidebar as a link to the document file, not to the chapter: [image: sphinxsidebar] In documentwrapper: [image: documentwrapper]

This happens only for the first heading in an rst file. When I added another chapter (Chapter X) to file1.rst, it was referenced correctly as chapter-x: [image: chapterX]

danwos commented 1 year ago

Interesting analysis. ❤️ This already helps, and maybe we can replace the link to the document with one to the chapter.

Only the correct id needs to be calculated from the chapter name, but this should be doable, and the logic may be copied from Sphinx itself.
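
A rough sketch of what such a replacement could look like as a post-processing step on the generated HTML. Several things here are assumptions, not taken from sphinx-simplepdf or from the eventual pull request: the helper names approx_heading_id and retarget_toc_links are hypothetical, the href="#document-<name>" form of the document link is assumed, and the id calculation is only a simplified approximation of how heading ids are derived (lowercase, runs of other characters turned into hyphens).

```python
import re


def approx_heading_id(title: str) -> str:
    """Approximate a heading id: lowercase, other characters become hyphens.

    Hypothetical helper; a simplification, not the real Sphinx/docutils logic.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower())
    return slug.strip("-")


def retarget_toc_links(html: str, first_titles: dict) -> str:
    """Point document-level TOC links at the first heading of each document.

    first_titles maps a document name (e.g. "file1") to the title of its
    first heading. The href="#document-<name>" pattern is an assumption.
    """
    def swap(match):
        title = first_titles.get(match.group(1))
        if title is None:
            # Unknown document: leave the link untouched.
            return match.group(0)
        return 'href="#%s"' % approx_heading_id(title)

    return re.sub(r'href="#document-([^"]+)"', swap, html)


sample = '<a href="#document-file1">Chapter 1</a> <a href="#section-1-1">Section 1.1</a>'
fixed = retarget_toc_links(sample, {"file1": "Chapter 1"})
print(fixed)
```

With the sample above, only the first link is rewritten; links that already point at a heading id are left alone.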

gitsandi commented 1 year ago

The code that creates anchors is located in sphinx/environment/collectors/toctree.py:

def _make_anchor_name(ids: List[str], num_entries: List[int]) -> str:
    if not num_entries[0]:
        # for the very first toc entry, don't add an anchor
        # as it is the file's title anyway
        anchorname = ''
    else:
        anchorname = '#' + ids[0]
    num_entries[0] += 1
    return anchorname

It was obviously a design decision. Unfortunately, because of this decision, anchor names are not consistent across the HTML document, which causes the problem with page numbers in the PDF document.
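
To make the inconsistency concrete, here is a small standalone sketch that copies the function quoted above and calls it once per heading of file1.rst. The wrapper anchors_for is a hypothetical helper, and the ids passed in (chapter-1, section-1-1, section-1-2) are assumed values that follow the chapter-x pattern seen earlier, not output captured from a real build.

```python
from typing import List


def _make_anchor_name(ids: List[str], num_entries: List[int]) -> str:
    # Same logic as the Sphinx snippet quoted above.
    if not num_entries[0]:
        # for the very first toc entry, don't add an anchor
        # as it is the file's title anyway
        anchorname = ''
    else:
        anchorname = '#' + ids[0]
    num_entries[0] += 1
    return anchorname


def anchors_for(heading_ids: List[str]) -> List[str]:
    """Hypothetical helper: compute the anchor for each heading of one file."""
    num_entries = [0]  # shared counter, mutated by _make_anchor_name
    return [_make_anchor_name([hid], num_entries) for hid in heading_ids]


anchors = anchors_for(["chapter-1", "section-1-1", "section-1-2"])
print(anchors)  # ['', '#section-1-1', '#section-1-2']
```

The first heading gets an empty anchor, so its TOC entry can only point at the document as a whole, while every later heading gets its own id.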

I submitted a pull request to resolve the issue.