michal-h21 / tex4ebook

Converter from LaTeX to ebook formats (epub, mobi). Using tex4ht and texlua scripts.

Backrefs with hyperlinks are broken (with bibtex) #64

Open JeanOlivier opened 5 years ago

JeanOlivier commented 5 years ago

Tested with BibTeX only as it's what I use with my custom bst files.

OS: Ubuntu 18.04.3 LTS
pdfTeX: 3.14159265-2.6-1.40.20 (TeX Live 2019)
tex4ebook: v0.2c
make4ht: v0.2g

Consider the following minimal example:

File document.tex:

\documentclass{book}

\usepackage{hyperref}
\usepackage[hyperpageref]{backref}  % Doesn't work with [hyperref] either
\bibliographystyle{unsrt}

\begin{document}

Some citation \cite{newton1687philosophiae}.

\bibliography{document}

\end{document}

File document.bib:

@Book{newton1687philosophiae,
  Title                    = {Philosophiae naturalis principia mathematica},
  Author                   = {Newton, I.},
  Url                      = {https://goo.gl/1ahxL2},
  Year                     = {1687},
  Publisher                = {J. Societatis Regiae ac Typis J. Streater}
}

Compiling in PDF with

pdflatex document
bibtex document
pdflatex document
pdflatex document

generates a backref in the [1] entry of the bibliography pointing to where the entry was cited.

Compiling in epub with

tex4ebook -st -f epub -m draft document
bibtex document
tex4ebook -st -f epub document

yields the backref with a page number (surprisingly), but the hyperlink points to an invalid target, namely "#page.1". For large documents, these appear as "pages" that seem to correspond to the PDF pages. I'd suggest replacing them with section numbers (even if hyperpageref is used) or a simple increasing counter for ebooks.

The paragraph with the citation is the following:

<p class="bibitem" ><span class="biblabel">
 [1]<span class="bibsp">   </span></span><a 
 id="Xnewton1687philosophiae"></a>I. Newton.  <span 
class="cmti-10">Philosophiae naturalis principia mathematica</span>.  J. Societatis
   Regiae ac Typis J. Streater, 1687. <a 
href="#page.1">1</a>
</p>

This can be fixed manually by adding <a id="page.1"></a> next to the citation [1] in document.html and changing the href in the bibliography to <a href="document.html#page.1">1</a>.
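To check whether other backref links are affected, something like the following could list the dangling targets. It is only a rough sketch: `dangling_fragments` is a hypothetical helper name, not part of tex4ebook or make4ht, and it assumes each `href="#..."` and `id="..."` attribute sits on a single line, as in the output above. It checks one file at a time, so links written as `file.html#target` are not covered.

```shell
# Hypothetical helper (not part of tex4ebook): print every same-file link
# fragment (href="#...") that has no matching id="..." in the given file.
dangling_fragments() {
  file="$1"
  # Collect the unique fragment names used in same-file links.
  grep -o 'href="#[^"]*"' "$file" | sed 's/^href="#//; s/"$//' | sort -u |
  while IFS= read -r frag; do
    # Fixed-string match, so the dot in "page.1" is taken literally.
    grep -qF "id=\"$frag\"" "$file" || printf '%s\n' "$frag"
  done
}

# Example use on the generated files (file names depend on the build):
#   for f in *.html; do dangling_fragments "$f"; done
```

On the bibliography markup shown above, this should report `page.1`, since no element in that file carries that id.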

michal-h21 commented 2 years ago

Sorry for the late reply. I totally forgot about this issue and found it again thanks to a reminder from another issue.

The problem is that there are no pages in HTML files (an Epub can have page numbers, but these differ from the page numbers used by Backref). So Backref inserts links that point back to non-existent destinations in the HTML.

We can patch Backref to insert a link destination at each \cite command and link to these destinations from the bibliography. Here is a backref.4ht file that does that:

% patch command that inserts backlink destinations
\pend:defI\Hy@backout{%
  % prevent duplicate backlink on the same page
  \ifcsname bk##1\thepage\endcsname\else%
    % insert link to the page
    \html:addr\Link-{}{X\last:haddr}\EndLink%
    % save link to the .xref file
    \Tag{)Q##1\thepage}{X\last:haddr}%
    % we need to save the link destination in .xref file
    % too, otherwise \Link command would issue warning
    \Tag{)QX\last:haddr}{\FileNumber}%
  \fi
  % declare this backlink destination as used, so we don't 
  % declare another one with the same name
  \expandafter\def\csname bk##1\thepage\endcsname{}%
}

% redefine macro that puts out backlinks
\def\:tempa#1#2#3{%
  % test if we saved link to the current bibitem and page
  \ifTag{)Q\current:back:desc#1}{%
      \Link{\LikeRef{)Q\current:back:desc#1}}{}#1\EndLink%
  }% 
  {#1}% print just page number if there is no saved link
}%
\HLet\backrefxxx\:tempa

% save current bibkey for use in \backrefxxx
\pend:defI\BR@backref{\def\current:back:desc{##1}}