sagemath / sage

Main repository of SageMath
https://www.sagemath.org
Other
1.35k stars 462 forks source link

Some citations are missing from pdf docs #28059

Closed kiwifb closed 5 years ago

kiwifb commented 5 years ago

As reported in https://github.com/cschwan/sage-on-gentoo/issues/543 in 8.8.rc3 some citations are in missing in the generated pdf documentation. A clear example is en/reference/tensor_free_modules/tensor_free_modules.pdf where there are citations missing on the first page of chapter one.

CC: @strogdon

Component: documentation

Author: John Palmieri

Branch/Commit: 67bfb85

Reviewer: François Bissey

Issue created by migration from https://trac.sagemath.org/ticket/28059

kiwifb commented 5 years ago

Attachment: citations.png

kiwifb commented 5 years ago
comment:1

Example

jhpalmieri commented 5 years ago
comment:2

I suppose it must be related to the Sphinx upgrade #26451. I don't think this problem was present in Sage 8.7.

fchapoton commented 5 years ago
comment:3

just to be sure, the same in python2 and python3 ?

kiwifb commented 5 years ago
comment:4

Yes, I suspect the sphinx upgrade had something to do with it. That's the most likely.

Python2 vs python3: I haven't tried yet with python3. I am a bit busy and my first port of call was eliminating texlive2017 vs texlive2019.

dimpase commented 5 years ago
comment:5

I have also noticed that "proper" in-file REFERENCES: sections such as

REFERENCES:

.. [BvL84] \A. Brouwer, J. van Lint,
   Strongly regular graphs and partial geometries,
   Enumeration and design,
   (Waterloo, Ont., 1982) (1984): 85-122.
   http://oai.cwi.nl/oai/asset/1817/1817A.pdf

in graphs/strongly_regular_db.pyx come out as just REFERENCES:, without any content. On the other hand, "improper" (to be fixed, I suppose) in-file references such as

        REFERENCES:

        - [1] http://mathworld.wolfram.com/MatrixTreeTheorem.html

        - [2] Lih-Hsing Hsu, Cheng-Kuan Lin, "Graph Theory and
          Interconnection Networks"

in graphs/generic_graph.py do come out with the proper content.

egourgoulhon commented 5 years ago
comment:6

Replying to @jhpalmieri:

I suppose it must be related to the Sphinx upgrade #26451. I don't think this problem was present in Sage 8.7.

I confirm: with Sage 8.7, running

./sage -docbuild reference/tensor_free_modules pdf

yields a pdf file with the references properly set. It fails with Sage 8.8.beta7 and later versions.

strogdon commented 5 years ago
comment:7

Replying to @egourgoulhon:

Replying to @jhpalmieri:

I suppose it must be related to the Sphinx upgrade #26451. I don't think this problem was present in Sage 8.7.

I confirm: with Sage 8.7, running

./sage -docbuild reference/tensor_free_modules pdf

yields a pdf file with the references properly set. It fails with Sage 8.8.beta7 and later versions.

The Sphinx upgrade was added in 8.8.beta1

dimpase commented 5 years ago
comment:8

I'm adding the branch of #26451 to 8.7 and will try to compare LaTeX files generated by 8.7 with the one from the update.

dimpase commented 5 years ago
comment:9

oops, the problem is that 8.7 doesn't keep generated LaTeX files :-(

So the question is how to make them persist in 8.7.

jhpalmieri commented 5 years ago
comment:10

I see lots of files in local/share/doc/sage/latex/..., for example latex/en/reference/algebras/algebras.tex. If you don't, maybe try sage --docbuild reference latex?

jhpalmieri commented 5 years ago
comment:11

Are there actually active references in the Sage 8.7 pdf files? I don't see the question marks, but instead I just see plain text, not a link to a reference file or to references elsewhere in the same file. So maybe the new Sphinx is actually trying to resolve the links when the old one wasn't, and we have to figure out how to tell it where to find them.

dimpase commented 5 years ago
comment:12

after the sphinx update, one sees lots of

LaTeX Warning: Citation `../references/index:lee2011' on page 8 undefined on in
put line 605.

(one would rather have it getting `../references/references:lee2011')

Probably index comes from it being generated from index.rst.

strogdon commented 5 years ago
comment:13

Replying to @dimpase:

after the sphinx update, one sees lots of

LaTeX Warning: Citation `../references/index:lee2011' on page 8 undefined on in
put line 605.

(one would rather have it getting `../references/references:lee2011')

Probably index comes from it being generated from index.rst.

But the sphinx generated file latex/en/reference/references/references.tex has the bibitem

\bibitem[Lee2011]{index:lee2011}
J. M. Lee, \sphinxstyleemphasis{Introduction to Topological Manifolds}, 2nd ed.,
Springer (New York) (2011); \sphinxhref{https://doi.org/10.1007/978-1-4419-7940-7}{doi:10.1007/978-1-4419-7940-7}

Would things work if index were change to references?

dimpase commented 5 years ago
comment:14

they are hooked up via \sphinxcite, which is an alias to \cite. Which is meant to hook up citations from a bibfile, not from a (La)TeX file.

Moreover, \sphinxcite is supposed to be only introduced in sphinx 1.8.3, whereas we already used it in sphinx 1.7.3, for some reason...

dimpase commented 5 years ago
comment:15

It appears that \sphinxcite is just an alias to \cite, and thus its argument is treated in a normal way, i.e. one would need \bibitem[Lee2011]{index:lee2011} to become

\bibitem[Lee2011]{../references/index:lee2011}

in addition to the knowledge about the location of this bibitem to be properly passed. (something what in "normal" latex would be in a parameter to \bibliography{}.

https://tex.stackexchange.com/questions/497566/meaning-of-cite-foo-barbaz-in-sphinx-generated-latex

strogdon commented 5 years ago
comment:16

I know now to fix things from the Latex side, but how to make Sphinx work is unknown. I will focus on latex/en/reference/tensor_free_modules/tensor_free_modules.tex. Make sure the preamble has the following inserted above the \usepackage{sphinx}.

\usepackage{xcite}
\usepackage{xr-hyper}
\externaldocument{../references/references}
\externalcitedocument{../references/references}
\usepackage{sphinx}

Delete all of the ../references/ from the \sphinxcite{} entries. After the deletions there should only be items like

\sphinxcite{index:god1968}
\sphinxcite{index:lan2002}
 ...

Now rebuild tensor_free_modules.pdf. Here all the labels appear after the citations in tensor_free_modules.pdf. I believe for this to work the stuff under ../references/ must be built. One caveat is that I see

LaTeX Warning: There were multiply-defined labels.

in the .log file but the citations in the pdf are in place.

jhpalmieri commented 5 years ago
comment:17

This would be a stopgap, but if we just want to reproduce the output from Sage 8.7, we need \sphinxcite{../references/index:mil1958} to return the plain text [Mil1958].

(This should only be done for the reference manual. As far as I can tell, references/citations work well in old and new Sage in the other parts of the documentation.)

jhpalmieri commented 5 years ago
comment:18

Replying to @strogdon:

I know now to fix things from the Latex side, but how to make Sphinx work is unknown. I will focus on latex/en/reference/tensor_free_modules/tensor_free_modules.tex. Make sure the preamble has the following inserted above the \usepackage{sphinx}.

\usepackage{xcite}
\usepackage{xr-hyper}
\externaldocument{../references/references}
\externalcitedocument{../references/references}
\usepackage{sphinx}

I think you can modify this and not do any changes in the sphinxcite{...} commands:

\usepackage{xcite}
\usepackage{xr-hyper}
\externaldocument[../references/]{../references/references}
\externalcitedocument[../references/]{../references/references}
% These lines must appear before \usepackage{hyperref}.

These commands take an optional argument which specifies the prefix used in their citations. Then \sphinxcite{../references/index:mil1958} will be converted to index:mil1958, which should then be looked up in the master bibliography file. This renders as plain text [Mil1958], but maybe that's the best you can hope for: PDF files shouldn't have live links to other PDF files, should they? Is that what we would want?

Maybe the best solution would be if Sphinx magically extracted the relevant citations and included only those in each PDF file, but I don't know how to do that. Maybe we can convert our master bibliography file into a BibTeX .bib file and use that?

jhpalmieri commented 5 years ago
comment:19

I think this accomplishes @strogdon's suggestion:

diff --git a/src/doc/en/reference/conf_sub.py b/src/doc/en/reference/conf_sub.py
index 76f6e12909..9344260f61 100644
--- a/src/doc/en/reference/conf_sub.py
+++ b/src/doc/en/reference/conf_sub.py
@@ -63,6 +63,18 @@ latex_documents = [
 ('index', name + '.tex', project, u'The Sage Development Team', 'manual')
 ]

+latex_elements['hyperref'] = r"""
+\usepackage{xcite}
+\usepackage{xr-hyper}
+\externaldocument[../references/]{../references/references}
+\externalcitedocument[../references/]{../references/references}
+% Include hyperref last.
+\usepackage{hyperref}
+% Fix anchor placement for figures with captions.
+\usepackage{hypcap}% it must be loaded after hyperref.
+% Set up styles of URL: it should be placed after hyperref.
+\urlstyle{same}"""
+
 #Ignore all .rst in the _sage subdirectory
 exclude_patterns = exclude_patterns + ['_sage']

by putting the commands in the preamble of each component of the reference manual, along with the appropriate prefix for each citation. (The extra lines come from Sphinx's default setting for the hyperref entry for latex_elements. We can't just put the proposed lines in latex_elements['preamble'], because that gets inserted after \usepackage{hyperref}, whereas these lines must come before \usepackage{hyperref}.)

This does not provide live links, though.

strogdon commented 5 years ago
comment:20

Replying to @jhpalmieri:

Replying to @strogdon:

I know now to fix things from the Latex side, but how to make Sphinx work is unknown. I will focus on latex/en/reference/tensor_free_modules/tensor_free_modules.tex. Make sure the preamble has the following inserted above the \usepackage{sphinx}.

\usepackage{xcite}
\usepackage{xr-hyper}
\externaldocument{../references/references}
\externalcitedocument{../references/references}
\usepackage{sphinx}

I think you can modify this and not do any changes in the sphinxcite{...} commands:

\usepackage{xcite}
\usepackage{xr-hyper}
\externaldocument[../references/]{../references/references}
\externalcitedocument[../references/]{../references/references}
% These lines must appear before \usepackage{hyperref}.

These commands take an optional argument which specifies the prefix used in their citations. Then \sphinxcite{../references/index:mil1958} will be converted to index:mil1958, which should then be looked up in the master bibliography file. This renders as plain text [Mil1958], but maybe that's the best you can hope for: PDF files shouldn't have live links to other PDF files, should they? Is that what we would want?

Maybe the best solution would be if Sphinx magically extracted the relevant citations and included only those in each PDF file, but I don't know how to do that. Maybe we can convert our master bibliography file into a BibTeX .bib file and use that?

Excellent! I may be incorrect, but I think LaTeX parses ../references/references.aux to get bibliographical info. I had thought about the BibTeX approach but that will require some significant work. If this is to work then four lines will have to be inserted into each .tex file by Sphinx. Sphinx auto-generates the files. But where?

jhpalmieri commented 5 years ago
comment:21

Replying to @strogdon:

Excellent! I may be incorrect, but I think LaTeX parses ../references/references.aux to get bibliographical info.

That file will contain the list of citations, but not actual authors, titles, etc.

I had thought about the BibTeX approach but that will require some significant work. If this is to work then four lines will have to be inserted into each .tex file by Sphinx. Sphinx auto-generates the files. But where?

The diff in comment:19 should insert those lines into each LaTeX file.

strogdon commented 5 years ago
comment:22

Replying to @jhpalmieri:

The diff in comment:19 should insert those lines into each LaTeX file.

I've tried this and it does work.

dimpase commented 5 years ago
comment:23

what I don’t understand is why we don’t get proper hyperlinks in pdf this way.

normally pdflatex would create them from \cite macros used with \hyperref package.

strogdon commented 5 years ago
comment:24

Replying to @dimpase:

what I don’t understand is why we don’t get proper hyperlinks in pdf this way.

normally pdflatex would create them from \cite macros used with \hyperref package.

I think this is because the citation is in one pdf and the actual bibliographic entry is in another pdf. I don't know if \hyperref can be made to work across pdf files.

dimpase commented 5 years ago
comment:25

hyperref can work across files, with \ref etc, but not with \cite, for some reason.

jhpalmieri commented 5 years ago
comment:26

By the way, I am still seeing some missing citations. For example when building the combinat part of the reference manual, citations from the graphs manual are missing.

Latexmk: Log file says output to 'combinat.pdf'
Latexmk: List of undefined refs and citations:
  Citation `../graphs/sage/graphs/graph_generators:gqwiki' on page 645 undefined on input line 55570
  Citation `../graphs/sage/graphs/graph_generators:pt09' on page 645 undefined on input line 55570
  Citation `../graphs/sage/graphs/strongly_regular_db:bvl84' on page 588 undefined on input line 50550
  Citation `../graphs/sage/graphs/strongly_regular_db:bvl84' on page 703 undefined on input line 60839
  Citation `../graphs/sage/graphs/strongly_regular_db:gs70' on page 1221 undefined on input line 105131
  Reference `index:module-sage.combinat' on page 3393 undefined on input line 294455
Latexmk: Summary of warnings from last run of (pdf)latex:
  Latex failed to resolve 1 reference(s)
  Latex failed to resolve 5 citation(s)
=== TeX engine is 'pdfTeX'

I think this is to be expected, given the proposed solution. The long term fix is #27497.

jhpalmieri commented 5 years ago
comment:27

A untested shorter term fix is to add more \externalcitedocument[...]{...} commands.

dimpase commented 5 years ago
comment:28

It's probably best to go with #27497 - it's the easiest. I also think we should indeed convert the thing to bibtex

(who was the genius who thought that free from meta-information format for citations is OK :-))

jhpalmieri commented 5 years ago
comment:29

Replying to @jhpalmieri:

A untested shorter term fix is to add more \externalcitedocument[...]{...} commands.

I think a problem with this is that combinat.tex will want to access graphs.aux, which won't be ready until graphs.tex is compiled, while graphs.tex will want to access combinat.aux. So there is no good order to process these.

Regarding #27497: it's going to be a lot tedious work to do this (speaking from experience, since I have moved most of Sage's other references to the master bibliography file). I don't want to do it. Fortunately, for the purposes of this ticket, we only need to move a few references: the ones in combinat that are used elsewhere and the ones in graphs that are used elsewhere.

Converting to bibtex will also be a lot of work. A program like bibweb might help a bit, but who wants to convert all of the current references to bibtex format? I'm certainly not going to stand in anyone's way, but the task is almost too tedious to even consider.

Oh, there is also sphinxcontrib-bibtex, but note this issue.

strogdon commented 5 years ago
comment:30

Replying to @jhpalmieri:

By the way, I am still seeing some missing citations. For example when building the combinat part of the reference manual, citations from the graphs manual are missing.

Latexmk: Log file says output to 'combinat.pdf'
Latexmk: List of undefined refs and citations:
  Citation `../graphs/sage/graphs/graph_generators:gqwiki' on page 645 undefined on input line 55570
  Citation `../graphs/sage/graphs/graph_generators:pt09' on page 645 undefined on input line 55570
  Citation `../graphs/sage/graphs/strongly_regular_db:bvl84' on page 588 undefined on input line 50550
  Citation `../graphs/sage/graphs/strongly_regular_db:bvl84' on page 703 undefined on input line 60839
  Citation `../graphs/sage/graphs/strongly_regular_db:gs70' on page 1221 undefined on input line 105131
  Reference `index:module-sage.combinat' on page 3393 undefined on input line 294455
Latexmk: Summary of warnings from last run of (pdf)latex:
  Latex failed to resolve 1 reference(s)
  Latex failed to resolve 5 citation(s)
=== TeX engine is 'pdfTeX'

I think this is to be expected, given the proposed solution. The long term fix is #27497.

On a sage-on-gentoo build I see things like

(/usr/share/texmf-dist/tex/latex/xcite/xcite.sty
Package: xcite 2011/09/02 v1.0 eXternal Citations (EG)
)
(/usr/share/texmf-dist/tex/latex/hyperref/xr-hyper.sty
Package: xr-hyper 2000/03/22 v6.00beta4 eXternal References (DPC)
)

Package xr Warning:
No file ../references/references.aux
LABELS NOT IMPORTED.
 on input line 35.

Package xc Warning:
No file ../references/references.aux
LABELS NOT IMPORTED.
 on input line 35.

I don't see any reason why this couldn't happen with vanilla sage. For the proposed fix to work it is necessary that the stuff under ../references/* be built and I don't see any way to insure that, especially if things are built in parallel.

jhpalmieri commented 5 years ago
comment:31

I think it should be possible to make en/reference/references a special case when building the reference manual, to build it first.

embray commented 5 years ago
comment:32

Moving open critical and blocker issues to the next release milestone (optimistically).

jhpalmieri commented 5 years ago
comment:33

Build the bibliography first to make sure that the aux file is present:

diff --git a/src/sage_setup/docbuild/__init__.py b/src/sage_setup/docbuild/__init__.py
index 2fa23276ba..f4cca072e2 100644
--- a/src/sage_setup/docbuild/__init__.py
+++ b/src/sage_setup/docbuild/__init__.py
@@ -547,6 +547,9 @@ class ReferenceBuilder(AllBuilder):
                 continue
             output_dir = self._output_dir(format, lang)
             L = [(doc, lang, format, kwds) + args for doc in self.get_all_documents(refdir)]
+            if format == 'pdf' and lang == 'en':
+                logger.warning('Building bibliography')
+                getattr(ReferenceSubBuilder('reference/references', 'en'), 'pdf')(*args, **kwds)
             build_many(build_ref_doc, L)
             # The html refman must be build at the end to ensure correct
             # merging of indexes and inventories.
jhpalmieri commented 5 years ago

Branch: u/jhpalmieri/doc-pdf-citations

jhpalmieri commented 5 years ago

Commit: 67bfb85

jhpalmieri commented 5 years ago

Author: John Palmieri

jhpalmieri commented 5 years ago
comment:35

In principle this is ready for review, but it has some dependencies: some references need to be moved to the master bibliography file. #28084 would do it, but I don't think that all of the references in sage/graphs need to be modified, just those which are referenced elsewhere. I'm not going to mark it as "needs review" until I know which ticket to list as a dependency.


New commits:

67bfb85trac 28059: fix citations in PDF version of reference manual.
vbraun commented 5 years ago
comment:36

Any progress here? Maybe this ticket can already be merged even if other references still need to me moved?

kiwifb commented 5 years ago
comment:37

Replying to @vbraun:

Any progress here? Maybe this ticket can already be merged even if other references still need to me moved?

I will do a test run to check it doesn't break any other doc. If it doesn't, lets merge it.

kiwifb commented 5 years ago
comment:38

Nothing is obviously broken. There may be references missing in some other places.

kiwifb commented 5 years ago

Reviewer: François Bissey

vbraun commented 5 years ago

Changed branch from u/jhpalmieri/doc-pdf-citations to 67bfb85