tectonic-typesetting / tectonic

A modernized, complete, self-contained TeX/LaTeX engine, powered by XeTeX and TeXLive.
https://tectonic-typesetting.github.io/

pdfpages cannot include paths with a `.` #50

Closed ninewise closed 7 years ago

ninewise commented 7 years ago

I'm using the pdfpages package to include other pdf files. It seems like Tectonic can't handle paths with a . or .. in them, while pdflatex can. Quite minimal example:

some.tex:

\documentclass[]{article}
\begin{document}
test
\end{document}

test.tex:

\documentclass[]{article}
\usepackage{pdfpages}
\begin{document}
\includepdf[pages={-}]{./some.pdf}
\end{document}

In this example, some.pdf is in the same directory as test.tex. Note that using the path some.pdf instead of ./some.pdf works. Paths containing .. fail with a similar exception.

Exception:

Running TeX ...
error: something bad happened inside TeX; its output follows:

===============================================================================
(test.tex
LaTeX2e <2016/03/31>
Babel <3.9r> and hyphenation patterns for 83 language(s) loaded.
(article.cls
Document Class: article 2014/09/29 v1.4h Standard LaTeX document class
(size10.clo)) (pdfpages.sty (ifthen.sty) (calc.sty) (eso-pic.sty (atbegshi.sty
(infwarerr.sty) (ltxcmds.sty) (ifpdf.sty)) (keyval.sty) (xcolor.sty (color.cfg)
(xetex.def))) (graphicx.sty (graphics.sty (trig.sty) (graphics.cfg)))
(ppxetex.def)) (test.aux) (pdflscape.sty (lscape.sty) (ifxetex.sty))
! Unable to load picture or PDF file './some.pdf'.
<to be read again> 
                   }
l.6 \includepdf[pages={-}]{./some.pdf}

No pages of output.
Transcript written on test.log.
===============================================================================

error: halted on potentially-recoverable error as specified

pkgw commented 7 years ago

Ah, whoa! I didn't think this would be a problem but it seems that it is. Here's my test case:

\documentclass{article}
\usepackage{graphicx}
\begin{document}
\includegraphics{./TEMP.pdf}
\end{document}

Where TEMP.pdf is a random PDF I had lying around.

I'd think that such a path would get passed directly to the FilesystemIo backend where it ought to work to open it, so this will take some digging.

pkgw commented 7 years ago

The solution to this may be identical to the solution to #31.

rekka commented 7 years ago

Wow, I learned a lot about TeX trying to figure this out. Solution is at the end, but I wanted to share my struggle :)

So it turns out that the correct file name is passed to tectonic, and tectonic has no problem opening it. However, the LaTeX code in graphics.sty selects the wrong file type (not a PDF image) and emits the \XeTeXpicfile command instead of the \XeTeXpdffile command (via a rule in xetex.def). It seems to misidentify the extension as ./TEMP.pdf instead of .pdf. In fact, it tries to open a file named .bb. If one surrounds the file name in braces, as in

\includegraphics{{./TEMP}.pdf}

everything works fine.

If I use xelatex directly, there are no issues with either version.

I was trying to find a problem in the latex.ltx, graphics.sty and xetex.def files in the bundle that tectonic is using since the files are slightly outdated, but I couldn't see any obvious problem in the diffs with the versions xelatex is using that would explain this.

The fact that .bb is accessed indicates that \Gin@base (the file basename) in graphics.sty is identified as the empty string. Why that does not happen when xelatex is run... I did not know. Specifically, the \Gin@base and \Gin@ext seem to be misidentified in the following code in graphics.sty:

\def\Ginclude@graphics#1{%
  \begingroup
  \let\input@path\Ginput@path
  \filename@parse{#1}%
  \ifx\filename@ext\relax
    \@for\Gin@temp:=\Gin@extensions\do{%
      \ifx\Gin@ext\relax
        \Gin@getbase\Gin@temp
      \fi}%
  \else
    \Gin@getbase{\Gin@sepdefault\filename@ext}%
    \ifx\Gin@ext\relax
       \@warning{File `#1' not found}%
       \def\Gin@base{\filename@area\filename@base}%
       \edef\Gin@ext{\Gin@sepdefault\filename@ext}%
    \fi
  \fi
...

\filename@parse is a macro in latex.ltx and its definition starts as

 \ifx\filename@parse\@undefined
  \def\reserved@a{./}\ifx\@currdir\reserved@a
    \typeout{^^JDefining UNIX/DOS style filename parser.^^J}
    \def\filename@parse#1{%
      \let\filename@area\@empty
      \expandafter\filename@path#1/\\}
    \def\filename@path#1/#2\\{%
      \ifx\\#2\\%
         \def\reserved@a{\filename@simple#1.\\}%
      \else
         \edef\filename@area{\filename@area#1/}%
         \def\reserved@a{\filename@path#2\\}%
      \fi
      \reserved@a}
  \else\def\reserved@a{[]}\ifx\@currdir\reserved@a
...

So it first checks if \@currdir is defined as ./. Why does that matter? \@currdir holds the syntax for the current working directory, determined while setting up the latex.ltx format file, and it is defined using this pretty hack:

  \IfFileExists{./texsys.aux}{\gdef\@currdir{./}}%
    {\IfFileExists{[]texsys.aux}{\gdef\@currdir{[]}}%
      {\IfFileExists{:texsys.aux}{\gdef\@currdir{:}}{}}}
  \ifx\@currdir\@undefined
    \global\let\@currdir\@empty
    \typeout{^^J^^J%
      !! No syntax for the current directory could be found^^J%
      }%
  \fi

So there it is, LaTeX checks if file ./texsys.aux exists to determine if we are running on Unix/DOS! (Note: What systems do use [] and : in their paths???)

Since MemoryIo does not do path normalization, it does not open the existing file texsys.aux when requested to open ./texsys.aux, and we have a problem.
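To make the chain of events above concrete, here is a rough Rust model of it. This is only a sketch: `detect_currdir`, `split_simple` and `parse_filename` are hypothetical names, not LaTeX or Tectonic APIs, and the fallback branch assumes that without a recognized \@currdir the whole name is split at its first dot, which is consistent with the empty base and the ./TEMP.pdf extension observed above. The [] and : variants are not modeled.

```rust
/// Mirror of the \IfFileExists probes: returns the current-directory syntax,
/// or "" when none of the probes finds texsys.aux.
fn detect_currdir(exists: impl Fn(&str) -> bool) -> &'static str {
    for (probe, syntax) in [("./texsys.aux", "./"), ("[]texsys.aux", "[]"), (":texsys.aux", ":")] {
        if exists(probe) {
            return syntax;
        }
    }
    ""
}

/// Split a name at its first dot into (base, extension without the dot).
fn split_simple(name: &str) -> (String, Option<String>) {
    match name.split_once('.') {
        Some((base, ext)) => (base.to_owned(), Some(ext.to_owned())),
        None => (name.to_owned(), None),
    }
}

/// Simplified model of \filename@parse; returns (area, base, ext).
/// Only the "./" parser peels off the directory part before splitting.
fn parse_filename(path: &str, currdir: &str) -> (String, String, Option<String>) {
    let (area, last) = if currdir == "./" {
        match path.rfind('/') {
            // Everything up to and including the last '/' is the area.
            Some(i) => path.split_at(i + 1),
            None => ("", path),
        }
    } else {
        // Assumed fallback: no directory handling at all.
        ("", path)
    };
    let (base, ext) = split_simple(last);
    (area.to_owned(), base, ext)
}
```

With a store that only matches names exactly (`|p| p == "texsys.aux"`), `detect_currdir` returns the empty string, and `parse_filename("./TEMP.pdf", "")` then yields an empty base and the extension `/TEMP.pdf`. Once the separator dot is prepended, that is the bogus `./TEMP.pdf` extension, and the empty base explains the attempt to open `.bb`.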


So the solution is to implement proper path normalization in MemoryIo.

In fact, the following hack works:

  1. insert the following code into memory.rs:
        // Hack: map the probe path that LaTeX uses onto the file that actually exists.
        let name = if name == OsStr::new("./texsys.aux") {
            OsStr::new("texsys.aux")
        } else {
            name
        };
  2. remove tectonic caches,
  3. rerun.

pkgw commented 7 years ago

Very impressive detective work!

I think I prefer a more general change to have MemoryIo strip off ./ file path prefixes, but I haven't thought about this carefully. Something like (pseudo-code):

while name.starts_with("./") {
   name = name.substring(2);
}
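In real Rust, the pseudo-code above could look roughly like the sketch below. It assumes the name is already available as a `&str` holding a Unix-style path; `strip_curdir_prefixes` is a hypothetical helper name, not an existing Tectonic function. Collapsing extra slashes after each `./` is an added assumption, so that a name like `.//x` does not turn into an absolute path.

```rust
/// Strip any number of leading "./" segments from a Unix-style path.
/// Names without such a prefix (including "../x" and ".hidden") are
/// returned unchanged.
fn strip_curdir_prefixes(mut name: &str) -> &str {
    while let Some(rest) = name.strip_prefix("./") {
        // Also drop redundant slashes so ".//x" becomes "x", not "/x".
        name = rest.trim_start_matches('/');
    }
    name
}
```

For example, `strip_curdir_prefixes("./texsys.aux")` gives `"texsys.aux"`, while `"../texsys.aux"` is left alone because it starts with `..` rather than `./`.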

Note that one policy that I want to clarify is that Tectonic will always use Unix-style path names when parsing paths in TeX documents, even when running on Windows — you don't want the interpretation of the document to be platform-dependent.

rekka commented 7 years ago

Thank you for the clarification about the path policy. In that case, tectonic should probably not internally store paths (for MemoryIo, say) using OsString since that is OS dependent, and it also should not use Path/PathBuf to manipulate these paths. Instead, it might be better to just use Strings since paths inside XeTeX files can only be valid UTF-8, I believe... Hm, I actually do not know how XeTeX handles paths containing non-ASCII chars. Or use Vec<u8> since that is how XeTeX sees them.

In any case, OsString cannot be directly manipulated, it must be converted into something else first anyway.
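As a rough illustration of the String-keyed idea, here is a minimal sketch of an in-memory store whose keys are plain Unix-style strings. `StringKeyedStore` and its methods are hypothetical and are not Tectonic's actual MemoryIo. The sketch assumes names are valid UTF-8, which is exactly the open question raised above, so a `Vec<u8>` key might turn out to be the better fit.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory file store keyed by Unix-style path strings,
/// independent of the host platform's path conventions.
#[derive(Default)]
struct StringKeyedStore {
    files: HashMap<String, Vec<u8>>,
}

impl StringKeyedStore {
    /// Strip leading "./" segments so "./a" and "a" map to the same key.
    fn normalize(name: &str) -> &str {
        let mut rest = name;
        while let Some(stripped) = rest.strip_prefix("./") {
            rest = stripped;
        }
        rest
    }

    /// Store file contents under the normalized name.
    fn insert(&mut self, name: &str, data: Vec<u8>) {
        self.files.insert(Self::normalize(name).to_owned(), data);
    }

    /// Look up file contents, normalizing the requested name first.
    fn get(&self, name: &str) -> Option<&[u8]> {
        self.files.get(Self::normalize(name)).map(|v| v.as_slice())
    }
}
```

Because both `insert` and `get` normalize the name, a file stored as `texsys.aux` is found when `./texsys.aux` is requested, and no OsString conversion is needed along the way.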