fchauvel / flap

Flattening LaTeX projects
https://fchauvel.github.io/flap
GNU General Public License v3.0

Invalid substitutions performed within user-defined commands #16

Open fchauvel opened 8 years ago

fchauvel commented 8 years ago

See initial comments from Issue 15 about how FLaP fails when an input directive is used within a user-defined command.

philipptempel commented 8 years ago

It makes sense you closed my other issue since it's unrelated to the problem discussed here ;)

I was just looking through flap's code, trying to pin down where the magic happens, i.e., where the \input and \include directives are actually expanded, but I couldn't find anything.

In general, I think there are two ways to solve the issue (maybe more, but I came up with only these two):

1) Read every line of the document preamble and identify whether an \input or \include sits inside a macro definition. Of course, this is cumbersome as there are many ways to define a new macro (e.g., \newcommand, \renewcommand, \NewDocumentCommand, \NewRobustCommand, to mention only a few).

2) If flap finds an \input or the like, it should check whether its argument contains parameter placeholders like #1 or #2. I cannot think of a situation where a regular call to an include macro would be used with #1, especially within the actual document, i.e., between \begin{document} and \end{document}.
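The second option could be sketched roughly as follows. This is only an illustration, not flap's actual code: the names `INCLUDE`, `PLACEHOLDER` and `classify_includes` are made up here, and the regular expression assumes the argument contains no nested braces.

```python
import re

# Matches \input{...} or \include{...} and captures the braced argument.
# Assumption: the argument itself contains no nested braces.
INCLUDE = re.compile(r"\\(input|include)\s*\{([^{}]*)\}")

# A macro parameter placeholder such as #1 or #2.
PLACEHOLDER = re.compile(r"#[1-9]")


def classify_includes(source):
    """Return (command, argument, is_parametric) for each include directive."""
    found = []
    for match in INCLUDE.finditer(source):
        command, argument = match.group(1), match.group(2)
        # A placeholder in the argument suggests we are inside a macro body.
        found.append((command, argument, bool(PLACEHOLDER.search(argument))))
    return found
```

A directive flagged as parametric would then be skipped rather than expanded, which avoids the invalid substitution but leaves the macro call itself untouched.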

Of course, neither of these solutions covers my use case of:

\DeclareDocumentCommand{\includetikz}{O{0.75\linewidth} O{0.5625\linewidth} m}{%
    \setlength{\figurewidth}{#1}%
    \setlength{\figureheight}{#2}%
    \input{#3.tikz}%
}

However, it will luckily not generate an error, thus allowing the document to be flattened.

The necessity of my command is debatable, as I could just create my TikZ files with a fixed width and height rather than setting them in my document. But this way I have more (easier/quicker) control over the resulting figure size and I don't need to write \setlength{...} for every single TikZ include.

fchauvel commented 8 years ago

Sorry for this late reply.

In this case, we must not only detect the macro but also expand it, so that \includetikz{figure.tikz} becomes:

\setlength{\figurewidth}{0.75\linewidth}%
\setlength{\figureheight}{0.5625\linewidth}%
% Here goes the content of 'figure.tikz'

Am I right? Is this the behaviour you have in mind?

The challenge is that I do not see how to detect the body of a macro using a regular expression, and I am becoming convinced that this requires a proper TeX/LaTeX parser: one that knows whether or not it has entered the definition of a macro, and how many groups ('{ ... }') it has met so far. Honestly, I am a bit reluctant to engage in such a development as I have little time these days.
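To illustrate the group counting mentioned above, here is a minimal sketch of reading one balanced group. It is not taken from FLaP; `read_group` is a hypothetical helper, and it ignores TeX comments and category-code changes, which a real parser would have to handle.

```python
def read_group(text, start):
    """Return (body, end) for the balanced {...} group opening at text[start].

    'end' is the index just past the closing brace.
    """
    if text[start] != "{":
        raise ValueError("expected '{' at position %d" % start)
    depth = 0
    position = start
    while position < len(text):
        char = text[position]
        if char == "\\":
            # Skip the escaped character, so \{ and \} do not count as braces.
            position += 2
            continue
        if char == "{":
            depth += 1
        elif char == "}":
            depth -= 1
            if depth == 0:
                return text[start + 1:position], position + 1
        position += 1
    raise ValueError("unbalanced group")
```

Nested groups are exactly what a single regular expression struggles with, whereas the depth counter handles them directly.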

I keep thinking about possible workarounds ...

philipptempel commented 8 years ago

Same here, excuse the late response ;)

You are correct in guessing the expected behavior. The command/macro

\includetikz{path/to/figure}

should be expanded to

\setlength{\figurewidth}{0.75\linewidth}%
\setlength{\figureheight}{0.5625\linewidth}%
\input{path/to/figure.tikz}%

I, too, think it won't be easy and may require some fancy regexp. However, knowing that Markdown parsers exist, I think it shouldn't be impossible to do this with a regexp (at least to match the macro's body; replacing its arguments might be more challenging). It would be difficult anyway to match all possible macro definitions (like \newcommand or \DeclareDocumentCommand, to name just a few). Additionally, I have extended the macro to support optional arguments (using xparse), which allow for a custom \figurewidth and \figureheight, making everything even more complex...

Other tools (like flatex or flatten) also do not seem to support this behavior. It would thus be a nice thing to have and would also boost the popularity of flap. Of course, it must be weighed against the implementation effort needed for safe and sound flattening.

Am I the only one having this kind of file that I'm throwing at flap hoping it works? Is it "un-latexy" to have such macros defined? :astonished:

fchauvel commented 8 years ago

I have actually started experimenting with a proper LaTeX parser, and it turns out to be simpler than I initially thought. It also solves other issues, like extracting arguments that are groups (i.e., {...}) for instance, or LaTeX commands that are within a verbatim environment.

I have just started integrating this parser into FLaP, and I am moving slowly towards a complete new version. I will then add new features, including the proper handling of user-defined macros.

In my view, such macros make sense as they avoid duplication. Isn't that the whole point of macros? FLaP has, I suppose, too few users in the first place. :wink:

philipptempel commented 7 years ago

How is progress coming along on this issue? I am in the midst of creating a proper LaTeX template for our students at the department and started off with modularized code so that anyone can reuse the files separately in their own projects (e.g., to include all math packages or to include all citation packages).

Right now, the most recent version of flap installed (v0.4.1) bails out with Error: 'str' object has no attribute 'parts', so I cannot check whether it deals correctly with my use case.

fchauvel commented 7 years ago

By the end of the week, I should hopefully release version 0.5, which features a proper LaTeX parser. I am currently testing it. I will then work on substitutions within user-defined macros, and include this in the next version (0.5.1).

fchauvel commented 7 years ago

I have implemented a naive detection of the macros that must be expanded (see the associated test case). It only works for \def so far. For instance, provided that the file extras/test.tex contains blabla:

\def\myinput#1{File: \input{#1}}
\myinput{extras/test}

will be expanded as:

\def\myinput#1{File: \input{#1}}
File: blabla
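The expansion shown above can be approximated by the following sketch. It is an illustration under simplifying assumptions, not FLaP's implementation: `resolve` and `expand` are hypothetical names, file contents come from a plain dictionary, and the ".tex" rule is a simplified version of what TeX does.

```python
import os
import re


def resolve(name):
    """Append '.tex' when the name has no extension (simplified TeX rule)."""
    return name if os.path.splitext(name)[1] else name + ".tex"


def expand(body, arguments, files):
    """Substitute #1..#9 in a macro body, then inline any input directive."""
    # Step 1: replace each placeholder with the matching call argument.
    substituted = re.sub(
        r"#([1-9])", lambda m: arguments[int(m.group(1)) - 1], body)
    # Step 2: replace each \input{...} with the content of the named file.
    return re.sub(
        r"\\input\{([^{}]*)\}", lambda m: files[resolve(m.group(1))], substituted)
```

For example, `expand(r"File: \input{#1}", ["extras/test"], {"extras/test.tex": "blabla"})` yields `File: blabla`, matching the output above.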

Unfortunately, I have just come to realise (it's never too late) that the general case may be undecidable, as the argument given to \input can be transformed arbitrarily within the user-defined macro. In your example, you only added the .tikz extension, but one could, for instance, append the current date. While I can detect and copy the file actually included, I suspect that it is simply not possible to work out how to adjust the argument given at the outset.

philipptempel commented 7 years ago

I will have a look at the most recent version later today - as a paper submission deadline is coming close I am very interested in having a one-file submission (except for includegraphics-files).

Your point with respect to how macros are in fact expanding their arguments is valid. My use case is probably the most simple one and can be handled by flap. Others sure cannot be handled that straightforwardly.

fchauvel commented 7 years ago

No problem: I am abroad myself this week, with less time for programming.

By the way, which macro definition commands would you need most? I have implemented \def, and others such as \newcommand will now require a similar effort.

philipptempel commented 7 years ago

It took me a while to respond, as I had some weird constellation of python, pip, and flap versions scattered all over my system. Now that I have figured that out, here's my output from flap cable-dynamics.tex output:

FLaP 0.5.0
Traceback (most recent call last):
  File "/Users/philipp/Library/Python/3.4/bin/flap", line 9, in <module>
    load_entry_point('FLaP==0.5.0', 'console_scripts', 'flap')()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/ui.py", line 102, in main
    Controller(OSFileSystem(), Display(sys.stdout, verbose)).run(tex_file, output)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/ui.py", line 43, in run
    request.execute()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 79, in execute
    flattened = self._rewrite(self.read_root_tex, str(self.root_tex_file.resource()))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 88, in _rewrite
    return parser.rewrite()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 78, in rewrite
    result += self._rewrite_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 86, in _rewrite_one
    return self._evaluate_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 138, in _evaluate_one
    return self.evaluate_command(str(self._next_token))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 164, in evaluate_command
    return macro.invoke(self)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 124, in invoke
    return self._execute(parser, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 207, in _execute
    self._flap.relocate_dependency(class_name, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 109, in relocate_dependency
    self._rewrite(file.content(), file.fullname(), symbol_table)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/util/oofs.py", line 50, in content
    self._content = self.fileSystem.load(self._path)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/util/oofs.py", line 211, in load
    return file.read()
  File "/opt/local/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/codecs.py", line 319, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe8 in position 38214: invalid continuation byte
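Byte 0xe8 is "è" in Latin-1, which suggests that one of the files being read is not UTF-8 encoded. A possible workaround, sketched below, is to try a list of encodings in turn. This is an assumption about how the problem could be handled, not what FLaP does; `decode_with_fallback` is a hypothetical helper.

```python
def decode_with_fallback(data, encodings=("utf-8", "latin-1")):
    """Decode bytes with the first encoding that accepts them."""
    for encoding in encodings:
        try:
            return data.decode(encoding)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    raise ValueError("none of the encodings %r could decode the data" % (encodings,))
```

Since Latin-1 maps every byte to a character, it never fails and works as a last resort, although it may produce wrong characters for files in yet another encoding.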

My project file is based on svmult, which is the only file that is being processed using the above command. This also seems to be the problem. Replacing \documentclass{svmult} with \documentclass{scrartcl} lets flap run through up until \graphicspath{{figures/}{images}}, which fails with the following error:

Traceback (most recent call last):
  File "/Users/philipp/Library/Python/3.4/bin/flap", line 9, in <module>
    load_entry_point('FLaP==0.5.0', 'console_scripts', 'flap')()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/ui.py", line 102, in main
    Controller(OSFileSystem(), Display(sys.stdout, verbose)).run(tex_file, output)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/ui.py", line 43, in run
    request.execute()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 79, in execute
    flattened = self._rewrite(self.read_root_tex, str(self.root_tex_file.resource()))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 88, in _rewrite
    return parser.rewrite()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 78, in rewrite
    result += self._rewrite_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 86, in _rewrite_one
    return self._evaluate_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 138, in _evaluate_one
    return self.evaluate_command(str(self._next_token))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 164, in evaluate_command
    return macro.invoke(self)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 124, in invoke
    return self._execute(parser, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 288, in _execute
    style_file = self._fetch_style_file(parser, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 296, in _fetch_style_file
    _, value = each.split("=")
ValueError: need more than 1 value to unpack

I installed flap using pip install --user flap, which didn't install the module 'click', so I had to install it manually using pip install --user click. After that, I got the above error message.

fchauvel commented 7 years ago

Thanks a lot for the feedback.

It could be that FLaP tries to rewrite the index command? Do you have one? It seems to fail to parse something like:

\makeindex[columns=2, title=Alphabetical Index,
  options= -s my-style.ist]

You are, for sure, the first one using this feature!
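The "need more than 1 value to unpack" error means that `each.split("=")` returned a single element, i.e., one of the comma-separated options contained no "=" sign. A more tolerant way to parse such an option list could look like the sketch below. The function name `parse_options` and the choice of mapping bare flags to None are assumptions made for illustration, not FLaP's code.

```python
def parse_options(text):
    """Parse 'key=value, flag' into a dict; options without '=' map to None."""
    options = {}
    for each in text.split(","):
        each = each.strip()
        if not each:
            continue  # ignore empty entries, e.g. from a trailing comma
        # partition never raises: separator is '' when there is no '='.
        key, separator, value = each.partition("=")
        options[key.strip()] = value.strip() if separator else None
    return options
```

Unlike tuple unpacking on `split("=")`, `str.partition` always returns three parts, so options without a value and values that themselves contain "=" are both handled.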

philipptempel commented 7 years ago

In fact, I do have a \makeindex command. However, it's a plain \makeindex without any arguments (at least not in my main TeX file; maybe there is something inside the svmult cls file altering the behavior of \makeindex).

Further digging into the problem from this shell call flap --verbose cable-dynamics.tex output reveals the following:

  1. Replacing documentclass svmult with scrartcl or any other standard document class allows compilation up to \makeindex.
  2. Removing \makeindex from the source code lets me compile further, however, FLaP then fails when it's processing lines with \includegraphics. I don't know why this is happening as the files are available and the arguments to \includegraphics are not making use of \graphicspath i.e., \includegraphics[keepaspectratio, width=\linewidth, height=3cm]{images/hexapod} is failing even though the file images/hexapod.jpg exists.

It does not seem to be a problem coming from the image being in a subfolder. I created directory output/images manually. No improvement in FLaPing.

Verbose output of FLaP is:

FLaP 0.5.0
File                            Line Column LaTeX Command
-------------------------------------------------------------------------------
cable-dynamics.tex                10      1 \input{includes/packages-options}
cable-dynamics.tex                42      1 \input{includes/packages-main}
includes/packages-main.tex        58      1 \graphicspath{{figures/}{images/}}
cable-dynamics.tex                45      1 \input{includes/layout}
cable-dynamics.tex                46      1 \input{includes/colors}
cable-dynamics.tex                47      1 \input{includes/commands-text}
cable-dynamics.tex                48      1 \input{includes/commands-floats}
cable-dynamics.tex                49      1 \input{includes/commands-math}
cable-dynamics.tex                50      1 \input{includes/tikz-packages}
cable-dynamics.tex                51      1 \input{includes/tikz-commands}
cable-dynamics.tex                52      1 \input{includes/tikz-styles}
cable-dynamics.tex                53      1 \input{includes/my-tikz-styles}
cable-dynamics.tex                54      1 \input{includes/my-commands}
cable-dynamics.tex                60      1 \graphicspath{{figures/}{images}}
cable-dynamics.tex               113      1 \input{content/introduction}
content/introduction.tex          12      5 \includegraphics[keepaspectratio...
Traceback (most recent call last):
  File "/Users/philipp/Library/Python/3.4/bin/flap", line 9, in <module>
    load_entry_point('FLaP==0.5.0', 'console_scripts', 'flap')()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 722, in __call__
    return self.main(*args, **kwargs)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 697, in main
    rv = self.invoke(ctx)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 895, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/click/core.py", line 535, in invoke
    return callback(*args, **kwargs)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/ui.py", line 102, in main
    Controller(OSFileSystem(), Display(sys.stdout, verbose)).run(tex_file, output)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/ui.py", line 43, in run
    request.execute()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 79, in execute
    flattened = self._rewrite(self.read_root_tex, str(self.root_tex_file.resource()))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 88, in _rewrite
    return parser.rewrite()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 78, in rewrite
    result += self._rewrite_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 86, in _rewrite_one
    return self._evaluate_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 138, in _evaluate_one
    return self.evaluate_command(str(self._next_token))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 164, in evaluate_command
    return macro.invoke(self)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 124, in invoke
    return self._execute(parser, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 317, in _execute
    return parser._spawn(parser._create.as_tokens(content, link), dict()).rewrite()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 78, in rewrite
    result += self._rewrite_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 86, in _rewrite_one
    return self._evaluate_one()
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 138, in _evaluate_one
    return self.evaluate_command(str(self._next_token))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/parser.py", line 164, in evaluate_command
    return macro.invoke(self)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 124, in invoke
    return self._execute(parser, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 393, in _execute
    new_link = self.update_link(parser, link, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/latex/macros.py", line 409, in update_link
    return self._flap.update_link(link, invocation)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 123, in update_link
    return self._update_link(path, invocation, self.graphics_directory, ["pdf", "png", "jpeg", "jpg", "ps", "eps"], GraphicNotFound(None))
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 142, in _update_link
    resource = self._find(path, location, extensions, error)
  File "/Users/philipp/Library/Python/3.4/lib/python/site-packages/flap/engine.py", line 175, in _find
    raise error
flap.engine.GraphicNotFound: None

fchauvel commented 7 years ago

Thank you again for the feedback.

I guess it is due to \graphicspath, though. FLaP searches for images/hexapod.jpg in the directories listed in the \graphicspath command, that is figures/images/images/hexapod. Is this correct?

You mention that your \includegraphics does not use the \graphicspath command, so I must be wrong about the semantics of \graphicspath. How do you specify that a given \includegraphics macro should not use \graphicspath? Is there some sort of scope mechanism, or should we expect FLaP to still search the "root" directory when it cannot find files in the directories listed by the \graphicspath command?
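As far as I understand the graphicx package, LaTeX first tries the file name as given (relative to the working directory) and only then each directory listed in \graphicspath, prepending the entry to the name as-is. Under that assumption, the lookup could be sketched as follows. The helper `find_graphic` and its `exists` parameter are hypothetical, the extension list is copied from the traceback above, and names that already carry an extension are not handled.

```python
import os


def find_graphic(name, graphics_path, exists=os.path.isfile,
                 extensions=("pdf", "png", "jpeg", "jpg", "ps", "eps")):
    """Return the first existing candidate, or None if nothing matches.

    The name as given is tried first, then each graphics path entry.
    """
    for directory in [""] + list(graphics_path):
        for extension in extensions:
            # Entries are prepended verbatim, so 'images' without a trailing
            # slash gives 'imagesimages/hexapod.jpg', not 'images/images/...'.
            candidate = directory + name + "." + extension
            if exists(candidate):
                return candidate
    return None
```

With this order, images/hexapod.jpg is found through the "root" lookup even though neither figures/ nor images contains an images/hexapod file.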