Closed dginev closed 10 months ago
()
are special characters for POSIX shells, so pathname_is_nasty
is correct in rejecting those before calling pathname_kpsewhich
(in fact, it may be time to move the defensive checks inside pathname_kpsewhich
– or even better, call kpsewhich
in a safe way instead of using backticks).
call
kpsewhich
in a safe way instead of using backticks
PS: I know how to do this (requires Perl >=v5.8, which I am sure is fine, plus a small extra dependency on Windows because... Windows). I'll send a PR soon.
...
PS: I know how to do this (requires Perl >=v5.8, which I am sure is fine, plus a small extra dependency on Windows because... Windows). I'll send a PR soon.
Scratch that: I have already done that! See
https://github.com/brucemiller/LaTeXML/blob/169d04c9743c5deac5309bef496029f172dd0afe/lib/LaTeXML/Util/Pathname.pm#L403
The open
call uses the list form, which does not spawn a shell, and is thus safe in the face of arbitrary file names. Well, except for Windows – let me send a Windows-specific PR.
So it's actually safe to remove pathname_is_nasty
altogether, modulo the Windows change.
Right, the nasty check has been trying to reject both malformed and suspect paths historically. I don't think we want to accept say a path
$(dirname $(kpsewhich latex.ltx))/..
but your issue rightfully pointed out that parens on their own are acceptable literal characters, as with article/figures (1200x1800px)
. The parens would have to be properly escaped if any system calls are made, agreed. But I think that was already the case for paths with spaces in the name, which are also allowed.
Removing reliance on pathname_is_nasty
entirely could be seen as code cleanup (since regex checks are always fishy), so that sounds appealing to me.
If we have a consensus way of doing that, might as well?
But I think that was already the case for paths with spaces in the name, which are also allowed
I think spaces didn't work before #1592. Before the PR, kpsewhich was called with backticks and no attempt at escaping: arguments with spaces were safe, but the behaviour was broken.
If we have a consensus way of doing that, might as well?
Let me add: after adding proper escaping on Windows.
Fixes #2293 .
This PR is an alternative - or supplemental - approach to #2294. Here, I still keep the defensive checks as they are, but generally allow the
()
characters to be present. There may be a good case to merge both PRs.I also started wondering how many other subtle cases may be silently prevented due to the pathname safety check, so I added a warning. Would be curious to see what variations we see in the next arXiv rerun.