tectonic-typesetting / tectonic

A modernized, complete, self-contained TeX/LaTeX engine, powered by XeTeX and TeXLive.
https://tectonic-typesetting.github.io/
Other
3.96k stars 161 forks source link

warning: open of input |'inkscape' -V failed #859

Open LeCyberDucky opened 2 years ago

LeCyberDucky commented 2 years ago

Hey there!

I'm having a lot of trouble trying to include svg files using the svg package. I have created the following example document:

\documentclass{article}
\usepackage{svg}

\begin{document}
\begin{figure}
    \centering
    \includesvg[width=1.0\linewidth, inkscapelatex=false]{Figures/example}
    \caption[Pls work]{Pls work}
\end{figure}
\end{document}

I'm trying to compile that using the following command:

tectonic -X compile example.tex -Z shell-escape --print -k

but that results in the following problem:

(ts1cmr.fd)warning: open of input |'inkscape' -V  failed
caused by: Syntaksen i filnavnet, mappen eller diskenhedsnavnet er forkert. (os error 123)
error: failed to open input file "|'inkscape' -V "

Note, that, to even get this far, I had to fight other issues related to tectonic not using the newest version of the svg package, I belive. As a workaround, I ended up extracting the .sty files included in the .zip file found here https://github.com/mrpiggi/svg/releases/tag/v2.02k and placing them next to my document. This next part of the problem has me stumped, though. Can somebody help me out here?

pkgw commented 2 years ago

As for your second point, yes, we're still using TeXLive 2020 as a basis for Tectonic's files. The update process is pretty painstaking and I haven't made the time to undertake it yet.

For the primary error, Tectonic is restrictive about invoking external executables because it strives to deliver a high level of reproducibility, so conventions like opening |command for pipe input are often problematic and their use is discouraged. In this particular case, I don't remember off the top of my head whether that convention is something that Tectonic supports at all (currently). It might need to be specifically enabled or it might be totally unimplemented. If the latter, we'd ideally have an error message to that effect.

Is the proper translation of your error message "The syntax of the file name, folder, or volume name is incorrect."?

LeCyberDucky commented 2 years ago

Thanks for the quick response!

As for your second point, yes, we're still using TeXLive 2020 as a basis for Tectonic's files. The update process is pretty painstaking and I haven't made the time to undertake it yet.

Alright, that's the conclusion I came to after reading through the issues for this project. As a small side note: Before figuring out that I could just place the .sty files next to the .tex file, I was actually about to open an issue to get this clarified. This seems to be a common hurdle, so it would be good to have this issue and a workaround like the one I ended up using described in a prominent place, like https://tectonic-typesetting.github.io/book/latest/v2cli/bundle.html. I also read, that you have a lot on your plate already, though, which is of course fair, since this is free, open-srouce software after all :)

Back to the main problem: Yes, I would translate that error message as you did.

Wanting to deliver high level of repoducibility makes sense to me and is a good selling point. On the other hand, however, I would assume that including svg files in the way I'm trying to do must be somewhat common. Choosing not to support that (and I assume other things that work similarly) would probably strike some people as a disadvantage compared to other options. So that's really a situation where not everybody can win, I guess. Maybe a compromise would be to include such functionality but hided it behind an uption, just like I have to do -Z shell-escape?

Please tell me, if you need more information, or if there's something else I can do to help.

pkgw commented 2 years ago

Yes, for this kind of functionality I think the ideal approach would be to implement the support behind an option. There are a lot of TeX packages that make use of shell-escape-like features, which I think is a bit unfortunate, but we have to meet people where they are. Hopefully if Tectonic can provide compelling alternatives, over time various packages will support alternative approaches that keep better encapsulation. (The way I think of it is is, you don't want your C compiler executing other binaries as part of the build process! I don't think your TeX compiler should either. But you do want a framework that lets you use other tools to create the compiler inputs.)

Anyway, the first step here would be to do some research. If you have sufficient familiarity with C, Rust, and TeX's internals, it would be helpful to identify the place where special handling of |command inputs should go and assess the current situation inside Tectonic's guts. My guess is that there is code that I've commented out or otherwise disabled. It would also be good to think about whether this syntax could/should be supported as a kind of shell-escape, or whether it's separate. It probably needs to be, since the invoked command might try to reference input files.

LeCyberDucky commented 2 years ago

That makes perfect sense, and I agree that this sounds like the best approach.

I have no clue about the internals of TeX, but I know my way around C and Rust. I think 2/3 is good enough for me, so I'll give it at shot and try to poke around during the weekend, to see if I can figure something out.

From a quick search, the only place that matches the error string appears to be this line: https://github.com/tectonic-typesetting/tectonic/blob/2f12a40ee8e1ae9092808f06cb6abf675859c81b/crates/bridge_core/src/lib.rs#L550, so I guess that will be a good place to start. Do you have any hints about how you usually debug this? I guess nicely stepping through things from start to finish with a debugger is not possible, since the project combines different components instead of being just one homogeneous piece of e.g. Rust code. Do you just rely on print statements for debugging?

pkgw commented 2 years ago

I actually have amazingly good success using the debugger — there are many Rust crates but since they're all compiled into a single executable, you can step across all of their boundaries. And for me, gdb handles both Rust and C/C++ well enough and understands when and where to switch between them.

I would start by looking in the main (Xe)TeX implementation code, in crates/engine_xetex/xetex, and looking for how start_input() is invoked. I would guess that there might be some custom code that manually implements piped input if the input filename starts with a | ... but I'm not seeing any evidence of that right now.

BTW, have you checked if you get different results if you use -Z shell-escape with your test case?

LeCyberDucky commented 2 years ago

Oh, it's great to hear that using a debugger should work better than I expected. Also, thanks for the hint about looking into the main (Xe)TeX implementation. I'll make sure to take a close look at that.

I don't quite understand what you mean by using -Z shell-escape to check for different results. I already use -Z shell-escape in the string of commands I use for compiling my document, since that is required for svg to be able to launch Inkscape:

tectonic -X compile example.tex -Z shell-escape --print -k

Do you mean that I should include -Z shell-escape somewhere else?

pkgw commented 2 years ago

Re shell-escape: no, sorry, I just didn't reread your original report closely.

LeCyberDucky commented 2 years ago

Hey, just a quick update: I didn't forget about this, I'm just slow :) I got my debugger up and running. It's really cool how it can seamlessly jump between Rust and C. I am still adding breakpoints and stepping through the code in a try to make sense of what's going on. I haven't quite figured out what is going wrong yet, but just as I am writing this, I am noticing the following:

The trait IoProvider is implemented for standard I/O streams, but only the function output_open_stdout is overwritten. I can't find a specific implementation for the function input_open_name, so I assume this falls back to the default implementation, which always returns OpenResult::NotAvailable: https://github.com/tectonic-typesetting/tectonic/blob/62deda8f1b546f5fbfbc2edd22154f942779fbd1/crates/io_base/src/lib.rs#L446 As far as I can tell, the program comes across this function in the case of trying to get the piped input.

I'm still quite confused, and it could very well be, that I'm not interpreting correctly what the debugger is showing, but do you think that this could be the problem, or am I on the wrong track here?

pkgw commented 2 years ago

Hi — great job diving into this!

I thought of another way to investigate this and I found something useful. Here's a commit from more than five years ago where I removed the pipe support:

https://github.com/tectonic-typesetting/tectonic/commit/0adaa8da4fb661e755a2ce832dc947d53ddbfb5d

So basically, the solution to this issue would be to re-enable this behavior — within the context of Tectonic's managed I/O system. The C code has evolved a lot since I made this commit, so reverting the C changes wouldn't necessarily be entirely trivial, but it shouldn't be too complex given the original implementation to look at. The trickier thing would be to figure out how to implement this properly for Tectonic's model. Maybe it could be implemented using the shell-escape infrastructure, the implementation of which runs from here:

https://github.com/tectonic-typesetting/tectonic/blob/62deda8f1b546f5fbfbc2edd22154f942779fbd1/crates/bridge_core/support/support.c#L317

to here:

https://github.com/tectonic-typesetting/tectonic/blob/62deda8f1b546f5fbfbc2edd22154f942779fbd1/src/driver.rs#L677

Maybe the implementation needs some extensions to support shell-escape, or maybe a similar but independent "sysrq" interface needs to be added.

LeCyberDucky commented 2 years ago

That's a great help, thanks for digging it up! I will investigate further and report back.

LeCyberDucky commented 2 years ago

Hey there!

I'm sorry about the long radio silence. For some (off-topic) context, I'm currently working on my MSc thesis, which means two things:

Anyway, I just noticed the transition to the new TexLive 2021 bundle. I'm interested in using that, since it should include a fix that would make some .sty files, that I have currently manually placed in the root directory of my project, obsolete. Thus, I decided to download the newest version of Tectonic that I could find (https://github.com/tectonic-typesetting/tectonic/releases/download/continuous/tectonic-0.8.2+20220330-x86_64-pc-windows-msvc.zip)

With that, I get this output when trying to compile a document:

note: "version 2" Tectonic command-line interface activated
Running TeX ...
(Main.tex
LaTeX2e <2020-10-01> patch level 4
L3 programming layer <2021-02-18> (article.cls
Document Class: article 2020/04/10 v1.4m Standard LaTeX document class
(size10.cloFontconfig error: Cannot load default config file: No such file: (null)
)) (l3backend-xetex.defwarning: open of input |extractbb --version failed
caused by: Syntaksen i filnavnet, mappen eller diskenhedsnavnet er forkert. (os error 123)
error: failed to open input file "|extractbb --version"

I'm using tectonic -X compile Main.tex -Z shell-escape --print to compile a very simple test document consisting only of:

\documentclass{article}

\begin{document}
Jellyfish
\end{document}

I thought that I would add this here, in case other people run into this as well, as I believe it's the same problem. Since this happens for even such tiny test case, I'm guessing that this should affect a lot of people, unless I'm missing something and the problem is just on my end. Regardless, I hope that I will have some free time tonight, so I will try to continue looking into this issue where I left off.

Edit: To verify that this happens with version 0.8.2+20220330, I have tried switching back and forth between https://github.com/tectonic-typesetting/tectonic/releases/download/continuous/tectonic-0.8.2+20220330-x86_64-pc-windows-msvc.zip and https://github.com/tectonic-typesetting/tectonic/releases/download/tectonic%400.8.2/tectonic-0.8.2-x86_64-pc-windows-msvc.zip while remembering to delete the temporary folder where I believe the downloaded tex packages are stored. The problem only occurs with version 0.8.2+20220330, so I assume that some kind of fundamental package has started making use of the same piping sorcery as the svg package in the TexLive 2021 version.

pkgw commented 2 years ago

Ah, yes. I see this when shell-escape is enabled, but not when it's not.

This may raise the priority of this feature, but I would really really like Tectonic to be able to run in shell-escape mode without requiring miscellaneous external tools like extractbb to be installed, so I might prefer to patch the relevant code to disable whatever it's trying to do. Needs some digging.

pkgw commented 2 years ago

@LeCyberDucky I have just updated the default bundle for the latest Tectonic to version tlextras-2021.3r1 (as opposed to ...r0), and this should hopefully fix the |extractbb issue. After looking into things, it seemed as if the only way to deal with the problem was to patch the latex.ltx implementation, so I did that.

But I also merged in pull request #888 to try to provide a bit more context when there's an error with a pipe input command.

LeCyberDucky commented 2 years ago

That's great! I can confirm that my little example from above now compiles with version 0.8.2+20220415 (and 0.8.2+20220330) without running into the |extractbb issue.

I very much appreciate #888. Having spent a bunch of time in the debugger, stepping through the code and jumping up and down the call stack, I have a basic understanding of how the code flows with piped input now. Things get quite a bit confusing the further up the call stack (away from the error) I go, however. But based on https://github.com/tectonic-typesetting/tectonic/pull/888/commits/03f81c0301db3cc4265d95d4e568b9865a019c9c, it looks like I'm on the right track, since my investigations so far have also led me to that point in the code.

Would https://github.com/tectonic-typesetting/tectonic/blob/664eb5adf54e64a0e8cf27760e473659275b2fbf/crates/engine_xetex/xetex/xetex-io.c#L30 be an appropriate place for an initial attempt at implementing support for piped input? I don't mean to put the full implementation right in that spot, but would this be a good place to start handling piped input, or should piped input already be taken care of at some earlier point in the code?

pkgw commented 2 years ago

@LeCyberDucky Yes, I think that's the place there the support would have to be "wired in". The way that things are handled in the code, I think we would need to add a new C API that exposes a mechanism to return a special rust_input_handle_t handle that returns output from the subprocess.

I can think of two approaches to getting that output. The ~classier and probably more efficient way would be to launch the subprocess, let it run in the background, and read its output from a pipe as the engine requests it. But it might make some things easier to just launch the subprocess, snarf up all of its output into a (potentially big) buffer at once, and then feed that output to the engine.

The classic implementation essentially uses the former approach with popen() and pclose(). My impression from taking a look at the manual pages is that if the subprocess exits with an error or is not found, the popen() call will still succeed. So for maximum compatibility with existing implementations we might aim to preserve those semantics. On the other hand, I have the strong suspicion that existing implementations don't necessarily have super well-honed error handling semantics. For instance, as far as I can tell the new |extractbb code doesn't really have any way to know or handle the possibility that extractbb fails or is missing.

LeCyberDucky commented 2 years ago

Hi @pkgw, I have some updates:

For the proof of concept, I have made a copy of the code that handles shell-escape and modified it slightly, in order to collect the generated output. Most of this code is the same, so I'm sure there is a good way to unify things, but I didn't want to mess up the shell-escape interface at the moment.

I then take the collected output and store it in a temporary file (I was not sure how to handle it directly in memory). Then, I provide an InputHandle to this file, so the generated output can be read where it is needed.

You can see my changes here: https://github.com/LeCyberDucky/tectonic/commit/9f473700ccfdfab1ddd283ac29cde85423b8812b

I'm currently running into a problem when the rest of the code actually tries to access the output file, however:

Exception has occurred: W32/0xC0000005
Unhandled exception at 0x00007FF78EA0F81E in tectonic.exe: 0xC0000005: Access violation reading location 0x000000002F76AF1C.

This happens in:

https://github.com/tectonic-typesetting/tectonic/blob/c77ef784564b26114c129c3d49349911835eb822/crates/io_base/src/lib.rs#L231

and originates from:

https://github.com/tectonic-typesetting/tectonic/blob/c77ef784564b26114c129c3d49349911835eb822/crates/engine_xetex/xetex/xetex-io.c#L146

I would appreciate if you could provide some input here:

You mention, that:

I can think of two approaches to getting that output. The ~classier and probably more efficient way would be to launch the subprocess, let it run in the background, and read its output from a pipe as the engine requests it. But it might make some things easier to just launch the subprocess, snarf up all of its output into a (potentially big) buffer at once, and then feed that output to the engine.

I'm not sure how to go about launching the subprocess in the background and reading its output as the engine requests it. As I understand it, the existing shell-escape code currently already blocks until the command is done. So doing the same thing, and then just storing the generated output, was a simple approach for now.

LeCyberDucky commented 2 years ago

So, in order to move on, I have just assumed, that the Access violation error is fixable and that I therefore get a useable handle to the file storing the output of the piped command. To do this, I simply create a handle like this:

make_utf16_name();
// handle = ttstub_input_pipe_open(name_of_file16, name_length16);

char abspath_of_output[1024] = "C:\\Path\\To\\Output\\inkscape -V";
int32_t output_path_length = strlen(abspath_of_output);
handle = ttstub_input_open(abspath_of_output, filefmt, 0);

here: https://github.com/LeCyberDucky/tectonic/blob/af0ea1f00a486f520d9a2c62f211703815526e20/crates/engine_xetex/xetex/xetex-io.c#L34

With that, I don't get the access violation, and things progress further, which leads me to the next problem. After having successfully read the piped input, the next command to be exectued is:

inkscape "Figures/example.svg" -D --export-filename="example_svg-raw.pdf"

Which results in a Can't open file (doesn't exist), because the file is searched for in the temporary shell-escape directory. I suppose this is the result of this comment: https://github.com/tectonic-typesetting/tectonic/blob/202faa1a169a27778af17d2574a3f6d5c1b64c46/src/driver.rs#L702

We basically just hope that nothing will want to access the actual TeX source, which will live in a different directory.

I'm not sure what to do about that.

pkgw commented 2 years ago

The recently filed #909 could give a somewhat hacky way to deal with the latest issue, I think? I haven't reviewed the implementation yet, but I think the general approach is reasonable. (We could perhaps also add a mode to do the shell-escape work in the document directory without having to name it specifically on the command line, if that's going to be the thing that people are doing 95% of the time anyway.)

LeCyberDucky commented 2 years ago

I took a look at https://github.com/tectonic-typesetting/tectonic/pull/909 and merged it into my fork. Indeed, running shell-escape in the document directory resolves that issue.

I'm still a bit unsure about how to handle the commands issued, however. In the original implementation, the supplied command is just executed as one long command. That gave me some issues with this command, however:

inkscape "example/Figures/example.svg" -D --export-filename="example_svg-raw.pdf"

Executing that whole string as one long command results in a problem, where the relative path is not concatenated correctly with the shell-escape working directory. I.e., Inkscape attempts to open

C:\Users\...\SHELL_ESCAPE_WORKING_DIRECTORY\"example\Figures\example.svg" including those quotes that shouldn't be there.

Therefore, I am currently splitting the command on whitespace as a workaround, and then supplying the individual parts as individual arguments. That solves the problem described above, but I don't think splitting on whitespace is a robust solution.

Anyway, with that out of the way, the next command issued is:

if not exist "./svg-inkscape/" mkdir "./svg-inkscape/"

This fails with the following command being supplied by the shell:

The syntax of the command is incorrect.

I can't quite figure out why that is. Copying the command into my terminal manually works as expected. I suppose this may be breaking on my whitespace splitting, so I tried executing the command at once. In that case, I do not get that error, but the wanted directory is not actually created for some reason.

As it stands, I have simply created the directory manually, and with that I'm finally able to compile the document, yay :)

image

bryango commented 1 year ago

Hi guys! Thank you very much for the hard work! I am an average user with

$ tectonic --version
Tectonic 0.12.0

with no understanding of the inner workings of tectonic. In order to use |'inkscape' -V should I simply use the fork of https://github.com/LeCyberDucky/tectonic ? Thanks!

LeCyberDucky commented 1 year ago

Hey there!

Are you simply looking to use inkscape for including svg images in your document? In that case I would not recommend using my fork. As it turns out, there is a simpler workaround.

The problem described in this issue occurs because the svg package attempts to detect which version of Inkscape is installed. It can be avoided by specifying the version manually, when loading the package. See inkscapeversion in section 2.2 of the documentation: https://mirrors.dotsrc.org/ctan/graphics/svg/doc/svg.pdf

bryango commented 1 year ago

It can be avoided by specifying the version manually, when loading the package. See inkscapeversion in section 2.2 of the documentation: mirrors.dotsrc.org/ctan/graphics/svg/doc/svg.pdf

Ah nice! I think this solves my problem! Thank you very much!