dspinellis / dgsh

Shell supporting pipelines to and from multiple processes
http://www.spinellis.gr/sw/dgsh/
Other
323 stars 23 forks source link

Don’t wrap external tools that support dgsh #87

Closed lucaswerkmeister closed 7 years ago

lucaswerkmeister commented 7 years ago

As far as I can tell from looking at dgsh’s version of Bash’s execute_cmd.c, dgsh wraps all commands in a pipeline except under the following conditions:

This doesn’t leave much opportunity for external programs that are dgsh-aware and call dgsh_negotiate: dgsh will wrap them in dgsh-wrap, and then their own dgsh_negotiate call will timeout since the negotiation is already over by the time the program starts.

How can dgsh determine that a program is dgsh-aware? One possibility would be to inspect the symbols of the file and check for dgsh_negotiate. If that’s hard to implement (I imagine you can’t just do nm FILE | grep dgsh_negotiate within the shell’s execute_simple_command function ☺), for now a special variable like DGSH_NOWRAP (array of program names) might also work.

(One workaround that works right now is to move/symlink your program into a ./build/libexec/dgsh/ directory, but that means you can’t debug it with tools like valgrind.)

dspinellis commented 7 years ago

I seem to remember proposing to consider programs with a command path …/dgsh/programname to be considered dgsh-compatible, but I can't see that implemented. I think it's a good solution. Do you agree?

lucaswerkmeister commented 7 years ago

It’s still fairly limiting (and it means I have to tweak PATH, none of the standard PATH directories include dgsh of course), but it’s better than build/libexec/dgsh, and if it’s properly documented (in dgsh(1) and dgsh-wrap(1)) it’s probably good enough for now.

dspinellis commented 7 years ago

Thank you. I pushed a commit 8b7f46fd58b02c067553fb194c4097362aa49e5a that does this. If I may ask, what is your program doing? We tried to supply most of the binaries required to work with dgsh as part of it.

mfragkoulis commented 7 years ago

I have a reservation to this change. It's not obvious to me which are the further use cases it serves.

The design so far assumed that all dgsh-aware programs reside in $PREFIX/libexec/dgsh (build/libexec/dgsh is a temporary arrangement used with make test). Although this design restricts the location of dgsh-aware programs, why is this inadequate? External tools can reside there too. Of course, users should be aware of this design and perhaps we can clarify this better in the documentation.

There are also advantages to this specification:

dspinellis commented 7 years ago

Thank you for the critical evaluation. There should be a mechanism for unprivileged users to use dgsh-aware programs. Specifying that any program in a directory path ending in /dgsh/ is dgsh-aware is a logical extension that serves this purpose. The corresponding code even simpler than the original one. The design is orthogonal to that of the Unix PATH: users can specify additional locations for executable programs. An alternative we had discussed was a DGSHPATH environment variable.

lucaswerkmeister commented 7 years ago

If I may ask, what is your program doing? We tried to supply most of the binaries required to work with dgsh as part of it.

The program is ja2l (see especially the dgsh support README section), a small tool to simplify processing of JSON dumps. Functionally, the dgsh-aware version is equivalent to piping the non-dgsh-aware version into dgsh-tee -s (scatter), but I wanted to build dsgh support into the tool, partly just for fun, partly for the slight performance improvement (ja2l already knows where the line breaks are, dgsh-tee -s would have to search for them again in the input stream).

The design so far assumed that all dgsh-aware programs reside in $PREFIX/libexec/dgsh… Although this design restricts the location of dgsh-aware programs, why is this inadequate? External tools can reside there too.

Well, for one, if $PREFIX is /usr and not /usr/local (because I installed dgsh via a package), then that means I can’t drop my program in /usr/local/bin where it belongs (because my own program doesn’t have a package).

logical grouping of tools that share dgsh compatibility

Why is this necessary? As far as I can tell, dgsh-aware programs work even on systems without dgsh installed (I don’t know a lot about linking, but it looks like libdgsh.a is statically linked into the executable… because it’s not libdgsh.so?). In my program, dgsh support is an optional feature – the program works with or without it, so I would prefer putting it in /usr/local/bin or /usr/bin/ like any other program.

dspinellis commented 7 years ago

I think you're right, we need to do better than this. Perhaps we can link with a dynamic dgsh library and have bash check at runtime (and cache) if programs get dynamically linked with it.

dspinellis commented 7 years ago

The correct way seems to involve linking dgsh-compatible programs with a note ELF entry specifying dgsh compatibility. Then dgsh would have to look for such an entry in order to find out if the program is dgsh-compatible.

dspinellis commented 7 years ago

The commit and its corrections work correctly now in 32 bit and 64 bit systems.

lucaswerkmeister commented 7 years ago

Seems to work, thanks a lot!