Open rgrinberg opened 2 years ago
Should we change the behavior of dune to execute
<foo>
as a shell command instead of as a binary to match opam's behavior?
Seems reasonable to me. We will need to quote arguments for the shell (ie Filename.quote_command
, or whicever way opam
does it).
Hope you can bear with this essay! It's moderately tricky to work out which order to put the points in.
The first thing is that if you're writing something portable (I mean, actually portable, not "builds on Linux and BSD ocaml script.ml
is fully portable. If you've written a shell script, it's either so simple it should be an OCaml script or it's so complicated that it won't be portable on Unix either and it should also be an OCaml script. 🙂
However, back to the real world! opam exec
is able to execute shell scripts as a consequence of wanting "./configure"
to "just work" in opam
file, given that that's the first step to building OCaml. So opam exec -- shell-script
works as a consequence of that. However, opam
doesn't give it to you for free - if your opam package installs a shell script on Windows, you get a big warning from opam install
that that's not a good idea because Windows does not support shell scripts. If dune exec -- shell-script
is to be supported, I would strongly recommend that it displays a warning on each and every use on Windows in the same way. The experience of using shell scripts is awkward and full of user surprise - you don't know which shell processor you're going to get, necessarily (are you going to pick up MSYS2's, Cygwin's?) and how paths are guaranteed to be treated. i.e. being able to execute shell scripts like this is an advanced thing to do, and one is not exposing a nice way to use one's build system with it. i.e. it will work for simple cases, but it will be very easy for users to end up puzzled, and they'll probably dune
when they get puzzled!
The detail of how you do it is even more horrible - the code in opam
responsible for this is considerable because Cygwin executables in particular behave differently when invoked from a native Windows executable than when invoked from another Cygwin executable (MSYS2 removes this distinction, but frankly that difference of behaviour makes this puzzle harder, not simpler!). Add to that if you "naively" search for bash
, say, you run a strong risk of invoking WSL's bash
and nothing's going to work!
The world has changed a bit since I wrote the code that's in opam
and this would propose using it considerably outside its original intent. In particular, that code assumed that Cygwin would be the dominant interpreter in PATH
(which opam
guarantees as part of its process execution - it ensures Cygwin's bin
directory is first). That's not necessarily the case for dune exec
, so it'd be necessary to have a closer review/inspection of the search mechanism for the interpreter. The point here is that in order to translate #!/bin/sh
, you need to know which environment that's coming from - i.e. you have to locate a Cygwin/MSYS2 installation and then translate /usr/bin
within that root - i.e. it's not a PATH
search on the Windows PATH
. ocamlbuild
IIRC used to (still does?) a crazy trick to invoke bash scripts which requires bizarre double-quotes to be appended and it's exceedingly unreliable with path quoting and filename conventions (this is why I spent a long time in 2015/2016 on the code in opam...).
So, in summary:
dune exec --
to be able to interpret shebang lines for scripts on native Windows, but it should warn when it does so (you could possibly have an advanced user option to suppress the warning).OpamSystem
which is responsible for this to a separate library and Dune could/should vendor that, rather than trying to duplicate the functionality.Both ocamlbuild (and my) experience of this is that seemingly simple solutions do not work for this! It took an indescribably long time in 2016 to get the earlier Windows port of opam to install findlib (which also has a configure
script) without requiring any patching to the package!
- Yes, I think it's probably sensible for
dune exec --
to be able to interpret shebang lines for scripts on native Windows,
Thanks for the informative answer David! However, one word of clarification. As per what I understood of my discussion offline with @rgrinberg, here we are not proposing interpreting shebang lines, but simply using Sys.command
to execute the argument of dune exec --
. In the case of npm
this would work not because we execute the shell script npm
but because nodejs
also includes in the same directory npm.cmd
, a batch file that can be executed if we use Sys.command
on Windows (as opposed to using Unix.create_process
as today).
Ah, I was possibly barking up the wrong bit of the wrong tree! However, further inspection of the code - opam exec
in ocaml/opam does pretty much exactly as Dune... it's not using Sys.command
. This does work with the opam supplied on Windows in opam-repository-mingw, which I guess is a patch (I don't know where's patches are), but that's not something I'd ever expect to see upstreamed (I think you end up in escaping hell way too easily using Sys.command
here - I think it's good that opam exec
and dune exec
are both for resolving programs to run).
So my original answer can be slightly revised to say that it might be good for both opam exec
and dune exec
to support the shebang script methods (longer term) and (shorter term) to have them both able to execute batch files. The latter is a slightly simpler thing to implement than the full shebang (!!), and still doesn't need to involve Sys.command
.
This was discussed in the meeting today and the verdict is:
We should harmonize the behavior between opam and dune, but this more of a long term goal. There's no need to do this for 3.0.
On windows,
$ dune exec -- <foo>
fails to execute to<foo>
whenever<foo>
is a shell script. This is particularly confusing because$ opam exec -- <foo>
doesn't suffer from this limitation. If you havenpm
installed, you can see yourself with$ opam exec -- npm
vs.$ dune exec -- npm
. Should we change the behavior of dune to execute<foo>
as a shell command instead of as a binary to match opam's behavior?I'd like to hear @dra27 and @nojb 's opinion on this.
Thanks to @tatchi for letting me know of this issue.