ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.63k stars 409 forks source link

Execute shell scripts on Windows. #5401

Open rgrinberg opened 2 years ago

rgrinberg commented 2 years ago

On windows, $ dune exec -- <foo> fails to execute to <foo> whenever <foo> is a shell script. This is particularly confusing because $ opam exec -- <foo> doesn't suffer from this limitation. If you have npm installed, you can see yourself with $ opam exec -- npm vs. $ dune exec -- npm. Should we change the behavior of dune to execute <foo> as a shell command instead of as a binary to match opam's behavior?

I'd like to hear @dra27 and @nojb 's opinion on this.

Thanks to @tatchi for letting me know of this issue.

nojb commented 2 years ago

Should we change the behavior of dune to execute <foo> as a shell command instead of as a binary to match opam's behavior?

Seems reasonable to me. We will need to quote arguments for the shell (ie Filename.quote_command, or whicever way opam does it).

dra27 commented 2 years ago

Hope you can bear with this essay! It's moderately tricky to work out which order to put the points in.

The first thing is that if you're writing something portable (I mean, actually portable, not "builds on Linux and BSD "), then you don't want shell scripts, you want OCaml scripts and then ocaml script.ml is fully portable. If you've written a shell script, it's either so simple it should be an OCaml script or it's so complicated that it won't be portable on Unix either and it should also be an OCaml script. 🙂

However, back to the real world! opam exec is able to execute shell scripts as a consequence of wanting "./configure" to "just work" in opam file, given that that's the first step to building OCaml. So opam exec -- shell-script works as a consequence of that. However, opam doesn't give it to you for free - if your opam package installs a shell script on Windows, you get a big warning from opam install that that's not a good idea because Windows does not support shell scripts. If dune exec -- shell-script is to be supported, I would strongly recommend that it displays a warning on each and every use on Windows in the same way. The experience of using shell scripts is awkward and full of user surprise - you don't know which shell processor you're going to get, necessarily (are you going to pick up MSYS2's, Cygwin's?) and how paths are guaranteed to be treated. i.e. being able to execute shell scripts like this is an advanced thing to do, and one is not exposing a nice way to use one's build system with it. i.e. it will work for simple cases, but it will be very easy for users to end up puzzled, and they'll probably dune when they get puzzled!

The detail of how you do it is even more horrible - the code in opam responsible for this is considerable because Cygwin executables in particular behave differently when invoked from a native Windows executable than when invoked from another Cygwin executable (MSYS2 removes this distinction, but frankly that difference of behaviour makes this puzzle harder, not simpler!). Add to that if you "naively" search for bash, say, you run a strong risk of invoking WSL's bash and nothing's going to work!

The world has changed a bit since I wrote the code that's in opam and this would propose using it considerably outside its original intent. In particular, that code assumed that Cygwin would be the dominant interpreter in PATH (which opam guarantees as part of its process execution - it ensures Cygwin's bin directory is first). That's not necessarily the case for dune exec, so it'd be necessary to have a closer review/inspection of the search mechanism for the interpreter. The point here is that in order to translate #!/bin/sh, you need to know which environment that's coming from - i.e. you have to locate a Cygwin/MSYS2 installation and then translate /usr/bin within that root - i.e. it's not a PATH search on the Windows PATH. ocamlbuild IIRC used to (still does?) a crazy trick to invoke bash scripts which requires bizarre double-quotes to be appended and it's exceedingly unreliable with path quoting and filename conventions (this is why I spent a long time in 2015/2016 on the code in opam...).

So, in summary:

Both ocamlbuild (and my) experience of this is that seemingly simple solutions do not work for this! It took an indescribably long time in 2016 to get the earlier Windows port of opam to install findlib (which also has a configure script) without requiring any patching to the package!

nojb commented 2 years ago
  • Yes, I think it's probably sensible for dune exec -- to be able to interpret shebang lines for scripts on native Windows,

Thanks for the informative answer David! However, one word of clarification. As per what I understood of my discussion offline with @rgrinberg, here we are not proposing interpreting shebang lines, but simply using Sys.command to execute the argument of dune exec --. In the case of npm this would work not because we execute the shell script npm but because nodejs also includes in the same directory npm.cmd, a batch file that can be executed if we use Sys.command on Windows (as opposed to using Unix.create_process as today).

dra27 commented 2 years ago

Ah, I was possibly barking up the wrong bit of the wrong tree! However, further inspection of the code - opam exec in ocaml/opam does pretty much exactly as Dune... it's not using Sys.command. This does work with the opam supplied on Windows in opam-repository-mingw, which I guess is a patch (I don't know where's patches are), but that's not something I'd ever expect to see upstreamed (I think you end up in escaping hell way too easily using Sys.command here - I think it's good that opam exec and dune exec are both for resolving programs to run).

So my original answer can be slightly revised to say that it might be good for both opam exec and dune exec to support the shebang script methods (longer term) and (shorter term) to have them both able to execute batch files. The latter is a slightly simpler thing to implement than the full shebang (!!), and still doesn't need to involve Sys.command.

rgrinberg commented 2 years ago

This was discussed in the meeting today and the verdict is:

We should harmonize the behavior between opam and dune, but this more of a long term goal. There's no need to do this for 3.0.