Add Tactician to the platform

LasseBlaauwbroek commented 4 months ago

Very experimental. I modified build.sh to do an ad-hoc Tactician instrumentation. This will of course have to be rewritten.

Status:

Ubuntu: Working
MacOS: Working
Snap: Failing to install Z3 for some reason
Windows: Issues with \r\n vs \n.

To reproduce the windows issue easily:

echo "hello
world" > test.in
opam config subst test
file test.in
file test

The original file shows up normal, but the substituted has \r\n newlines...

LasseBlaauwbroek commented 4 months ago

Mhm, apparently replacing \n with \r\n is a feature of Ocaml in Cygwin: https://discuss.ocaml.org/t/in-cygwin-ocaml-outputs-crlf-line-endings-but-inputs-only-lfs-leading-to-end-of-file/8014

I don't get why OCaml does this if it is a bad thing on Cygwin. Confusing...

MSoegtropIMC commented 3 months ago

I don't get why OCaml does this if it is a bad thing on Cygwin. Confusing...

Neither OCaml nor Cygwin does this, the Windows file API does this when a file is opened in text mode (standard mode). This was intended to make C code compatible to the convention of most DOS editors that lines end with CR-LF in the 80s. This is so ubiquitous on Windows that you have to live with it.

As I said, the most effective way to handle this is to pipe data which you interpret with tools which cannot handle this through a tr -d '\r'.

LasseBlaauwbroek commented 3 months ago

Okay, that is quite unfortunate, because it then means having triple level dune rules with intermediate stages for such files.

Isn't this also still an Opam bug? I'd expect the contract with Opam to be that opam config subst echos the file back verbatim, only substituting variables. I feel that a bug report should be filed somewhere. Just not quite sure where.

MSoegtropIMC commented 3 months ago

I am not aware of any issues with this beyond e.g. saving a path from coqc -where in a shell variable. On WIndows every native program strips carriage returns on reading text files and puts in carriage return on writing text files. This does not apply to cygwin programs (bash, sed, grep) because cygwin is posix. But it does apply to dune, opam, ocamlc, coqc, you name it. The insertion and removal of carriage returns should be transparent to programs. If you are having massive issues with this, you are probably doing something odd like opening a text file in binary mode.

LasseBlaauwbroek commented 3 months ago

The insertion and removal of carriage returns should be transparent to programs.

In this case, it definitely is not. All I do is have a source-file tactician-patch.in the source distribution of Tactician. When Opam downloads it, it will automatically substitute the variables in that file, as is described here: https://opam.ocaml.org/doc/Manual.html#Interpolation This operation is specified in the Opam file like here: https://github.com/coq-tactician/coq-tactician/blob/6b9f3c9f599abe2bb3278c674aab8ca5d0eb44d0/coq-tactician.opam.template#L16-L19 It feels pretty unreasonable to me that Opam will just mangle my source files by changing the line endings. If this was considered reasonable then why doesn't Opam introduce carriage returns into every source files after it downloads them?

MSoegtropIMC commented 3 months ago

I would say on Windows this is expected behaviour. Opam reads the file and writes it and on writing it inserts the carriage returns. This is simply what the file APIs do on WIndows without any special measures. The default settings for many Windows git distributions is similar - insert CR when downloading files and remove CRs when uploading files.

The main sources are usually retrieved as a tar ball and tar ball unpacking is considered a binary operation and doesn't do this - possibly it even uses cygwin tar.

The main question to me is: why do you bother that these files get CRs? For ML files it shouldn't matter. If you have a patch file it of course does matter - you likely end up using cygwin patch which wouldn't like CRs in the patch files.

It should be easy to fix this by piping the two files through tr -d '\r' as a build step.

LasseBlaauwbroek commented 3 months ago

The main question to me is: why do you bother that these files get CRs? For ML files it shouldn't matter.

It's a bash file, and bash can apparently not accept \r\n newlines.

I've now added an extra build step to remove the carriage returns. This seems to work: https://github.com/coq-tactician/coq-tactician/commit/3ed622bd866552d6e1fc162ef8bce2e9f8295d5e

Unfortunately, now I'm running into other issues where I'm trying to execute external commands from OCaml, many of which cannot be resolved (but I'm sure they exist). That includes executing bash, autoconf, ./configure and more. Again very confusing.

MSoegtropIMC commented 3 months ago

It's a bash file, and bash can apparently not accept \r\n newlines.

It depends - the bash we are using here is a Posix bash. If one would compile a shell natively on Windows. it would probably accept this.

Unfortunately, now I'm running into other issues where I'm trying to execute external commands from OCaml, many of which cannot be resolved (but I'm sure they exist). That includes executing bash, autoconf, ./configure and more. Again very confusing.

Think of the Windows build as a cross compilation - it is similar to compiling a Windows program on Linux. Cygwin is in the end a separate light weight operating system. We do it this way because it is the easiest way - getting autonconf stuff running natively on Windows is very tricky. The vast majority of packes has 0 porting effort to Windows - yours unfortunately not.

Having said this: the OCaml infrastructure (Opam, ocamlc) is native Windows - bash, configure scripts ... run on cygwin. It is not a common use case in bulding opam packages the created OCaml programs call back to bash. There are packages which use OCaml programs for configuration, but these then do not call back into bash - the configuration is usually either configure script based or OCaml configure program based but not a mix of it. I need some more information to give you a good advice how to handle this. Basically I need to understand what you want to achieve,

It is btw. the plan of the OCaml / Opam / Dune community to build without a posix environment.

LasseBlaauwbroek commented 3 months ago

The current problem I'm having is with Tactician's instrumentation of packages as they are being installed through Opam. At this point, the Tactician package itself has already been installed (and is mostly working fine).

Tactician has a command tactician exec -- ... that works quite similar to dune exec -- .... As an example, when you run a command like tactician exec -- coqc Whatever.v, the file is compiled with Tactician instrumentation. When you call tactician exec -- dune build, it will build your entire project with Tactician instrumentation. I configure Opam to prefix any installation command with tactician exec --. This all works mostly okay. Except that eventually tactician exec -- my-command has to call the command my-command. And half the time it cannot find that command. (But the command is known to exist and work because the Opam packages compile fine without Tactician's instrumentation.)

I'm calling these commands through standard means, which should be portable as far as I know: https://github.com/coq-tactician/coq-tactician/blob/09b2b07ae8f345172f99eb1cacc0dceaeeb14a61/coq-shim/tactician.ml.in#L386

LasseBlaauwbroek commented 3 months ago

These logs show what the failure looks like: https://github.com/coq/platform/actions/runs/10082830905/job/27877987730#step:5:2252

(Line 2252)

MSoegtropIMC commented 3 months ago

I see. I expected that we create a separate opam switch for tactician and everything built in this switch has tactician instrumentation. My expectation was that you compile a patched coqc and everything using this coqc has instrumentation then.

What does the tactician exec change?

LasseBlaauwbroek commented 3 months ago

It depends on the platform. On Linux, it uses bubblewrap (bwrap) to create a virtual filesystem that replaces coqc and friends with a small shim that automatically adds the required flags that load Tactician into Coq.

On Mac and Windows we don't have bubblewrap, so I have a 'poor mans alternative' that modifies the PATH environment to achieve something similar.

MSoegtropIMC commented 3 months ago

These logs show what the failure looks like: https://github.com/coq/platform/actions/runs/10082830905/job/27877987730#step:5:2252

(Line 2252)

For a Windows native application a shell script is not an executable. You need to start a shell to execute a shell script. On Linix / MacOS you can execute a shell script with system process functions - on Windows not.

A long while back I implemented a more powerful system command for OCaml to overcome this (it did support a reasonable fraction of posix shell and #! analysis, but people didn't see the need for it.

Opam must have some way to handle this - possibly using a library. Tactician should use the same mechanism - or maybe we should look for a simpler solution like a separate switch.

MSoegtropIMC commented 3 months ago

It depends on the platform. On Linux, it uses bubblewrap (bwrap) to create a virtual filesystem that replaces coqc and friends with a small shim that automatically adds the required flags that load Tactician into Coq.

On Mac and Windows we don't have bubblewrap, so I have a 'poor mans alternative' that modifies the PATH environment to achieve something similar.

IMHO this is much too complicated. For practical reasons one would anyway create a separate opam switch for tactician, so there is no need to be able to switch this around dynamically. Opam already has built in support for virtual environments (switches). You should really use this.

MSoegtropIMC commented 3 months ago

This applies to MacOS and Linux as well - I would prefer a separate switch with a modified coqc which always compiles for tactician.

LasseBlaauwbroek commented 3 months ago

For a Windows native application a shell script is not an executable.

So even bash is not an executable? The logs show:

# Windows Subsystem for Linux has no installed distributions.
# Distributions can be installed by visiting the Microsoft Store:
# https://aka.ms/wslstore

So apparently, bash is not even looked for inside the cygwin environment. Even though the cygwin executables are in the PATH...

IMHO this is much too complicated. For practical reasons one would anyway create a separate opam switch for tactician, so there is no need to be able to switch this around dynamically.

I don't want to patch Coq. It is bad form. I used to do this, and people didn't want to use it. (Citing a bunch of reasons, including concerns around trustability.) In practice, you would indeed use a separate switch, and there is nothing stopping anyone from doing this.

MSoegtropIMC commented 3 months ago

Honestly having a switch where coqc is replaced with a shim is much more trustworthy than such a substantial change in the build and package managers. I think we should discuss this on Zulip. With this mecnanism I would object against putting this into Coq Platform because there are so many ways an installation can get messed up that I see substantial maintenance effort coming from this.

regarding bash: it looks like you have some WSL2 shim for bash in the path before cygwin.

LasseBlaauwbroek commented 3 months ago

I can certainly imagine objecting to wanting to put this in the platform. That being said, I don't think it should be because of the build modifications. The mechanism I'm using is fully supported by Opam. Especially on Linux when using bwrap. There are many other somewhat evil hacks in the Tactician plugin that are much riskier...

In any case, I'm afraid that I'm currently not able to commit to massively modifying the injection mechanism. A lot of effort has been put into engineering this to be a nice workflow (at least on Linux and Mac)...

MSoegtropIMC commented 3 months ago

I still don't understand why you did this. What is the problem with a separate opam switch with fixed coqc & Co shims? This would be both technically / implementation wise and from a usability point of view easer - as far as I can see from my helicopter perspective.

LasseBlaauwbroek commented 3 months ago

The design space for the instrumentation mechanism is pretty big, I've had people propose quite a few alternatives to me. I'm not rigidly attached to the current approach, but I'm also not inclined to immediately change things because it is convenient for the platform. Viewing things from a Coq Platform perspective, where everything is fixed and pre-installed, I can see that a permanent shim is attractive. That is, however not how I usually interact with the Coq ecosystem. My switches are fluid and always changing.

As a note, I feel that the difference between the permanent shim you propose and the temporary shim that is actually implemented through tactician exec is not massively big. The basic idea is similar. Some reasons why I implemented tactician exec:

It allows you to dynamically change the machine learning model or its parameters. As in tactician exec --modelB -- coqc Whatever.v. (Not currently implemented in the main branch, but used by Graph2Tac.)
Patching Coq goes against all packaging guidelines. There are two options: (a) When installing Tactician, replace the coqc binary with something else. This is bad because Tactician does not own coqc. Opam has a sanbox, and although I believe it is currently possible to overwrite files you don't own, this could (and perhaps should) easily be restricted in the future. (b) Create a secondary coq package, that is already patched. Possible, but it means polluting the Opam repo with all kinds of alternative coq versions.
Personally, I just find it rather convenient to turn Tactician on and off at will. I use it all the time, especially during development. We also use this for benchmarking purposes.
Many people are quite violently against modifications of Coq. Whether justified or not, this is a real concern.

In the past, @ejgallego has also suggested that Tactician should use Dunes instrumentation facilities. That is also an interesting approach, but will only become viable when all/most Coq packages are built using Dune.

MSoegtropIMC commented 3 months ago

Thanks for the detailed explanations! One more question: what is it that the coqc shim changes for tactician? Does is just set options of a standard coqc?

An alternative would be to have a coq tactician option opam package (only one for all versions - similar to native compute) and do something special in the normal coqc opam packages in case this option package is installed.

LasseBlaauwbroek commented 3 months ago

what is it that the coqc shim changes for tactician? Does is just set options of a standard coqc?

Yes.

An alternative would be [...]

Yes, that is something that could be considered. Would require quite a bit of buy-in from the ecosystem though.

I've now modified tactician exec to use Opam's logic to resolve commands and execute processes. I looked in their source code, and they indeed have a bunch of special logic for Cygwin. Hopefully things will work better now. CI running.

coq / platform

Add Tactician to the platform #423