math-comp / docker-mathcomp

Docker images of coq-mathcomp [maintainer=@erikmd]
https://hub.docker.com/r/mathcomp/mathcomp/#supported-tags
BSD 3-Clause "New" or "Revised" License
6 stars 2 forks source link

mathcomp-dev image failures due to opam version mismatch #19

Closed palmskog closed 2 years ago

palmskog commented 2 years ago

Since April 1, 2022, I'm seeing CI failures for a bunch of MathComp-related projects using the mathcomp-dev Docker image, for example here for RegLang.

The failure happens during recompilation of Coq, which I believe should not be happening in the first place. Specifically, we first see the following version mismatch between opam 2.0 and opam 2.1:

$ opam update -y
  This version of opam requires an update to the layout of /home/coq/.opam from version 2.0 to version 2.1, which can't be reverted.
  You may want to back it up before going further.

Then, we get the recompilation during a package install:

$ opam install -y -j 2 coq-reglang --deps-only
[...]
 The following actions will be performed:
    - recompile ocaml-variants         4.07.1+flambda [upstream or system changes]
[...]
    - recompile coq                    dev*           [uses ocaml]
[...]

Finally, Coq compilation fails due to lack of Dune(?):

Error:  The compilation of coq.dev failed at "./configure -prefix /home/coq/.opam/4.07.1+flambda -coqide no -native-compiler no".
[...]
Dune could not be found, please ensure you have a working OCaml enviroment

This is likely related to the base image upgrade to opam 2.1 (cc: @gares).

pi8027 commented 2 years ago

IIUC, @erikmd will fix this issue soon: https://github.com/coq-community/docker-base/pull/17#issuecomment-1084946566

erikmd commented 2 years ago

Hi @palmskog @pi8027, Yes, thanks, I'm working on a fix.

To summarize one of the issues that occur:

erikmd commented 2 years ago

Normally, this issue is fixed now. Feel free to reopen otherwise

erikmd commented 2 years ago

@palmskog FWIW, the cron worklfow that was part of docker-coq-action's CI had spotted this issue (albeit I hadn't seen the failure e-mail before you opened this :)

Anyway, I'll take this opportunity to add yet another test in that workflow.

pi8027 commented 2 years ago

I still see this issue here in the following images:

Recompilation of MathComp happens in some other images (e.g., mathcomp/mathcomp:1.12.0-coq-8.13) too, but for different reasons, it seems.

palmskog commented 2 years ago

@erikmd the problem seems to be with num:

+ (script @ line 12) $ opam install --confirm-level=unsafe-yes -j 2 coq-mathcomp-apery --deps-only

  <><> Synchronising pinned packages ><><><><><><><><><><><><><><><><><><><><><><>
  [coq-mathcomp-apery.dev] synchronised (no changes)

  The following actions will be performed:
    - install   seq                          base
    - install   conf-perl                    2
    - recompile num                          1.3*            [upstream or system changes]
    - install   re                           1.10.3
    - install   camlp5                       7.14
    - recompile sexplib                      v0.14.0         [uses num]
    - recompile coq                          8.13.2*         [uses num]

Is maybe some system package that num uses getting upgraded?

erikmd commented 2 years ago

Hi @pi8027 @palmskog, well spotted, thanks for the report.

So it seems the culprit is this commit https://github.com/ocaml/opam-repository/commit/03ab8c0bebc78578f20718b3ff4f6e7c1285912c

There's only one solution in this case: recompiling the images; I'll look after this tonight after my last meeting.

erikmd commented 2 years ago

@pi8027 @palmskog Rebuild done:

If a similar issue occurs again (e.g. if you experience a CI slowdown because a pinned package is recompiled (here, because of major changes in opam-repository packages; beforehand, because of an opam version mismatch)), feel free to open a new issue.