ocurrent / ocaml-ci

A CI for OCaml projects
https://ocaml.ci.dev
MIT License
112 stars 74 forks source link

CI is broken with non-trivial dependencies #381

Closed hannesm closed 1 year ago

hannesm commented 3 years ago

Hello,

thank you very much for the ocaml-ci. We just encountered an interesting corner case where it does not work: https://ci.ocamllabs.io/github/roburio/happy-eyeballs/commit/4a91542ac035eca1590932c9acb2434875a058f9/variant/alpine-3.13-4.13

The repository happy-eyeballs consists of 3 opam packages: happy-eyeballs, happy-eyeballs-lwt, happy-eyeballs-mirage. The happy-eyeballs package provides a pure implementation of the algorithm/RFC, which is used by dns-client (to first attempt DNS-over-TLS, then DNS-over-TCP - with all nameservers from /etc/resolv.conf). The happy-eyeballs-lwt and happy-eyeballs-mirage depend on dns-client for DNS resolution.

Now the CI fails with the error Error: Library "happy-eyeballs" not found. (while compiling dns-client), though opam reports that installed happy-eyeballs.dev in its output. A manual installation works fine as well (first happy-eyeballs, then dns-client, then happy-eyeballs-lwt/happy-eyeballs-mirage).

Thanks for reading, and maybe there's something deeper to look into how package installations are scheduled?

Best,

Hannes

dra27 commented 3 years ago

This is related to #257 and also nearly came up with odoc (if I remember the technical details correctly, @jonludlam!)

There is indeed something deeper to be done: the solves should be done separately for packages in a multi-package repo to determine which packages depend on other packages in the same repo. From that we'd determine for a given OCaml version two build stages (using happy-eyeballs as the example):

  1. Test happy-eyeballs alone
  2. Install happy-eyeballs and test happy-eyeballs-lwt and happy-eyeballs-mirage simultaneously

There are a few decisions to take here: if we solve each stage independently, we could in theory have one version of happy-eyeballs' dependencies in stage 1 and then another version in stage 2 (because happy-eyeballs-lwt and happy-eyeballs-mirage could introduce extra constraints). Or we can do what we presently do with the single solve for all three packages at simultaneously, and guarantee that the dependencies are always installed at the same version in a given run. I think that's more consistent and will work better for caching.

Then there's the question of which deps you install in stage 1. The mandatory ones (the actual dependencies of happy-eyeballs) are:

ENV DEPS_STAGE1_ACTUAL astring.0.8.5 base-bigarray.base  base-threads.base base-unix.base domain-name.0.3.0 dune.2.9.1 duration.0.2.0 fmt.0.8.10 ipaddr.5.2.0 logs.0.7.0 macaddr.5.2.0 ocaml.4.13.1 ocaml-base-compiler.4.13.1 ocaml-config.2 ocaml-options-vanilla.1 ocamlfind.1.9.1 ocamlbuild.0.14.0 stdlib-shims.0.3.0 topkg.1.0.4

It's worth installing at least the packages from later stages which are depopts of packages in earlier stages to avoid rebuilds and also to ensure that we really are testing against the same dependencies in each stage. That adds:

ENV DEPS_STAGE1_DEPOPTS cmdliner.1.0.4 lwt.5.4.2 base-bytes.base cppo.1.6.8 csexp.1.5.1 dune-configurator.2.9.1 mmap.1.1.0 ocplib-endian.1.1 result.1.5 seq.base

(cmdliner and lwt are the actual depopts). For caching, it's then worth installing all the dependencies from later stages which don't depend on earlier stages provided those dependencies themselves do not have the packages built in later stages as depopts (since that would trigger a rebuild in the later stage anyway, so it's a waste of effort). For happy-eyeballs, that gives:

ENV DEPS_STAGE1_FUTURE asn1-combinators.0.2.6 base.v0.14.1 base64.3.5.0 bigarray-compat.1.0.0 bos.0.2.1 ca-certs.0.2.1 ca-certs-nss.3.71.0.1 conf-gmp.3 conf-gmp-powm-sec.3 conf-pkg-config.2 cstruct.6.0.1 cstruct-sexp.6.0.1 dns.6.0.0 eqaf.0.8 fpath.0.7.3 gmap.0.3.0 hkdf.1.0.4 ipaddr-sexp.5.2.0 lru.0.3.0 metrics.0.3.0 mirage-clock.3.1.0 mirage-crypto.0.10.3 mirage-crypto-ec.0.10.3 mirage-crypto-pk.0.10.3 mirage-crypto-rng.0.10.3 mirage-device.2.0.0 mirage-flow.2.0.1 mirage-kv.3.0.1 mirage-no-solo5.1 mirage-no-xen.1 mirage-protocols.5.0.0 mirage-random.2.0.0 mirage-stack.2.2.0 mirage-time.2.0.1 mtime.1.2.0 num.1.4 ocaml-compiler-libs.v0.12.4 parsexp.v0.14.1 pbkdf.1.2.0 ppx_cstruct.6.0.1 ppx_derivers.1.2.1 ppx_sexp_conv.v0.14.3 ppxlib.0.23.0 psq.0.2.0 ptime.0.8.5 randomconv.0.1.3 rresult.0.7.0 sexplib.v0.14.0 sexplib0.v0.14.0 tls.0.15.0 tls-mirage.0.15.0 x509.0.15.0 zarith.1.12

The only downside (as noted in #257) is that imprecise opam files wouldn't then be caught - but we have opam-dune-lint to aid that. The dependencies for stage 2 are then very small:

ENV DEPS_STAGE2 dns-client.6.0.0 happy-eyeballs.dev

so we still have a good caching story, and I expect we would for most scenarios. This gives a final Dockerfile of:

FROM ocaml/opam@sha256:def8cca230f067db2ad7d32934b2df2d3e1546a5050a29f84fcb0a0852f48a52
USER 1000:1000
WORKDIR /src
RUN sudo chown opam /src
RUN cd ~/opam-repository && (git cat-file -e 071951a7f9a420ee45d953b57b5e540bf0bd82e9 || git fetch origin master) && git reset -q --hard 071951a7f9a420ee45d953b57b5e540bf0bd82e9 && git log --no-decorate -n1 --oneline && opam update -u
RUN mkdir -p './'
COPY --chown=1000:1000 happy-eyeballs.opam happy-eyeballs-mirage.opam happy-eyeballs-lwt.opam ./
RUN opam pin add -yn happy-eyeballs.dev './' && \
    opam pin add -yn happy-eyeballs-mirage.dev './' && \
    opam pin add -yn happy-eyeballs-lwt.dev './'
COPY --chown=1000:1000 . /src/
ENV DEPS_STAGE1_ACTUAL astring.0.8.5 base-bigarray.base  base-threads.base base-unix.base domain-name.0.3.0 dune.2.9.1 duration.0.2.0 fmt.0.8.10 ipaddr.5.2.0 logs.0.7.0 macaddr.5.2.0 ocaml.4.13.1 ocaml-base-compiler.4.13.1 ocaml-config.2 ocaml-options-vanilla.1 ocamlfind.1.9.1 ocamlbuild.0.14.0 stdlib-shims.0.3.0 topkg.1.0.4
ENV DEPS_STAGE1_DEPOPTS cmdliner.1.0.4 lwt.5.4.2 base-bytes.base cppo.1.6.8 csexp.1.5.1 dune-configurator.2.9.1 mmap.1.1.0 ocplib-endian.1.1 result.1.5 seq.base
ENV DEPS_STAGE1_FUTURE asn1-combinators.0.2.6 base.v0.14.1 base64.3.5.0 bigarray-compat.1.0.0 bos.0.2.1 ca-certs.0.2.1 ca-certs-nss.3.71.0.1 conf-gmp.3 conf-gmp-powm-sec.3 conf-pkg-config.2 cstruct.6.0.1 cstruct-sexp.6.0.1 dns.6.0.0 eqaf.0.8 fpath.0.7.3 gmap.0.3.0 hkdf.1.0.4 ipaddr-sexp.5.2.0 lru.0.3.0 metrics.0.3.0 mirage-clock.3.1.0 mirage-crypto.0.10.3 mirage-crypto-ec.0.10.3 mirage-crypto-pk.0.10.3 mirage-crypto-rng.0.10.3 mirage-device.2.0.0 mirage-flow.2.0.1 mirage-kv.3.0.1 mirage-no-solo5.1 mirage-no-xen.1 mirage-protocols.5.0.0 mirage-random.2.0.0 mirage-stack.2.2.0 mirage-time.2.0.1 mtime.1.2.0 num.1.4 ocaml-compiler-libs.v0.12.4 parsexp.v0.14.1 pbkdf.1.2.0 ppx_cstruct.6.0.1 ppx_derivers.1.2.1 ppx_sexp_conv.v0.14.3 ppxlib.0.23.0 psq.0.2.0 ptime.0.8.5 randomconv.0.1.3 rresult.0.7.0 sexplib.v0.14.0 sexplib0.v0.14.0 tls.0.15.0 tls-mirage.0.15.0 x509.0.15.0 zarith.1.12
ENV DEPS_STAGE1 $DEPS_STAGE1_ACTUAL $DEPS_STAGE1_DEPOPTS $DEPS_STAGE1_FUTURE
ENV DEPS_STAGE2 dns-client.6.0.0 happy-eyeballs.dev
RUN opam depext --update -y happy-eyeballs.dev happy-eyeballs-mirage.dev happy-eyeballs-lwt.dev $DEPS_STAGE1 $DEPS_STAGE2
RUN opam install $DEPS_STAGE1
RUN opam exec -- dune build --only-packages=happy-eyeballs @install @check @runtest && rm -rf _build
RUN opam install $DEPS_STAGE2
RUN opam exec -- dune build --only-packages=happy-eyeballs-lwt,happy-eyeballs-mirage @install @check @runtest && rm -rf _build

which seems to be working for me. There might be an optimisation which can be done to do a dune install of happy-eyeballs, but it's probably a good aspect of CI to test the opam installation as well in $DEPS_STAGE2.

hannesm commented 2 years ago

I ran into this issue again at https://github.com/mirage/metrics/pull/53 -- are there any plans to implement a fix, like the one suggested by @dra27?

hannesm commented 2 years ago

Hi again, I'm still curious about this issue - and was wondering whether fixing this issue is on the roadmap? It looks like @dra27 has a good plan about how to fix this.

maiste commented 2 years ago

Hi, with @art-w we have been working on a fix for this issue. We have a fix that works, but it raises two issues:

  1. When building happyeyeballs-lwt, it will try to build the dns-client version from opam using the local version of happyeyeballs. Any broken changes in the API will break the CI as the code is not the same as required. The dns-client version can be pinned in the happyeyeballs repository to solve this problem.
  2. It will break repositories with multiple opam files when there are some private dune executables e.g.
    (executable
    (name example)
    (libraries mylib-impl))

    Where mylib-impl depends on a local package mylib. It fails on dune build --only-package=mylib @check @install @runtest because dune tries to build the executables at the wrong step. A way to solve it is to write the dune file as follows:

    (executables
    (public_names -)
    (names example)
    (libraries mylib-impl)
    (package mylib-impl))

    (The public_names - marks the executables as Do not install, but it only works with the plural form.)

You can have a look at the fix here.

WDYT?

dra27 commented 2 years ago

Thanks for looking at this! I think both of those issues highlighted are reasonably things the developer should have to work around. Using happyeyeballs simply as the example:

  1. The dependency of happyeyeballs-lwt on dns-client is the responsibility of the happyeyeballs repo, so I think it's reasonable that if a PR to the happyeyeballs repo breaks the API of happyeyeballs then the opam files for happyeyeballs-lwt should have to use a pin-depends for a fixing version of dns-client, as you suggest (this kind of "friend PR" is also quite common in MirageOS development; see also #385).
  2. This similarly feels like something where the repo itself would be expected to restructure to deal with it. I'm guessing that at present either happyeyeballs is worked on with dns-client in the same dune workspace, or it's already accepted that happyeyeballs and happyeyeballs-lwt cannot be simultaneously built (i.e. dune build @all doesn't work in the repo without vendoring dns-client as well?
maiste commented 2 years ago

Thanks for the quick reply! I agree with you.

  1. Indeed, this one seems to be a reasonable trade. As it depends on the repo construction, it should be the responsibility of the developers to handle this case and not the CI.
  2. Agree with that too! In this case, it is not only related to happyeyeballs but also to other packages. It's linked with this dune issue open by @talex5 which makes it hard to build examples without having the same issue in the CI. In the case of happyeyeballs, I image dune build @all work if you pin the right dependencies or vendors dns-client (as you suggested).
dra27 commented 2 years ago

Another corner case has cropped up for this as well - the (lint-doc) job fails if it's testing one of odoc's dependencies (e.g. now that odoc depends on camlp-streams, that job is routinely failing).

hannesm commented 1 year ago

Closing, I found another CI system where it just worked. Also, quoting from the PR "So perhaps we should regard the CI rejecting this as a feature?"