ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.61k stars 401 forks source link

pkg: fails to build from source #10957

Open avsm opened 1 week ago

avsm commented 1 week ago

On both a Mac and Linux, this sequence of commands fails:

$ opam switch create ns 5.2.0
$ opam pin add dune --dev
$ eval $(opam env)
$ git clone https://github.com/moby/vpnkit
$ cd vpnkit
$ dune pkg lock
<works>
$ dune build
File "dune.lock/ocaml.pkg", line 7, characters 3-8:
7 |    ocaml
       ^^^^^
Error: Logs for package ocaml
../../../../../../../_private/default/.pkg/ocaml-base-compiler/target/bin/ocaml: line 2: /Users/avsm/src/git/moby/vpnkit/_build/.sandbox/68233e7c5570bb1ffbe3310e228f8edc/_private/default/.pkg/ocaml-base-compiler/target/bin/ocamlrun: No such file or directory
../../../../../../../_private/default/.pkg/ocaml-base-compiler/target/bin/ocaml: line 2: exec: /Users/avsm/src/git/moby/vpnkit/_build/.sandbox/68233e7c5570bb1ffbe3310e228f8edc/_private/default/.pkg/ocaml-base-compiler/target/bin/ocamlrun: cannot execute: No such file or directory

What's going on here? I can't use the binary distribution, so what do I need to build to get a dune with the new package management features enabled? And why aren't they enabled in the mainline dune?

leostera commented 1 week ago

Hi @avsm! 👋🏼 We'll be updating the website with source-install instructions soon.

In the meantime you can enable the follow feature flags before you call dune:

export DUNE_CONFIG__TOOLCHAINS=enabled
export DUNE_CONFIG__PKG_BUILD_PROGRESS=enabled
export DUNE_CONFIG__LOCK_DEV_TOOL=enabled

These flags are enabled at compile-time during the binary distribution build, but can also be set at runtime. They are not enabled by default because they change some behaviors (for ex. dune fmt behaves differently).

Or if you use Nix you can call nix build <os> (see: publishing workflow).

Leonidas-from-XIV commented 1 week ago

What's going on here?

This is because the OCaml compiler is not relocatable so between the compiler being built and getting installed it gets moved which breaks it. ocaml has a shebang and the path points to the wrong directory.

I can't use the binary distribution, so what do I need to build to get a dune with the new package management features enabled?

You can set DUNE_CONFIG__TOOLCHAINS=enabled which enables a specific fallback for the compiler where instead of _build it will get installed in $XDG_CACHE_ROOT/dune/toolchains/....

And why aren't they enabled in the mainline dune?

It's because it breaks the model of Dune to create build artifacts in _build. As the relocation patches are hopefully close to completion @gridbugs implemented a workaround for the compiler specifically. But in general we don't want Dune to have special-cases for packages that it needs to build, hence it is an opt-in feature that we hope to be able to drop as the relocation patches roll out to releases.

avsm commented 1 week ago

The relocation patches will take some time to land (cc @dra27), but don't we also need to support reasonably recent previous non-relocatable compilers? Dune has many such special cases for OCaml support, so activating this by default seems reasonable with a rule (@rgrinberg?).

That's one workaround. What are the other two environment variables for, and why aren't those activated by default?

rgrinberg commented 1 week ago

I think it's fine to enable this by default even if it does some behavior that users aren't used to.

I would personally not care if dune was limited to newer versions of the compiler for package management if it allowed us to get rid of this hack.

Leonidas-from-XIV commented 1 week ago

don't we also need to support reasonably recent previous non-relocatable compilers?

From what I was told the plan was to backport the patches to older compilers, so with a few patch releases this would be a possibility (and last time I looked at them @dra27 had a fairly remarkable infrastructure to patch a lot of current and older OCaml versions so it seemed fairly realistic).

Leonidas-from-XIV commented 1 week ago

That's one workaround. What are the other two environment variables for, and why aren't those activated by default?

DUNE_CONFIG__PKG_BUILD_PROGRESS=enabled displays progress when downloading and building OPAM packages which changes the behavior of the default output; DUNE_CONFIG__LOCK_DEV_TOOL=enabled enables automatic locking and installation of ocamlformat I believe (cc @moyodiallo).

avsm commented 6 days ago

Thanks @Leonidas-from-XIV. Why can't both of those be enabled by default?

maiste commented 6 days ago

They are not able by default because they are specific to package management, which is not a stable feature. We are still iterating over these features. We enable them by default using the compile time flags on https://dune.ci.dev (the dune preview) as one of the goals of package management is to be able to use dune without the need for opam.

samoht commented 6 days ago

How do we expect people building from sources to keep track of these feature flags?

Would it be possible to have a branch with all these flags enabled by default or a single DUNE_CONFIG__DEV_PREVIEW flag to allow all the preview ones in one go?

Also, while it's definitely super important to be able to iterate on some unstable features, I think it would also be useful to merge the noncontroversial improvements (like the PKG_BUILD_PROGRESS one) directly to minimise divergence from the main branch. For the other ones, it's also probably fine to turn them on in the main branch and call these unstable features. Dune already has a bunch of them.

maiste commented 6 days ago

The flags and environment variables are described on the https://dune.ci.dev website in the FAQ section. Maybe, having a distinct section outside the FAQ could make it clearer. @leostera, WDYT?

I don't have any opinion on this. It would mean that people have to know that they should use this branch. Also, the branch should be rebased regularly to keep track of the other changes. It could be done automatically. For the flag, I don't know if it would be doable in the compile time flag context. @rgrinberg, do you think there would be a way to do this?

There is no divergence, they are actually merged in main. The compile flags protect the behaviour of happening if you don't explicitly ask for it, but they are available on the main branch. As @leostera mentioned, we can't activate them by default because they change the behaviour of some commands (dune fmt). That is why we have the flags and environment variables: it avoids users of dune through opam to be "tricked by" unstable features they don't expect to happen.

avsm commented 6 days ago

Thanks for all the answers above. Regarding:

The flags and environment variables are described on the https://dune.ci.dev/ website in the FAQ section

I find a description of the configure flag on that site, but I've not heard an answer on why we need these, and why they're not mainline in dune. Dune already has a well established versioning mechanism for rules:

There now appears to be a third versioning mechanism, and I don't understand what's diverging here. Is it that these features activate some new functionality in the dune.exe client that break existing workflows if they're active? What breaks if I activate DUNE_CONFIG__LOCK_DEV_TOOL? If nothing, then why not enable it by default? Similarly, what breaks in the output mode if DUNE_CONFIG__PKG_BUILD_PROGRESS is activated? Surely it only makes a difference if package management is activated.

It sounds to me like the 'experiment' here with package management is to activate a new mode in the client. This could also be exposed as:

This wouldn't break anyone's existing workflow, it would avoid having forking dune. The problem with the current scheme is that it forces people to use a particular binary build of dune, which is particularly problematic for OS packagers. They have to decide whether or not to activate this experimental feature, and they will all take difference decisions (Homebrew vs Debian packaging policies), and we'll diverge. If instead Dune remains a unified binary experience with command-line flags and the above logic to activate or deactivate these features, life remains happy for all parties concerned.

I am very excited to try these features out properly. @dra27 and I were reminiscing that we talked about "Bob the builder" seven years ago now in ICFP 2017, so I'm hugely appreciative of all the hard work that's gone into pulling this off. But I don't want us to stumble at the last few steps of the release and rollout, which is my concern now.

(An ergonomic aside: why have we invented a new two-letter verb for the lockfile? go get && go build is very memorable -- why not turn dune pkg lock into dune get and simplify things for adopters from other languages?)

maiste commented 6 days ago

I'll try to address most of your questions, but I encourage other people involved in the development to complete it if I'm not precise enough or miss some information :smile:

The config flags are new in Dune (see PR10843), that's why they are not mainline. The (lang dune ...) exists to make sure you use the proper semantics in the dune language (at least corresponding to your version). On the other hand, the using keyword is here to add functionalities related to the language, not to dune itself (support mdx, etc). The package management brings some important changes to the way Dune interact with the rest of the tooling. Contrary to the previous mechanisms, we need to have a mechanism to make sure Dune is able to handle two different behaviours in the context of unstable features. We can't use the lang stanza because it wouldn't support "two options" and we can't use using because it is supposed to be use for an extension of the language, not for a different behaviour of dune.

Before anything else, it's important to note that most of the behaviours of package management is already made under the assumption that we detect dune.lock. Indeed, we already take into account the fact that if the user is running dune pkg lock they accept to use package management willingly. That being said, the compile flags are here to test unstable features (actually only features related to dune package management, but it could be any other unstable feature). If users wants to use them, they have to know it's at their own risks and that the behaviour will change in the future. One of the biggest strength of dune is its stability, this is not something we should break easily: hence the config time flags. Once the feature is stabilized, the config flag associated will disappear.

To address your three points, this is the same reasoning:

This is where there is a big misunderstanding: the features protected by a config time flag are unstable, they are not meant to be used apart in the context of a preview. We don't want to ship features that may be obsolete in a week because we see beta users are not liking the way it works. Also, in the current context, people can still package dune if they want, but the part protected with the flags are unstable. We don't expect the entire community to move away from opam the next week. Dune should keep working the same way in the context of opam, and we need to be able, as dune developers, to add and test features. The config time flags are a good compromise in this context.

If you have a version of dune with the config time flags not activated, you can still get them with environment variables. It's easier to remove than a flag people have been used to write.

About the new verb, it's the same reason as above: when have our dune pkg ... system where we can "break things". If we see users using a particular command a lot that is stabilized, it can move to top-level. But it will be driven by developer requirements and stability :+1:

To summarize, config flags are here to protect unstable features will we improve them until they make their way to mainstream dune (without the flags). Dune package management is intended to be used without opam, but it will need some time to be stable (using community feedback). That's also why we advertise it in a testing / preview way, not a stable one :smile_cat: We need to give time to this functionality to mature! :+1:

avsm commented 6 days ago

Thanks @maiste, I could use a few more clarifications. The dune.ci.dev cite implies that these are configure-time flags:

--enable-toolchains --enable-pkg-build-progress --enable-lock-dev-tool

The above thread implies they are runtime flags (via the environment variables) and will work with the same dune binary. Which is it?

Secondly, my specific proposal above is compatible with experimental features:

  • An explicit dune build --pkg and dune build --no-pkg flag which forces the activation and deactivation.
  • An implicit dune build which uses package management if the lockfiles are present (since the invocation of dune pkg lock indicates the user is willing to optin.

Even when the feature is stable, it's essential to have a way to turn it off for determinism and network-isolated environments or CI systems. So having explicit --pkg and --no-pkg flags are forwards-compatible ways of enabling or disabling this functionality.

I don't understand your third point about this flag interfering with the idea of package management. Might you have a concrete example so I can understand this better?

Leonidas-from-XIV commented 6 days ago

They are both. The configure time flags set the default to enabled (from their default of disabled). The environment variables can then be used to enable or disable these at run time. The only thing the developer preview does that differs from the default build on main is that it swaps the default value.

So you can set them either at build time, or if you didn't, at run time. Conversely, you can also set DUNE_CONFIG__<var>=disabled on the dev-preview to make it work like a binary built directly from main.

maiste commented 6 days ago

About your last question, my point was about distributing Dune Package Management through an opam package. If we want this dune version to be standalone, there would be a contradiction to use opam to install it.

samoht commented 6 days ago

Is there a way for an user to know if these feature flags are enabled?

Reading this whole thread quickly, I think the main concern is "forward-looking", i.e. how we will get out of that "Developer Preview" stage without confusing anyone. Suppose distros start to package and build the preview with different configuration options (and they'll probably do it shortly given the very positive feedback early users have given). In that case, it'll become very hard to offer a unified DX on all platforms, and debugging bug reports will become pretty difficult.

So the simpler we can configure those builds (by re-using existing mechanisms when possible, by allowing users to inspect and report their configuration, etc.), the easier it will be to triage bugs and release something in a couple of months.

Leonidas-from-XIV commented 6 days ago

So having explicit --pkg and --no-pkg flags are forwards-compatible ways of enabling or disabling this functionality.

One of the reasons why we didn't want to do this is to focus on doing the right thing by default, no options necessary. If you have a lock directory that means that you want package management (there are some minor interactions with installation, so probably dune build -p will disable package management when building in OPAM).

If you want to avoid package management in some configurations this can easily be done with rm -r dune.lock. Or we can add some options to the dune config, dune-workspace or so to disable it for people that absolutely don't want to use it despite the projects they work on having lock directories.

Suppose distros start to package and build the preview with different configuration options

I don't think distributions will be packaging any non-default options. It's always the conflict between distributors and maintainers, but if distributions mess up the code it should be the distributions that have to deal with the mess. As such I would really implore distribution maintainers to reach out the the projects before setting experimental flags. We can probably change the descriptions to say something like EXPERIMENTAL: ENABLE AT OWN RISK.

maiste commented 6 days ago

Currently, there isn't a way. However, @gridbugs made a proposition about having the flags enabled describe in an enhanced dune version (for example --verbose or something like this). And with this it would be definitely easier to debug, I completely agree!

About the distribution, I don't expect package maintainers to ship the main branch version of dune (apart from people in the AUR, but that's the concept :smile: ). They would ship dune in stable version (3.17 being the next). Thus, they would have access to all the stable features we have configured without the flags. Until that, we have the opportunity to provide thought the Dune Developer Preview some features and "battle test" them. When they would be considered as stable they would make it to the regular distribution flow (deb, rpm, etc) using the regular release cycle.

I agree that having tools to be able to debug behaviours will save us time at bug triages.

leostera commented 6 days ago

@samoht: how we will get out of that "Developer Preview" stage without confusing anyone

There's 3 things here:

  1. The developer preview of dune
  2. The stable opam release of dune
  3. The stable standalone release of dune

The Developer Preview is a nightly distribution channel. It won't go away, but some features will be promoted into the stable Dune release over time. For example, we're in talks with the Odoc team to ship the new Odoc 3 enabled by default. That's not stable, but it will be with time, so then we'll promote it over to the stable release of dune.

So this distribution channel has a few goals:

The Stable OPAM Release of Dune will continue to be what it is today, but some unstable features that are compatible with the opam-world will be enabled over time.

The Stable Standalone Release of Dune is the next step. This will be a single binary that can handle all the workflows as intended by the OCaml Roadmap, minus the unstable features. This is the release that distro packages should be looking out for.

We'll be looking into this release in Q4.

Example of stabilization for a new dune create command

We'd start by carving room in Dune for this new command, having it disabled by default behind a --enable-templating flag, and prototype it.

It'll make its way to the Developer Preview, so it'll be only enabled there.

We'll show it to the community, ask people to try it, iterate on it. Wait for feedback, bugs.

Once we're confident this is a good solution, we'll stabilize it. This may include refactors, more tests, etc.

Then it'll make its way to the stable releases, unless it relies on Dune package management, in which case it'll only be available for the standalone Dune.


That's how we see this working.

avsm commented 6 days ago

One of the reasons why we didn't want to do this is to focus on doing the right thing by default, no options necessary.

The problem with doing the right thing by default is that you always select the wrong thing in some cases. With docker, for example, we follow the scheme I described above. The client has some default choice (which is set arbitrarily -- it can be defaulted differently, overridden by config choices, etc), and there is always a --option and --no-option to set it explicitly.

So in CI scripts and similar, an explicit choice can be made so that the behavior never changes. And by default, the user gets a good experience if they type in docker run by default. The same also holds with most other CLI tool: see, for example, the cargo configuration scheme where the levels of overrides are described.

Not having an explicit behavior option is a huge problem in practice, as automation scripts in particular are not forwards-compatible.

I still don't understand why we have 3 feature flags... :-)

maiste commented 6 days ago

It doesn't mean there won't be a flag to activate deactivate the feature if necessary. It's still the case in Dune (see the PR about the cache where we provide a different default, but it can always be changed). You are confusing two distinct things here: flags and config flags. The first one is to specify and change the behaviour based on some options in the normal Dune behaviour, the second one is to protect regular Dune users from unstable features. These unstable features come with their own flags and configurations.

So, here there is no question of not having an explicit behaviour in the context of the stable distribution of Dune. There will still be a default and a configuration for the none default, as it has always been in dune and has it is in the rest of the world.

We have three config time flags for the three features we are currently testing in the Dune Developer Preview:

At some point, we will remove some of them because we consider the feature to be mature enough to not be considered as unstable. In this case, the feature will belong to Dune stable. We might add more to protect more features. For example, dune create as mentioned by @leostera will have its own flag (e.g --enable-templating) and this flag will exist until the feature is removed (because it is considered to not be good for stable) or included into stable. They are not meant for stable users.

Leonidas-from-XIV commented 6 days ago

The problem with doing the right thing by default is that you always select the wrong thing in some cases.

Sure, but in this case you can turn it off, just like you can turn formatting off despite it being enabled by default since 2.0 or so. The risk of breakage is fairly low, as the project would need to ship with a lock directory for dune to "accidentally" enable package management. Even in these unexpected cases then dune would build the specified dependencies, which will leave users with a successful build, so from a user perspective the upgrade to use dune package management transparently "just work".

In the rare case where there is a CI script that can't access the network, that script needs to be adapted, or stay with an old Dune version. Forcing every human user to always add opt-in flags to do the right thing just because of a handful of existing scripts might break when using a new version of Dune is putting the needs of a number of CI scripts over the very tangible and near-universally welcomed UX improvements for normal developers.

samoht commented 6 days ago

There's 3 things here:

1.The developer preview of dune 2.The stable opam release of dune 3.The stable standalone release of dune

Thanks @leostera, this is much clearer now. I suggest we start by explaining this on https://dune.build and then design the distribution mechanism around these 3 channels (instead of constantly changing feature flags). If you look at how Docker Desktop does it for instance: they distribute the same binary but have a very simple and deterministic way (not using environment variable!) to change the distribution channel (and for something similar to 1. to enable/disable specific features). In any case, designing those channels should be a conscious feature development - proper design and technical discussion on how all of this is supposed to work over time has to happen in the open.

Also, I would suggest we do not put 1. on the front page of ocaml.org as the recommended way to use OCaml if the aim is to roll out 3 shortly (which seems a better candidate for being the "front-end of the OCaml Platform").

maiste commented 6 days ago

Do you have a link explaining / showing how the Docker Desktop is doing it (we can study it)?

maiste commented 5 days ago

Few remarks regarding the distribution channels.

First, we currently have 3. and we can have 1. almost for free (by changing the target in our action build). The system to build and distribute 1. was not design to scale and should definitely be rewritten to be more efficient and sustainable (in terms of deployment and space). It was written so we can test and ship fast unstable features without breaking Dune stable (with the help of the feature flags). The infrastructure was not meant to stay forever. I completely agree, this is a good reason to take a step backward and design a better system, but for the infrastructure, not for the way it was design internally in Dune.

Indeed, as @leostera explained, we will keep distributing 2. in the same way as it was in the past: through opam releases. As we only expect for 1. to be exposed to beta testers and OCaml hardcore enthusiasts, I don't see why the config flags are an issue. Normal OCaml users won't see it. It won't change the installation with opam, nor it won't change the behaviour of the stable standalone version (as we would have removed the config flag guard for the feature considered stable). It took around 2 months to get the config flag feature available in Dune (through various open discussions, open meetings and PRs) with the consensus of most of the maintainers. I must insist that this is only available for a test purpose, to give the possibility to rollback features that are not good for the community (check through interviews, bug reports, etc) and to move forward fast without breaking stable Dune. It might be confusing here because the flags are protecting important features, but it's here for a reason: it is not stable. The best example is the dev-tool feature that we will discuss in the next Dune dev meeting (open to everyone).

About ocaml.org, the goal (mostly with a banner in https://github.com/ocaml/ocaml.org/pull/2706) is to advertise the developer preview, and we must be clear about the unstable / beta / test aspect of it. However, we should advertise it to have more beta testers so we can make these features as smooth as possible before merging them in stable dune (by removing the flag and removing the guard in the code). I agree it should be transparent to the community and described clearly somewhere (i.e on https://dune.build).

Note this is the same mechanism as cargo does to test unstable features (cf cargo contributor guide):

Most new features should go through the unstable process. This means that the feature will only be usable on the nightly channel, and requires a specific opt-in by the user. Small changes can skip this process, but please consult with the Cargo team first.

The only differences is that we make them opt-in at build time, and we activate all of them by default in preview mode (to make the experience as similar as it would be in a stable mode).

dra27 commented 5 days ago

The only differences is that we make them opt-in at build time, and we activate all of them by default in preview mode (to make the experience as similar as it would be in a stable mode).

Which I think is the nub of @avsm's original question of why we have 3 feature flags at all!

If the Developer Preview / nightly is intended for experienced OCaml users, then making that binary-first is an eyebrow-raising decision. There would seem to me to be quite a few experienced OCaml users who are both used to building from source by default and potentially suspicious as to how release binaries have been constructed.

I'm not familiar with the internals of nix, but I gave up following through the various files between tarides/dune-binary-distribution and ocaml/dune to work out what concrete commands the workflows were running to build the preview binaries, but it was pretty trivial to construct dra27/opam-repository#dune-dp (it's somewhat reminiscent of avsm/opam-sync-github-prs, used for the compiler for a while).

That gives a fairly simple source-based workflow for Dune nightly:

The first time:

$ opam switch create dune-dp --repos=dune-dp=git+https://github.com/dra27/opam-repository#dune-dp dune-dp
...
$ eval $(opam env)
$ dune --version
"Dune Developer Preview: build 2024-09-25, git revision
1a8b43f449cf00996281b5c73b2a0f3400817a00"
$ ls -l $(opam var bin)
total 24656
-rwxr-xr-x 1 dra dra 25244288 Sep 28 11:12 dune

That took 90 seconds on my laptop (including syncing the repo and building the compiler). Getting the latest nightly is just opam upgrade, which took 25 seconds.

Following @avsm's original example, we can more quickly get to actual feedback on Dune p.m. and issues, in particular the fact that moby/vpnkit's build calls opam and that the cache was slower than certainly I expected following a git clean (with dune cache enabled on stock Dune an opam switch, the project rebuilds in 7 seconds following a git clean, with Dune d.p. the project spent at spent roughly 60 seconds reconstituting the "switch").

dra27 commented 5 days ago

About your last question, my point was about distributing Dune Package Management through an opam package. If we want this dune version to be standalone, there would be a contradiction to use opam to install it.

I think you conflate demonstrating Dune package management, which certainly wants to be done without opam around, with actually using and testing it, where users are highly likely to have opam around for all their other OCaml-based work. If users of the nightly build are likely to have opam it seems somewhat daft to ignore it.

leostera commented 5 days ago

If the Developer Preview / nightly is intended for experienced OCaml users

It isn't. It is intended for users willing to test new features. In fact, most of the people testing it right now are entirely new to OCaml. And we want them to test it, since we're focusing on removing friction from onboarding onto OCaml.

rgrinberg commented 5 days ago

Lots of good comments in this issue, but I'm afraid I won't be able to answer everyone individually. I'll just summarize the discussion and add some motivation for why things are the way they are.

Some package management features are still rather experimental and unpolished. I would have preferred if the distribution of dune would not enable features for the common user until they're more stable.

The above goal is directly in conflict with gathering feedback from beta users however. Therefore, the binary distribution is a way for early adopters to experiment with dune with the expectation that features can be added/removed/modified without notice.

Eventually, the end goal for all of these experimental options is to go away. Nobody apart from dune's dev team should be messing around with these flags.

Regardless of the above, I think we should revisit some of the defaults for the flags that we have. Some of these experimental features have seen a lot of iteration and testing since their introduction. I don't see what's wrong with enabling them.