ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.6k stars 397 forks source link

[RFC] Package subsetting support for vendored code #7058

Open Leonidas-from-XIV opened 1 year ago

Leonidas-from-XIV commented 1 year ago

Desired Behavior

When other Dune projects are vendored, I would like to be able to define which packages are visible to Dune. This is currently supported using the -p flag to dune build when building/installing with OPAM: in that case Dune only builds (and OPAM installs) only the artifacts that are part of the packages specified via the -p flag.

This allows to use multiple third party libraries that contain the same package name (e.g. dune packages that dune vendors) or multiple versions of the same package (e.g. libfoo-1.0 and libbar-1.0.1 from the same dev-repo, sharing a dune-project but installed in different versions).

Unfortunately, the same is not possible when vendoring code, as in such case dune aborts because the same packages exist multiple times and it doesn't know which code to pick.

Example

Currently to designate a directory as vendored we can use

(vendored_dirs *)

In this RFC I would like to suggest an extension to the vendored_dirs stanza that would allow specifying additional metadata on the directories to be vendored:

(vendored_dirs 
  (foo (packages foo))
  (bar (packages bar))
  *)

Would in such case use the directory name foo and only build the packages that are foo, bar would do similarly, while the other directories would have all their packages evaluated.

Other solutions

I've attempted to write a tool to rename packages to random names (e.g. package bar in dir foo would become bar-ab43ec and foo in dir bar would become foo-e21fce, but unfortunately such a tool is not possible without running into a lot of problems:

While these issues are solvable with a bit of elbow-grease, the thing that broke the camels back was

As such, my suggestion is to implement this extension as part of Dune, which has all the information to subselect packages in a dune project already (with -p) so the extension is most likely rather limited in scope.

Leonidas-from-XIV commented 1 year ago

In a way this is similar to #3734 as it arises from the same issue, dealing with large duniverses, but this suggestion seems to me like a somewhat simpler solution as it also allows multiple versions of the same source tree, but at the moment my knowledge of dune scopes is too shallow to say for sure, hence I hope for suggestions and ideas on how to solve this problem.

rgrinberg commented 1 year ago

This proposal is a decent workaround, but I don't think I understand the semantics yet.

E.g. suppose that we have:

(library
 (name foo)
 (libraries private_baz)
 (public_name bar))
(library
 (name private_baz)
 (public_name baz))

And we try to remove the package baz. How will private_baz be resolved?

Leonidas-from-XIV commented 1 year ago

Isn't this the same problem as doing dune build -p bar[1] on this project, i.e. a misconfiguration/impossible request?

[1]:

File "dune", line 3, characters 12-23:
3 |  (libraries private_baz)
                ^^^^^^^^^^^
Error: Library "private_baz" not found.
-> required by library "bar" in _build/default
-> required by _build/default/META.bar
-> required by _build/install/default/lib/bar/META
-> required by _build/default/bar.install
-> required by alias install
rgrinberg commented 1 year ago

i.e. a misconfiguration/impossible request?

Nope, the above is correct. When baz is installed, private_baz will resolve correctly to it

Leonidas-from-XIV commented 1 year ago

Yes, but I don't see how this this would be a different case?

If we assume a hypothetical configuration like this:

(vendored_dirs 
  (example-bar (packages bar))
  (example-baz (packages baz)))

Then the private_baz dependency of the example-bar folder will resolve to the private_baz library from the example-baz folder where we enable the baz package.

If you don't specify it then it fails, but that's the same as if depending on private_baz without installing baz (through e.g. OPAM). The only difference is that dune can't assume it is installed and fail if not, it needs to try to find and build it in the vendored_dirs.

rgrinberg commented 1 year ago

That sounds fine. I'm just trying to understand the semantics. As far as I can see, this feature is basically like --only-packages but the fully qualified package path can be provided.

My original proposal can be used to fix other issues as well, so I'm not yet certain if this intermediate step is worth doing.

Leonidas-from-XIV commented 1 year ago

Yes indeed, this is more like --only-packages (more than it is -p since it is not about release mode at all). The reason I am suggesting this is that this allows us to vendor parts of source trees in a very similar way as OPAM allows installing parts of source trees.

It is for sure less comprehensive than #3734, but I fear just the solution proposed in #3734 is not enough to solve the problem we have in opam-monorepo, since dune needs to somehow know where to take a public library from. In the Menhir example there it is clear that fix should come from vendor/fix and not vendor/menhir/fix but it will still break down in this case:

vendor/
  foo-v1/
    foo.opam
    bar.opam
  foo-v2/
    foo.opam
    bar.opam

as dune won't know which vendored version of the same repo is supposed to provide the foo public library and which one is supposed to provide the bar public library.

Thus I don't think a fully inferred solution is possible. In the case of opam-monorepo we infer which package to use for which vendored source tree from the -p flag to dune build from the OPAM file that the solver has returned to us as a solution. Not necessarily elegant, but it works in practice because build fields in OPAM are of a pretty standard format.

rgrinberg commented 1 year ago

Yes, I agree that a fully inferred solution isn't possible. Anyway, this proposal sounds reasonable but we need a little more requisite work and a better idea on how it's going to work with other features that we're proposing before tackling this. I would suggest that we pause the discussion until we a prototype of the package management ready before we can be more confident about what to do about this.