ocaml / dune

A composable build system for OCaml.
https://dune.build/
MIT License
1.64k stars 409 forks source link

Libraries vendoring other libraries are unusable #3335

Open gares opened 4 years ago

gares commented 4 years ago

Expected Behavior

Vendored libraries should be linked in, and probably name mangled

Actual Behavior

Vendored libraries are not installed nor linked, as a result one cannot build a binary using the library

Specifications

ghost commented 4 years ago

That's slippery slope. In theory, yeah it would be nice to allow that. In practice:

  1. it will be difficult to make this feature general. What do you do for C code for instance?
  2. if the same library end up linked up twice in the same executable, there is no guarantee things will work Ok. For instance the two instances might try to grab the same shared resource
  3. if we start vendoring libraries in other libraries, we are going to end up with many copies of the same library inside executables, i.e. OCaml executables are going to get bloated
gares commented 4 years ago

Then please error out when a library that was compiled using a vendored library gets installed.

Said that, to me the scenarios you depict are clearly a potential problem, but not necessarily a real one (how many libraries have C code or lock a file in their initialization code). As a last resort dune could even detect double linking (store somewhere, maybe in META, what is vendored).

But I'm not the one writing the code, so I'd be fine with a simple error message.

rgrinberg commented 4 years ago

Name mangling is something that many of our users do manually and it currently adds a ton of maintenance when accumulated. So I am in favor of making it easy for users to do it. I also agree that it does not make sense for dune to recommend it, so it would have to be an orthogonal featur to vendored_dirs.

ghost commented 4 years ago

That's true. However, doing it in dune will add a lot of complexity in dune itself. Which is not great either.

Before we change dune in this direction, I'd like to understand better:

Currently, I only have vague ideas about these three points.

gares commented 4 years ago

why do users want to vendor libraries in other libraries?

I want to have my project be compilable even if a dependency is not there. If the dependency is not present I untar a copy of the library before running dune, and vendoring would do the magic.

how do they manually do it now?

The file depending on X.foo actually uses My_X.foo, then I use select to pick a my_X.ml that either includes X or Vendored_X. I have a non-vendored copy of X renamed to Vendored_X that gets linked in if needed. Seems to work.

what are the alternatives?

I'm all ears ;-)

ghost commented 4 years ago

I want to have my project be compilable even if a dependency is not there. If the dependency is not present I untar a copy of the library before running dune, and vendoring would do the magic.

That seems like the intended use case. Why do you need to do anything special though? When resolving a library foo, Dune will first look it up in the current workspace and only look for a pre-installed library if it cannot find it locally. So merely untarring the dependency should be enough.

The file depending on X.foo actually uses My_X.foo, then I use select to pick a my_X.ml that either includes X or Vendored_X. I have a non-vendored copy of X renamed to Vendored_X that gets linked in if needed. Seems to work.

That doesn't seem great indeed.

gares commented 4 years ago

My project is a library, this is why just untarring does not work. Hence this issue.

ghost commented 4 years ago

I don't understand why it doesn't work. Could you give a bit more info about your project and the library you are vendoring?

ghost commented 4 years ago

Or (just trying to guess here), is the issue the following:

and you want to vendor a in b, and then vendor the whole thing in c. But then, if c already vendored a by some other vendoring chain, then you have a conflict. Is that correct?

Lupus commented 4 years ago

I can come up with some other use case for this. There is a fork of some library l' that has some fixes over upstream l and for some reason they don't get merged. We're building a library ourselves and want to depend on l', but it's not published on opam, and has the same name l in opam file. If library vendoring was possible, we would be able to just vendor l' into our library and deploy that to other projects via internal opam repository. What we have to do instead is maintain an internal fork of l' which changes the name in opam file to something different, publish l' under that name to our internal repo and depend on that in our library. This way we need to rebase l' each time it gets updated, which is not really convenient.

ghost commented 4 years ago

Why not replace l by l' in your internal repo?

Lupus commented 4 years ago

In our case, they are incompatible, and if other packages we might use depend on l, they won't build.

gares commented 4 years ago

I don't understand why it doesn't work. Could you give a bit more info about your project and the library you are vendoring?

I can compile my library, but then, once installed, nobody can link my library unless the vendored dependency is also installed. But the whole point of vendoring was to make my library usable in a context where the vendored library is not there.

My plan was to use vendoring to build the Coq-Elpi plugin on windows. There we don't use opam but we have a script compiling things by hand. To avoid compiling by hand the ppx stack, used by Elpi, I say: let's save somewhere the ppx-processed files (think deriving_inline), and vendor ppx_deriving.runtime. Things went fine, but once installed elpi (which is a library) could not be linked.

ghost commented 4 years ago

@Lupus why are they incompatible? Are you modifying the API of l'? I thought we were only talking about bug fixes.

@gares do you distribute your plugin as a cmxs or a normal OCaml library?

Lupus commented 4 years ago

@Lupus why are they incompatible? Are you modifying the API of l'? I thought we were only talking about bug fixes.

Well, l' has some features as well, and they require breaking changes to the interface. I won't say that I like this situation, but we have to work with what we have.

ghost commented 4 years ago

So, IIUC you want to maintain a fork of l. As usually done, you rename the forked project. However, it is difficult to rebase because of the opam files. Is that correct?

We have a plan to make it un-necessary to commit opam files in source code repositories. i.e. they will be generated on the fly. Would that make maintaining a fork of l easier?

Lupus commented 4 years ago

We want to publish a fork of l that is maintained by 3rd-party, but that does not change much, so yeah, we rename the forked project so that we can publish it to internal repository without clashes with upstream one.

Hm, what will be the source of information to generate opam files on the fly? AFAIK those are generated from similar fields in dune-project? If dune could generate valid opam file with all dependencies based on libraries used (i.e. without writing them explicitly) - we could just drop upstream files as we don't need fancy description etc. Objective is to make it installable with required dependencies.

gares commented 4 years ago

@gares do you distribute your plugin as a cmxs or a normal OCaml library?

as a normal library

ghost commented 4 years ago

Hm, what will be the source of information to generate opam files on the fly? AFAIK those are generated from similar fields in dune-project? If dune could generate valid opam file with all dependencies based on libraries used (i.e. without writing them explicitly) - we could just drop upstream files as we don't need fancy description etc. Objective is to make it installable with required dependencies.

It's the same as when you use (generate_opam_files true), except that you won't need to have them on disk. opam will call dune to generate them on the fly.

Lupus commented 4 years ago

That won't really help decrease fork maintenance cost, or at least I don't see how.

ghost commented 4 years ago

Alright, so I was trying to understand your use cases to see if they could be covered by other existing or planned Dune features. I still feel a a bit meh about library vendoring, in my mind vendoring is for executables and for libraries we should always strive to achieve maximum sharing.

However this subject keeps coming up, so maybe we should dig more into it at some point.

Now, this is touching a part of the code base that is way over the limit in terms of code complexity and maintainability: the part that deals with libraries. In particular the file src/dune/lib.ml. Every time I look at a PR touching this file, I'm having a difficult time following the change and I'm pretty sure I'm not the only one. So even if we decide to support library vendoring nothing will happen until this file has been properly refactored and simplified.

We will probably get to refactoring this part of the code base at some point, but right now this is not our priority. So such features will have to wait. But since Dune is open source, if you have spare cycles and would like help things move forward quicker, you are very welcome to and we are happy to help you do so.

Lupus commented 4 years ago

I still feel a a bit meh about library vendoring, in my mind vendoring is for executables and for libraries we should always strive to achieve maximum sharing

I fully agree with this being meh, but it's pretty convenient in some cases. Although not ideal approach, but we could avoid (or at least postpone) spinning of our internal repository, and have our platform library just vendor all of internal dependencies and services could just pin it at certain point and get away with it. Internal repo is way better approach, but it had some infrastructure costs to spin up - publishing packages from CI automatically, etc.

Probably we could come up with a tool that scans all dune files, and replaces libraries according to regexp (optionally public or private names, both in names and references), and add that as additional command to opam build instructions for a package. Not sure if such functionality needs to be supported by dune though.

ghost commented 4 years ago

That seems like a very good fit for dune-file :) I started this library precisely for this purpose: provide a simple way for users to do automatic transformations of dune files. It's not a super robust feature in the sense that the library doesn't know the grammar of dune files and only operates at the raw s-expression level, but it should be good enough for a lot of cases. In particular, this method is good enough to turn the internal jbuild files we use at Jane Street into the dune files that we have in the repositories of https://github.com/janestreet

The parsing side of the library is done, and I'm trying to find some time to add the "edition" part.

rgrinberg commented 4 years ago

Probably we could come up with a tool that scans all dune files, and replaces libraries according to regexp (optionally public or private names, both in names and references), and add that as additional command to opam build instructions for a package. Not sure if such functionality needs to be supported by dune though.

I think a much superior user experience can be provided if such a mechanism was provided by dune itself. Note that some references might occur in OCaml sources, so you will need to sed those as well. Much saner and more reliable would be to introduce automatic renaming in dune itself. I think that it could probably live externally if we had proper plugins, but for now, an experimental extension in dune is the best bet.

Lupus commented 4 years ago

Like "dune, please mangle my forked library so that I could use it, but no other package will see it"?

ghost commented 4 years ago

We just had a chat with Rudi. In fact, the code to support this in Dune might be that complicated. I'm still not too keen on the feature itself, but since we have a good mechanism for adding support for experimental features, we might as well just do that and see how it goes.