gleam-lang / gleam

⭐️ A friendly language for building type-safe, scalable systems!
https://gleam.run
Apache License 2.0
17.26k stars 715 forks source link

Requirement overriding #2899

Open pendletong opened 4 months ago

pendletong commented 4 months ago

I think it would be a useful addition to the language to have a mechanism by which requirements which conflict can be ignored.

I'm not sure the best way to go about this. Node has the concept of overrides (https://docs.npmjs.com/cli/v8/configuring-npm/package-json#overrides) which strikes me as a possible solution.

I think in my case I just wanted the build to override a single include - gleescript required filepath >=0.1-<1.0 but filepath 1.0 was needed for simplifile >=1.6.

lpil commented 4 months ago

Good idea! We'll have to decide exactly what the design should be

darky commented 4 months ago

Similar problem https://github.com/MystPi/pprint/issues/4

darky commented 4 months ago

npm resolves this problem by installing multiple versions of same packages, which suitable for every parent package, if it can't be flattened by semver of course. Via that behaviour ☝️ many jokes, that node_modules weights more, than black hole, but it's just works!

darky commented 4 months ago

Using of overrides also is choice, but if something will broken, it's your responsibility :)

lpil commented 4 months ago

We will not be duplicating packages, it's an awful design in my opinion. It causes build sizes and compile times to balloon, creates a lot of extra work for the programmers verifying and updating their deps, it would increase the complexity of the compiler, and it removes this pressure to make folks update their deps and to keep their deps tree shallow.

Gleam is all about encouraging the programmer to fix things instead of continuing with some worse accidental state.

darky commented 4 months ago

Totally agree about duplicating packages, but situation, when programmer can't install some package, because some dependency semver conflict exists in tree, is not fine totally. Moreover, with growing ecosystem, this problem can occur very often.

Need to do something, overrides section, for example. But it may broke compilation, may not broke, depends of your lucky.

Also we can try to ask community not broke semver and backward compatibiltiy in their packages, but I think it's utopia. Nevertheless, Clojure has very stable ecosystem and packages backward compatibility almost never broken.

Another suggestions?..

lpil commented 4 months ago

It wasn't an issue for Elixir, Elm or PureScript, ecosystems I was familiar with during their early stages, I'm sure it won't be a problem for us.

PgBiel commented 4 months ago

A related thought (I apologize if this is off-topic): it'd be nice if we could override dependencies in arbitrary ways: not only pin a package to a different version (which overrides the version used for all dependencies), but also being able to pin a package to a local path, or to another Hex package, or (in the future) to an arbitrary Git repository/commit.

Actually, it seems that pinning a package to a local path across dependencies is already possible today, as long as the local path has a package with the same name - I haven't tested too much, but it appears you can just declare, on your top-level project, a dependency on a local path package with the same name, and dependencies which transitively depend on that package will use your local path instead. However, that doesn't work if the package has a different name (regardless of it being on Hex or local), as a compilation error then appears.

The use case I have in mind is to be able to provide "polyfills" for certain packages to work in other platforms or targets (or even add certain extra functionality), and have dependencies use it; going further, being able to publish them to Hex would be awesome as well (gleam add is awesome). This could also be useful if a package is abandoned and a successor appears, but not all dependents have been updated to use it (which is somewhat similar to the version problem in the issue).

As for the syntax, there is some prior art from Rust's [patch], as described here. Our friends at Zed have some sample usage.

However, ours could be significantly simpler, as we don't intend to duplicate packages. I think something like this could work:

[dependencies]
pkg1 = "~> 0.34"
pkg2 = "~> 0.34"
pkg3 = "~> 0.34"

[patch]
pkg1 = "~> 1.0.0"  # override version for this package and dependencies
pkg2 = { path = "../my_replacement" }  # override with some local package for this and deps
pkg3 = { hex = "pkg_pkg", version = "~> 1.0.0" }  # override with some Hex package for this and deps

(EDIT: Updated suggested syntax for version override in [patch])

pendletong commented 4 months ago

If gleam is not to duplicate packages then this simplifies my thought process.

I think the idea of a flat [patch] is compelling for me. If there is only ever one version of a particular package then all that is needed is to record a version that will override any specified/derived version.

From a beginner's point of view would it be possible to add a --override flag to the add command which would add the required 'patch' to allow the package to install (whether this up or downgrades the dependency's version). If this is not correct then they could re-add the package with the patched dependency (with --override again) to force the version to that one.

The other thing it would be useful to do when the --override is specified is to do compatibility checks to determine whether it would actually be a patch that broke the use of the api in the overridden package. (This might also be a useful pre-build phase for when a patch is manually introduced rather than outputting actual compile errors)

Anything more complicated than the --override method would have to be manually tweaked in the [patch] section.

As a side note to this, I only discovered this when I noticed that simplifile had updated to 1.7 so attempted a gleam update but nothing happened. I think it would potentially be useful to list what updates were available but not possible with the existing dependencies. I had to remove and re-add to figure out what was happening.

lpil commented 4 months ago

Let's have any additions to gleam add be a different issue and focus on the resolution mechanism and the gleam.toml API for here. Currently gleam add doesn't support anything but the most simple case.

PgBiel commented 4 months ago

Taking a look at some prior art (so we can better decide the semantics for Gleam):

  1. Rebar3 allows overriding dependencies' configurations in arbitrary ways, which should include overriding transitive dependencies as well: https://github.com/erlang/rebar3/blob/b28f3b55e046d9b4bdfc1aca34d7d6cc9f9ae502/rebar.config.sample#L192

    While that seems nice, IMO that'd be a bit too much for an MVP of this feature, as it's possible we could regret it later, so it might be better to experiment with simpler capabilities first.

    I can't find it on the official rebar3 docs, but some sources claim that even dependencies can use this override feature, but the topmost configuration wins. Not sure if that's true, but that'd be different from Rust's approach, which I think ignores dependencies' [patch] sections.

  2. Mix seems to allow adding an override: true flag to the top-level project's dependencies, so that they will always override transitive dependencies on the same packages: https://hexdocs.pm/mix/1.16.2/Mix.Tasks.Deps.html#module-dependency-definition-options

    That's definitely a simpler approach and could be considered, although that could lock us out of more advanced patch features in the future (or not, as we could in theory have both an override option for simpler needs and a [patch] section for more advanced needs, if necessary, but I guess that wouldn't be fully optimal).

  3. NPM (through package.json) seems to allow an overrides section which seems to have a lot of flexibility. In particular, it allows pinning a package to a version regardless of transitive dependencies, but also allows changing that version override only for a specific dependency (i.e. you can pin "foo" to version 1.0.0 only as a dependency of "bar", but not as a dependency of "baz", for which it is not overridden). However, the latter isn't very useful for us, as we plan to not allow duplicates of the same package. (Reference: https://docs.npmjs.com/cli/v8/configuring-npm/package-json/#overrides)

As I see it, for now I think a [patch] or (perhaps more user-friendly name?) [overrides] section in the root project's gleam.toml might be the way to go, with the following semantics:

  1. Specifying a package version through the root project's [patch] silences any "incompatible requirement" (i.e. transitive dependencies' versions aren't matching) or "package name doesn't match" (i.e. depending on a local package or Hex package with a different name in its gleam.toml) errors - the package's source is simply fully overridden.
  2. The manifest.toml is updated to reflect the patch (packages' versions and sources are changed to match the patch). I'm not sure if it is necessary to change the manifest.toml's format to include a patched: true field or similar - I think the compiler should be smart enough to figure that out, but that could be done too.
  3. Regarding dependencies' [patch]: In principle, having only the root-level [patch] work makes sense, but I can see how it could be useful to allow dependencies to [patch] as well (unless a package at a higher level in the dependency tree overrides that). It's not immediately necessary, however.
PgBiel commented 4 months ago

I've made a proof of concept of the [patch] design at PR #2981 . Note that said PR is by no means a final implementation or design, and I am willing to adapt or otherwise close the PR as discussion evolves. The point of the PR is mostly to allow us to experiment with the design in real world settings, which I believe should aid in the discussion. Even if the envisioned design changes significantly, the PR should provide a good foundation for follow-up PRs.

So, with that said, for those interested, feel free to compile my branch from source and give it a quick test in your projects, according to the example in the PR description. I think this should provide good insights regarding what doesn't work and should, or what shouldn't work but does, or what should be changed in general!

lpil commented 4 months ago

Before we look at other solutions we need to look at the problems. What are all the issues we looking to solve here? Is it just ignoring a constraint specified by some dependencies? It sounds like the "patch" system could do more things, but I don't know what they are.

PgBiel commented 4 months ago

That's true! Besides the problem of bypassing constraints, I also have in mind the idea of being able to adapt dependencies to your environment or target, which is something I need, and requires an overriding system in place. Complementary to that, I also thought of the idea of being able to replace abandoned dependencies by a new and maintained dependency (maybe your own locally maintained version).

PgBiel commented 4 months ago

With that said, some of this functionality is already possible today - top-level dependencies appear to have priority over transitive dependencies (at least when versions are compatible), so you can, for example, point gleam_stdlib to a local path and dependencies will use it.

The Core Problems

So I thought a bit further and we have basically two problems to solve here (let me know if I missed something):

  1. You should be able to override transitive dependencies' versions when they aren't compatible. That is, have Gleam not complain when there is an incompatibility, and just use the top-level dependency or patch version.

  2. You should be able to replace a package with another. Currently, Gleam appears to also check, for example, if you write gleam_stdlib = { path = "../stdlib" }, if the package name at ../stdlib/gleam.toml is also gleam_stdlib. In this regard, we should be able to silence that error as well and just take the source at the given location if it is a valid Gleam package. (This would allow pushing overrides to Hex (given Hex can't have two packages with the same name, in principle), which would allow us to eventually add some config syntax to patch a dependency with another Hex package.)

Possible Solutions

Therefore, if we only aim to solve the two problems above, a simpler implementation could become possible (my POC is a bit more general than this idea, so it could be trimmed down to solve these problems in particular).

Solving Problem 1

  1. We consider patches to local/Git packages as dependencies (at least in principle) and provide them early (just like we provide top-level dependencies nowadays).

  2. When there are patches for package Y, disable the explicit version compatibility check for Y before pubgrub resolving.

  3. While pubgrub resolving, any packages depending on Y will accept any versions of Y if it is a local/Git package (we replace their requirements with "> 0.0.0", thus allowing any versions), or with the version required by the patch (if patched with a Hex package). This would allow overriding versions while also ensuring the patch specifies a valid version from Hex.

  4. While providing local (and Git, in the future) packages, if they are patched...

    1. ...with a Hex package, we don't provide the package and instead wait for it to be downloaded from Hex later (as is what happens with any packages that weren't normally provided).
    2. ...with another local or Git package, we just provide that instead (if they weren't provided earlier, because it's e.g. a transitive dependency of a patched dependency we didn't analyze yet).
  5. If any Hex packages were patched with local (/Git) packages, today's behavior remains as they were already provided earlier, so they aren't downloaded at all.

  6. Finally, when fetching packages from Hex, we will be using the patched version, as we will have informed pubgrub that all packages depending on Y will depend on the patched version of Y instead, so it will naturally use that version, if possible.

Solving Problem 2

This needs a bit more R&D from my side to confirm, but some initial observations while developing my POC were:

  1. Assuming the problem 1 solution is applied first, we can begin by disabling the package name check while providing patched local/Git packages. Everything else remains the same, e.g. build/dev/<target>/<package name before patch> will still exist but use the source of the post-patch package.

  2. If a package was patched with a Hex package with a different name, we should be able to simply change the name and version of the package we download to the patched one. As far as pubgrub is concerned, the package's available versions will match those of the post-patch one instead, so version resolution should follow normally.

    • Note that solving problem 2 would allow using multiple versions of the same package within a project as a side effect, by binding them to different names. Personally, I don't see this as a problem, as it can be necessary (e.g. you want to test a new version of a package but only on specific modules). This sounds cool actually.

Remaining Questions

  1. Are there more problems to solve? E.g. do we want to be able to arbitrarily patch the configuration of dependencies, as rebar3 allows? (I'd like to hear possible use cases for this, as I have none.)

  2. How should we be able to specify the patches? Is an override: true flag on top-level dependencies enough?

  3. Should we ignore patches / overrides from dependencies and only consider overrides from top-level packages?

    • If not, a more complex implementation would be possibly needed. We could just skip this for now with a "yes".