eclipse-archived / ceylon

The Ceylon compiler, language module, and command line tools
http://ceylon-lang.org
Apache License 2.0

allow module descriptor to override dependency versions #4240

Closed: CeylonMigrationBot closed this issue 7 years ago

CeylonMigrationBot commented 9 years ago

[@gavinking] We've talked about letting dependency version conflicts be resolved at the assembly level. But in fact people building reusable libs don't package their lib as an assembly so we need some way to resolve version conflicts at the module level in the module descriptor. What should this look like?

[Migrated from ceylon/ceylon-spec#1134]

CeylonMigrationBot commented 9 years ago

[@jvasileff] I think there is a lot to this, such as enabling optional dependencies of imported modules.

But for versions, I think the default should be 100% automatic. Gradle is constantly bumping versions for me, and it rarely if ever causes a problem. And, importantly, Gradle is responsible for generating a dependency report for me, not the other way around.

If I am forced to manually resolve conflicts, I'll just wind up copying portions of error messages back into the config file. Over time, the config file will just become bloated and inaccurate or over-defined.

In theory, I like the safety of being explicit. But, we'll always have to use the latest required versions anyway.

CeylonMigrationBot commented 9 years ago

[@FroMage] The problem is one of composability: an assembly is an isolated island with a single override rule. How do you compose two modules with overrides?

CeylonMigrationBot commented 9 years ago

[@gavinking]

How do you compose two modules with overrides?

Well something is doing the composing, either an assembly, or another module. In that case, it's the responsibility of whatever composes them to resolve the conflict.

CeylonMigrationBot commented 9 years ago

[@FroMage] So are the module overrides ignored when another module is orchestrating the composition? Or composed with higher precedence?

CeylonMigrationBot commented 9 years ago

[@gavinking]

If I am forced to manually resolve conflicts, I'll just wind up copying portions of error messages back into the config file. Over time, the config file will just become bloated and inaccurate or over-defined.

I'm not seeing this. It's not some external config file we're talking about here. If we have a module which explicitly depends on things which have a dependency conflict, then it must explicitly resolve them. Otherwise it doesn't need to do anything.

Remember: this problem only arises for exported (shared) dependencies. It simply doesn't arise for unexported dependencies like logging libs or whatever.

CeylonMigrationBot commented 9 years ago

[@gavinking]

Or composed with higher precedence?

Composed with higher precedence. That's the natural thing.

CeylonMigrationBot commented 9 years ago

[@FroMage]

Composed with higher precedence. That's the natural thing.

I'm glad @alesj just volunteered to fix that problem. I suspect that if by composition you mean "completely override the others" rather than "try to satisfy both bounds" then it's not an NP-complete problem.

CeylonMigrationBot commented 9 years ago

[@FroMage]

Remember: this problem only arises for exported (shared) dependencies. It simply doesn't arise for unexported dependencies like logging libs or whatever.

This is only true in the context of jboss modules.

CeylonMigrationBot commented 9 years ago

[@gavinking] It's not an optimization problem. Ignoring for a moment the issue of circular dependencies between modules (which I think is irrelevant in this context because they have to be compiled together anyway), the module dependency graph is a tree. Thus there is a clear ordering to dependency overrides.

CeylonMigrationBot commented 9 years ago

[@gavinking]

This is only true in the context of jboss modules.

What, maven doesn't have a notion of unexported dependencies?

CeylonMigrationBot commented 9 years ago

[@FroMage]

What, maven doesn't have a notion of unexported dependencies

Not only does it not support that, but the flat classpath doesn't either.

CeylonMigrationBot commented 9 years ago

[@gavinking] OK, fine, so perhaps it's not only for unexported dependencies. Look, what is the alternative? Just transparently pick the highest-numbered version? That just doesn't work, because we don't know the "order" from the version number string. Would you use lexicographic order? That would not work for some very common versioning schemes.
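To make the lexicographic-order objection concrete, here is a minimal Python sketch. It is illustrative only: the version strings are made up, and the numeric comparator is an assumption for comparison, not something Ceylon is known to implement. Plain string comparison ranks "1.10" below "1.9", while comparing the dot-separated components as integers does not.

```python
def numeric_key(version):
    """Split a purely numeric dotted version into a tuple of integers."""
    return tuple(int(part) for part in version.split("."))

# Made-up version strings for illustration.
versions = ["1.9", "1.10", "1.2"]

# Plain string ordering compares character by character,
# so "1.10" sorts before both "1.2" and "1.9".
lexicographic_latest = max(versions)

# Numeric ordering compares 9 and 10 as integers.
numeric_latest = max(versions, key=numeric_key)
```

The numeric key only works for versions made of digits and dots; anything with a qualifier such as "beta" would need scheme-specific rules.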

CeylonMigrationBot commented 9 years ago

[@FroMage] Well, indeed Maven solves this with version ordering (we already have a scheme implemented based on Debian version comparisons) and overrides, but how it deals with overrides is more complex because it works with version ranges.

CeylonMigrationBot commented 9 years ago

[@gavinking] The thing is that we have no way to know that this is a maven-compatible version string. And if we just decide to support their algorithm, then that is equivalent to Ceylon adopting Maven versioning at least in a de facto sense. I doubt that this is what we want to do. Surely Jigsaw is going to have its own versioning scheme.

CeylonMigrationBot commented 9 years ago

[@FroMage] Sure. The Debian one was just the only one documented properly and works across large ranges of libs from C all the way to Java.

CeylonMigrationBot commented 9 years ago

[@luolong] OSGi has fairly well speced out version resolution rules. What's wrong with learning from them?

CeylonMigrationBot commented 9 years ago

[@FroMage] What's wrong with OSGi version specs is that they mandate a format that libraries don't use. What we want is a comparator spec that works with every existing format.

Actually, perhaps it would make sense to make the comparator pluggable in assembly descriptors somehow.

CeylonMigrationBot commented 9 years ago

[@gavinking] Actually I'm coming to the conclusion that we don't need to compare versions at all. Suppose we have:

Then, if module c simply says import logging "1.1", then it resolves that conflict.

In general, in the absence of circular dependencies (which I don't think are relevant here), the modules form a tree, and in a tree there is a partial order.

There's one wrinkle:

Ignoring this wrinkle, then what's nice about this is that from this arise some rules that we can enforce at compilation time and make sure that all conflicts get explicitly resolved.
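A rough Python sketch of the kind of rule described here, under stated assumptions: the module names a, b, c and logging follow the comment, but the graph shape, the versions imported by a and b, and the function name `resolve` are hypothetical. The sketch only honours overrides declared by the root module; it reports any conflict the root leaves unresolved instead of comparing version strings.

```python
# Hypothetical descriptors: module name -> {dependency name: version}.
# The versions imported by a and b are assumed for illustration.
descriptors = {
    "c": {"a": "1.0", "b": "1.0", "logging": "1.1"},
    "a": {"logging": "1.0"},
    "b": {"logging": "1.1"},
    "logging": {},
}

def resolve(root, descriptors):
    """Return {dependency: version}; fail if root leaves a conflict open."""
    requested = {}          # dependency -> set of versions requested below root
    stack, seen = [root], set()
    while stack:
        module = stack.pop()
        if module in seen:
            continue
        seen.add(module)
        for dep, version in descriptors.get(module, {}).items():
            requested.setdefault(dep, set()).add(version)
            stack.append(dep)

    overrides = descriptors[root]   # the root's own imports take precedence
    resolved = {}
    for dep, versions in requested.items():
        if dep in overrides:
            resolved[dep] = overrides[dep]
        elif len(versions) == 1:
            resolved[dep] = next(iter(versions))
        else:
            raise ValueError(
                f"unresolved version conflict for {dep}: {sorted(versions)}")
    return resolved
```

With the descriptors above, `resolve("c", descriptors)` selects logging "1.1" because c imports it directly. Removing that import from c makes the call raise, which corresponds to a conflict being flagged at compilation time.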

CeylonMigrationBot commented 9 years ago

[@jvasileff] I don't see how that will work. c may import logging "0.9" which is likely to break a and b.

I also think the wrinkle is a bigger issue. Especially when using existing (maven) modules, there could easily be a dozen or even dozens of version conflicts. We'd basically be explicitly importing all transitive dependencies in a lot of cases, and would need a third party tool to keep everything straight.

CeylonMigrationBot commented 9 years ago

[@gavinking]

c may import logging "0.9" which is likely to break a and b.

Well that's c's problem if c breaks itself, no?

CeylonMigrationBot commented 9 years ago

[@gavinking]

Especially when using existing (maven) modules, there could easily be a dozen or even dozens of version conflicts.

Well the only alternative solution is to just magically pick the "latest" version according to its version number. And if that "latest" version breaks something, you're SOL and have no recourse to fix it yourself.

CeylonMigrationBot commented 9 years ago

[@jvasileff]

Well that's c's problem if c breaks itself, no?

I don't think so. c may have imported 0.9 before adding a and b, so no warning would ever be produced. You would have to inspect the entire dependency tree of every newly imported module. Or, maybe 0.9 was the correct resolution, but it's not now after an update to a. Or maybe, just human error. Machines are better at executing algorithms than we are.

Well the only alternative solution is to just magically pick the "latest" version according to its version number. And if that "latest" version breaks something, you're SOL and have no recourse to fix it yourself.

There is no perfect solution, but in practice, this works pretty well. Most well behaved modules maintain backwards compatibility, and if one doesn't, 1) it's probably full of bugs anyway, and 2) trying to (intentionally or not) force an older version onto some dependency is almost guaranteed to break things.

Regarding "no recourse", you should be able to run a dependency report, and also specify explicit overrides to automatic dependency resolution. But with gradle, I never do. (Yeah, overrides for exclusions and replacement modules, but not version downgrades.)

CeylonMigrationBot commented 9 years ago

[@gavinking]

I don't think so. c may have imported 0.9 before adding a and b, so no warning would ever be produced. You would have to inspect the entire dependency tree of every newly imported module. Or, maybe 0.9 was the correct resolution, but it's not now after an update to a. Or maybe, just human error. Machines are better at executing algorithms than we are.

It seems to me that those exact same problems affect the "choose the latest version" approach.

Regarding "no recourse", you should be able to run a dependency report, and also specify explicit overrides to automatic dependency resolution.

All that's well and good if you're using Maven. But in general you're not. We need something that works when there is no Maven involved. Especially for people like me who think Maven is a steaming pile and will do anything in order to avoid having to interact with it.

CeylonMigrationBot commented 9 years ago

[@jvasileff]

It seems to me that those exact same problems affect the "choose the latest version" approach.

If I specify 0.9, the choose the latest version approach gives me 1.1. Problem solved 99.9% of the time.

All that's well and good if you're using Maven. But in general you're not. We need something that works when there is no Maven involved. Especially for people like me who think Maven is a steaming pile and will do anything in order to avoid having to interact with it.

The fact that Maven is awful has nothing to do with this. There will always be reasons to choose alternate implementations of an API, force an upgrade (or downgrade?), enable an optional dependency, or whatever. If Ceylon is performing dependency resolution, I think it should also be able to create dependency reports. This helps with reviewing versions, but is also essential for license reviews, etc.

CeylonMigrationBot commented 9 years ago

[@gavinking]

Problem solved 99.9% of the time.

Except not really, because once you start taking into account alpha, beta, cr1, snapshots, build number, timestamps, etc, etc, each module system has its own notion of "latest". We have at least

to account for here. Hell, JBoss even used to have its own versioning standard though I'm not certain if that is still in force. And we won't even always know the original source of a module, because people can import jars into Herd or into their own repo and we won't know the format of that version number.

Sure, resolving 1.1.0 vs 1.1.1 is pretty easy but that just doesn't account for everything we're going to run into out in the wild.

CeylonMigrationBot commented 9 years ago

[@gavinking]

There will always be reasons to choose alternate implementations of an API, force an upgrade (or downgrade?), enable an optional dependency, or whatever.

Right, which is exactly why there must be explicit control over this.

CeylonMigrationBot commented 9 years ago

[@FroMage] Well, all that sounds like it'd really be nice to be able to define pluggable version comparators.

CeylonMigrationBot commented 9 years ago

[@gavinking]

Well, all that sounds like it'd really be nice to be able to define pluggable version comparators.

At what level of granularity?

Like I said, we don't, in general, know the source of a module or what is its versioning scheme, if any.

CeylonMigrationBot commented 9 years ago

[@jvasileff]

Sure, resolving 1.1.0 vs 1.1.1 is pretty easy but that just doesn't account for everything we're going to run into out in the wild.

Even if this is impossible to solve, a 95% solution that works 100% of the time for reasonably named modules is better than allowing 0.9 to be used without warning which will happen quite often and almost always cause problems at runtime.

Right, which is exactly why there must be explicit control over this.

Add "optional", and I agree! I think our difference of opinion on this is largely due to the expected number of overrides involved. If it's one or two, the arguments become less important. If it's 10, 20, or more, manually keeping this straight escalates from dangerous to impossible.
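The position argued here, automatic selection by default with optional explicit control, might be sketched in Python as follows. The function name `pick_latest`, the `override` parameter and the restriction to purely numeric dotted versions are all assumptions for illustration; this is not a description of how Ceylon or Gradle resolve versions.

```python
import re

# Only versions made of digits and dots are treated as "reasonably named".
NUMERIC = re.compile(r"\d+(\.\d+)*")

def pick_latest(versions, override=None):
    """Choose a version for one module from all requested versions."""
    if override is not None:
        # An explicit override always wins over automatic selection.
        return override
    if not all(NUMERIC.fullmatch(v) for v in versions):
        # Refuse to guess when the versioning scheme is not understood.
        raise ValueError(f"cannot order {sorted(versions)}; override needed")
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))
```

Here `pick_latest(["0.9", "1.1"])` returns "1.1" with no manual step, while a set such as `["1.0-beta", "1.0"]` is rejected unless an override is supplied.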

CeylonMigrationBot commented 9 years ago

[@FroMage]

Like I said, we don't, in general, know the source of a module or what is its versioning scheme, if any

Well, the module itself would know and could describe this. The assembly could too.

CeylonMigrationBot commented 9 years ago

[@gavinking]

Well, the module itself would know and could describe this.

How? How would you detect that a jar is from Maven and not from Jigsaw?

CeylonMigrationBot commented 9 years ago

[@FroMage] Well, ok not for Maven and Jigsaw. But it'd work for OSGi and Ceylon modules. Gotta start somewhere. For Maven we have overrides already.

CeylonMigrationBot commented 9 years ago

[@hwellmann] I think this issue is just a symptom caused by the current limitations of Ceylon's module system specification.

Use case: You have an application with a compile-time dependency on the JPA API, and you want to be able to switch between Hibernate and OpenJPA at run-time. There is no official API module javax.persistence, so you have to declare a compile-time dependency either on Hibernate's API or on OpenJPA's API which are equivalent but conflicting.

Another use case: You have a web application module with a bunch of JSF facelets. Your form beans only depend on CDI and Servlet API, you don't happen to use any javax.faces classes. So you don't have a compile-time dependency on a JSF implementation module and not even on a JSF API module. Still, your web module should be able to specify a run-time requirement of a JSF implementation of version 2.1 or higher.

I don't think JBoss Modules satisfies these requirements (hard to tell, given the lack of documentation), and Ceylon's use of JBoss Modules is even more limiting by nailing dependencies to specific versions (in contrast to the main slot used in WildFly by default, somewhat equivalent to latest).

I would recommend to have a look at the OSGi specifications, R5 or higher, in particular

All of this is very generic and addresses all the issues I mentioned above. OSGi has taken 10 years or more to reach this level of generality from Require-Bundle and Import-Package and I believe they did it for good reasons.

So I'd really like to see some reuse of these concepts in Ceylon. (Not necessarily reusing the implementation, but then again, why not, at least for a proof of concept.)

CeylonMigrationBot commented 9 years ago

[@quintesse] On Mon, Nov 17, 2014 at 8:31 PM, Harald Wellmann notifications@github.com wrote:

I believe they did it for good reasons

I'm not always so sure about that :)

I've heard a lot of complaints from people who work(ed) directly on the OSGi code that it is way over-designed because they tried to solve every imaginable use-case, however theoretical.

We started the other way around by knowingly over-simplifying and (hoping that we can) add the necessary support for more complex situations in the future.

(This is not to say you are not right, giving us real-life problems that Ceylon currently can't handle is exactly what will push this change)

-Tako

CeylonMigrationBot commented 9 years ago

[@jvasileff]

I've heard a lot of complaints from people who work(ed) directly on the OSGi code that it is way over-designed because they tried to solve every imaginable use-case, however theoretical.

We started the other way around by knowingly over-simplifying and (hoping that we can) add the necessary support for more complex situations in the future.

That may be true, but without first class support for compiling, running, and deploying using third party tools for dependency management and application servers with traditional classloader hierarchies, I think Ceylon will be held back while its module system matures.

CeylonMigrationBot commented 9 years ago

[@jvasileff] A lot of this discussion has been about exported dependencies, but it would be nice to have the option of synchronizing versions for non-exported dependencies as well. This would help in a few ways:

  1. Avoid bloat. The need for a max-version constraint is the exception, so why unnecessarily increase an application’s memory footprint and startup time by including several versions of a common dependency?
  2. Avoid the use of old, buggy, inefficient, and vulnerable code. Why prefer older versions by default? That’s not to say we’d want to use the very latest version available, but at least standardize on the latest version used within our app, which presumably we trust to be a “good” version.
  3. Minimize the overall impact of exporting vs. not exporting dependencies. Ideally, there shouldn’t be a massive cascading effect if a module provider finds the need to export a previously non-exported dependency.
  4. Avoid apathy towards maintaining backwards compatibility. In the worst case scenario, the ecosystem would come to rely upon the ability to use incompatible old versions of non-exported dependencies, which would have a detrimental effect on interoperability between modules in general.
  5. Avoid duplicate “static” resources and singletons. One example where this matters is logging. If multiple versions of a logging library are loaded into separate classloaders, each must be configured independently and write to separate log files.

An additional note about 2 - a common practice for libraries today is to require the oldest reasonable version of each dependency, in order to preserve flexibility for users of the library. Application developers may not want to be forced into the latest version of a transitive dependency, in order to avoid regressions, perceived risk using an untested version, or to avoid a forced upgrade of some other major dependency, such as the JDK.

Application developers, on the other hand, will often wish to use recent versions of important dependencies. In the gradle world, this turns out to be pretty easy. A dependency can simply be added (once) with the desired minimum version, or, even if none is specified, one can at least count on the system to automatically standardize on the latest version used anywhere within the application.

But, a system that caters to the exact version specification of non-exported transitive dependencies breaks this scheme, and instead results in a deployment that is as out-of-date as possible!

CeylonMigrationBot commented 9 years ago

[@luolong]

I've heard a lot of complaints from people who work(ed) directly on the OSGi code that it is way over-designed because they tried to solve every imaginable use-case, however theoretical.

I've heard a lot of complaints about Hibernate being problematic or Spring being too bloated or Java being too slow or garbage collection being inferior to reference counting. I do not know about reference counting, but the experience I've had with Hibernate or Spring is that when used properly, these are awesome tools that make the life of a developer much easier than any alternatives I've seen cooked up in the dark corners of some IT department basements...

Same goes for OSGi - if used properly, it is a great tool that has evolved over time to incorporate a ton of experience that any module system should at least consider when making their engineering decisions.

Now, I am half sold on the fixed versions of named module dependencies with possibility of local overrides and assemblies, but this issue is an example of the kinds of problems a modular system needs to be solving one way or another.

I still think though that the way OSGi handles inter module dependencies is far superior to any other modular dependency manager out there, but given the constraints of current design, what I think we can do is to incorporate version override syntax that might look something like this:

module foo.bar "3.0 PR1" {
    import org.slf4j "1.7.7";
    // how do I declare that at runtime, I need to have one of the SLF4J implementations
    // and its dependencies?

    import bar.baz "2.8.3";  // depends on xoo.baz "1.7.3"
    import bar.xoo "7.3.2"; // depends on xoo.baz "2.0.1"

    override import xoo.baz "2.1"; // because this is the newest and I like it
}

The questions that writing this little snippet raised for me are the following:

  1. override import is not really a dependency. What if I change my direct dependencies such that none of them require xoo.baz, should module loader still download it?
    (my guess is no, and it would be even better if this could be declared an error at compile time)
  2. How do I handle cases like the SLF4J dependency above?
    My code directly depends only on the SLF4J API and it's needed for compiling the module. But, this API needs some kind of implementation library at runtime. I do not really care at this point which though.

The second point also needs to be considered when developing Java EE applications, where you develop against javax.javaee-api "6.0", but deploy in JBoss, Glassfish or similar. I'm sure there are more cases like this one -- I develop against a contract, but run against an implementation.

CeylonMigrationBot commented 9 years ago

[@akberc] Some very good points all around. In answer to the OP: NO, it should not; that should be left for assemblies. The reasoning is quite empirical, and the following is a rather verbose and boring perspective from a business web app developer:

First off, regarding Maven: It is frustrating at times and not flexible. However, it gave us remote repositories, GAV, dependency management, nested modules, POM, lifecycles and scopes -- many of which are now industry standards. That does not mean that I have not, in a fit of frustration, vowed not to use the beast in another project. I have, but I have returned to it for solid, stable, predictable builds that others will understand. A developer used to Maven takes 2 minutes to understand the build of an existing project she has been assigned: a project that has not been touched for 4 years.

Second: In the business world, it is all about maintenance - application servers are being upgraded, security standards are being brought in, directories and access management are changing - and the application has to be maintained to keep working in a continuously changing landscape. Rarely is a rewrite of a business application funded or warranted.

Thirdly: as a business application maintainer or application architect, I estimate and deliver on estimates and have to answer the question 'will it work with ....?' . I want to quickly know -

Based on these, what I foresee for Ceylon is:

Conclusion: it is only at this last stage of end-application assembly that a business developer should feel the need to override individual branches of the dependency tree and/or annotate them as 'provided' so that the resulting deployable can run in a given container. In fact, this is exactly what I do with the Maven dependency tree in the Eclipse m2e plugin to see why something is being included and what I need to exclude or create a server-side custom library for (mark as 'provided').

This is the step which is still mostly manual and requires knowledge of the container, and this is where a future Ceylon runtime (standalone or embedded) can shine, and no J2EE container has flirted this closely with build-time (JBoss is going in the right direction with Modules, and the rise of node.js and Ruby (with Rake) are a testament): the ability to read the assembly of the deployable and have a consistent run-time classpath and linking strategy, even optimising the assembly further, substituting within the assembly's ranges with 'provided' libraries, reducing the memory footprint. This, of course, implies -

CeylonMigrationBot commented 9 years ago

[@luolong] It seems that this discussion has moved to #4291

CeylonMigrationBot commented 9 years ago

[@gavinking]

override import is not really a dependency. What if I change my direct dependencies such that none of them require xoo.baz, should module loader still download it?

Does it actually hurt if it does?

My code directly depends only on the SLF4J API and it's needed for compiling the module. But, this API needs some kind of implementation library at runtime. I do not really care at this point which though.

Isn't this exactly the usecase which assemblies seek to address?

CeylonMigrationBot commented 9 years ago

[@gavinking] @jvasileff scanning your post, I ask myself if it isn't the case that some of those problems would be solved if we had a way to just throw some modules on a single shared classloader that is accessible to all other modules. Sorta like an emulated "flat classpath".

CeylonMigrationBot commented 9 years ago

[@jvasileff] @gavinking I assume you're referring to my Nov 18th post? I believe that post is full of opinions that would naturally lead more towards a one-version-per-module result and therefore a minimal number of classloaders, but it wasn't a call for a flat classpath per se.

I see the classloader hierarchy as a logical outcome of version resolution. If for some reason we wanted a flat classpath (which would be a separate topic), then I think we'd have to create versioning and import rules that would lead to that result.

If you need to load two classes with the same name, you need two classloaders. If you don't, you don't. (Sure, JBoss modules may throw in a few extra class loaders for isolation, but that's beside the point.)

Maybe you're thinking of a work-around? But so far, I've constrained my thoughts/comments to non-compromised scenarios. No module gets to break the rules and do things like "just throw in some extra classes". So even modules that consider themselves to be applications should be importable by other modules.

CeylonMigrationBot commented 9 years ago

[@LasseBlaauwbroek] In my view, the only correct way to manage non-exported dependencies is to always load them with separate classloaders. Even if there are multiple non-exported dependencies on the same module and version. This way, you make sure that modules cannot interfere with each other using variable toplevels as @jvasileff mentions on the mailing list. If you use a logger in your module and want it to be configured by another module, you should make the logger shared.

The only downside to this policy I can see is that it can get very bloated with a lot of the same non-shared modules loaded. A solution to this can be that a module author can indicate that its module does not have any global state and can be loaded only once. I guess this cannot be checked mechanically :-(

CeylonMigrationBot commented 9 years ago

[@sgalles]

If for some reason we wanted a flat classpath (which would be a separate topic)

Yes @jvasileff indeed, a flat classpath feature is a separate topic, but I think that it would be immediately useful. At least the proposal of @gavinking is straightforward and would quickly allow a "flat classpath for dummies" that IMHO would make interop POC much more pleasant. And actually I believe that even with an advanced classpath scheme, I would still use the "flat classpath" to perform basic interop tests (and then I'll try to switch to the advanced module scheme if necessary).

CeylonMigrationBot commented 9 years ago

[@jvasileff] @sgalles My intention was to stress the point that the classloader arrangement is not an independent choice, but is rather dictated by the mix of modules that you need to load. So if for some reason (runtime environment, or whatever) you wanted to have only a single classloader, you would need stricter rules. That is, one version per module, throughout.

But I think I understand your point too - some way to formalize or automate the fat-jar approach? I think all of these issues can be very complicated and easily intermixed. Perhaps a separate issue or thread would make sense to discuss shorter term workarounds?

CeylonMigrationBot commented 9 years ago

[@sgalles]

But I think I understand your point too - some way to formalize or automate the fat-jar approach

Yes indeed @jvasileff. That's the "flat classpath" proposed by @gavinking. And you're perfectly right, that's probably another issue (I prefer not to create the issue though; @gavinking will see what he wants to do with all this).

CeylonMigrationBot commented 9 years ago

[@luolong]

override import is not really a dependency. What if I change my direct dependencies such that none of them require xoo.baz, should module loader still download it?

Does it actually hurt if it does?

Well, there's bandwidth and memory consumption concerns...

gavinking commented 7 years ago

For some reason, the dupe issue #5955 was opened for this. Discussion is now there.

gavinking commented 7 years ago

Re-reading over this discussion a couple of years later, I want to note that there is one thing that has changed in the meantime that undermines some of what I argued at the time: we actually do now know the "source" of a module at least in the case of Maven or npm, and we could apply Maven's notion of "latest version" to resolve conflicts between Maven modules.

In fact, I believe that @FroMage already does something like this in certain cases (perhaps only for assemblies?) but perhaps we should do it more generally. I might open an issue, but it would be nice if Stef would clarify to what extent this feature already exists.

In greater generality, we could define a different default "conflict resolution" strategy for each namespace.
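A per-namespace default could be pictured roughly like this in Python. Everything here is a hypothetical sketch: the strategy functions, the `DEFAULT_STRATEGY` table and the namespace keys are assumptions, and the numeric comparator is a deliberate simplification of what Maven or npm actually specify.

```python
def latest_numeric(versions):
    """Pick the highest version, comparing dotted components as integers."""
    return max(versions, key=lambda v: tuple(int(p) for p in v.split(".")))

def require_explicit(versions):
    """Refuse to guess: the importing module must resolve the conflict."""
    raise ValueError(f"explicit override needed for {sorted(versions)}")

# Hypothetical mapping from namespace to default strategy. Real Maven and
# npm ordering rules are richer than latest_numeric, which only understands
# purely numeric dotted versions.
DEFAULT_STRATEGY = {
    "maven": latest_numeric,
    "npm": latest_numeric,
}

def resolve_conflict(namespace, versions):
    """Apply the namespace's default strategy, or demand an explicit choice."""
    strategy = DEFAULT_STRATEGY.get(namespace, require_explicit)
    return strategy(versions)
```

Namespaces without a known ordering fall back to requiring an explicit override, which keeps the original argument for explicit control in the cases where the version format is unknown.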

cc @jvasileff.

FroMage commented 7 years ago

We do resolve conflicts by selecting the latest version for many systems, but not all. I believe this is the case for --fully-export... at compile-time, and for ceylon classpath, ceylon jigsaw, ceylon run --flat-classpath and probably others.