ceylon / ceylon-spec

allow module descriptor to override dependency versions #1134

Open gavinking opened 10 years ago

gavinking commented 10 years ago

We've talked about letting dependency version conflicts be resolved at the assembly level. But in fact people building reusable libs don't package their lib as an assembly so we need some way to resolve version conflicts at the module level in the module descriptor. What should this look like?

jvasileff commented 10 years ago

I think there is a lot to this, such as enabling optional dependencies of imported modules.

But for versions, I think the default should be 100% automatic. Gradle is constantly bumping versions for me, and it rarely if ever causes a problem. And, importantly, Gradle is responsible for generating a dependency report for me, not the other way around.

If I am forced to manually resolve conflicts, I'll just wind up copying portions of error messages back into the config file. Over time, the config file will just become bloated and inaccurate or over-defined.

In theory, I like the safety of being explicit. But, we'll always have to use the latest required versions anyway.

FroMage commented 10 years ago

The problem is that of composability: an assembly is an isolated island with a single override rule. How do you compose two modules with overrides?

gavinking commented 10 years ago

How do you compose two modules with overrides?

Well something is doing the composing, either an assembly, or another module. In that case, it's the responsibility of whatever composes them to resolve the conflict.

FroMage commented 10 years ago

So are the module overrides ignored when another module is orchestrating the composition? Or composed with higher precedence?

gavinking commented 10 years ago

If I am forced to manually resolve conflicts, I'll just wind up copying portions of error messages back into the config file. Over time, the config file will just become bloated and inaccurate or over-defined.

I'm not seeing this. It's not some external config file we're talking about here. If we have a module which explicitly depends on things which have a dependency conflict, then it must explicitly resolve them. Otherwise it doesn't need to do anything.

Remember: this problem only arises for exported (shared) dependencies. It simply doesn't arise for unexported dependencies like logging libs or whatever.
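
To spell that out (purely an illustration; the module names are invented), the difference in a Ceylon module descriptor is just the shared annotation on the import:

module com.example.mylib "1.0.0" {
    // exported: mylib's own API exposes types from this module,
    // so clients of mylib see it too, and conflicts must be resolved
    shared import fancy.collections "2.0";
    // unexported: purely an implementation detail, invisible to clients
    import some.logging.lib "1.1";
}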

gavinking commented 10 years ago

Or composed with higher precedence?

Composed with higher precedence. That's the natural thing.

FroMage commented 10 years ago

Composed with higher precedence. That's the natural thing.

I'm glad @alesj just volunteered to fix that problem. I suspect that if by composition you mean "completely override the others" rather than "try to satisfy both bounds", then it's not an NP-complete problem.

FroMage commented 10 years ago

Remember: this problem only arises for exported (shared) dependencies. It simply doesn't arise for unexported dependencies like logging libs or whatever.

This is only true in the context of jboss modules.

gavinking commented 10 years ago

It's not an optimization problem. Ignoring for a moment the issue of circular dependencies between modules (which I think is irrelevant in this context because they have to be compiled together anyway), the module dependency graph is a tree. Thus there is a clear ordering to dependency overrides.

gavinking commented 10 years ago

This is only true in the context of jboss modules.

What, maven doesn't have a notion of unexported dependencies?

FroMage commented 10 years ago

What, maven doesn't have a notion of unexported dependencies?

Not only does it not support that, but the flat classpath doesn't either.

gavinking commented 10 years ago

OK, fine, so perhaps it's not only for unexported dependencies. Look, what is the alternative? Just transparently pick the highest-numbered version? That just doesn't work, because we don't know the "order" from the version number string. Would you use lexicographic order? That would not work for some very common versioning schemes.
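
A quick sketch of the problem (just an illustration, nothing from the actual codebase):

shared void run() {
    // lexicographically "1.10.0" sorts before "1.9.0", even though it's the newer release
    print("1.10.0" < "1.9.0");      // true
    // and a pre-release sorts after the corresponding final release
    print("1.0.0" < "1.0.0-alpha"); // true
}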

FroMage commented 10 years ago

Well, indeed Maven solves this with version ordering (we already have a scheme implemented based on Debian version comparisons) and overrides, but how it deals with overrides is more complex because it works with version ranges.

gavinking commented 10 years ago

The thing is that we have no way to know that this is a maven-compatible version string. And if we just decide to support their algorithm, then that is equivalent to Ceylon adopting Maven versioning at least in a de facto sense. I doubt that this is what we want to do. Surely Jigsaw is going to have its own versioning scheme.

FroMage commented 10 years ago

Sure. The Debian one was just the only one documented properly and works across large ranges of libs from C all the way to Java.

luolong commented 10 years ago

OSGi has fairly well specced out version resolution rules. What's wrong with learning from them?

FroMage commented 10 years ago

What's wrong with OSGi version specs is that they mandate a format that libraries don't use. What we want is a comparator spec that works with every existing format.

Actually, perhaps it would make sense to make the comparator pluggable in assembly descriptors somehow.
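
Just to sketch the idea (completely hypothetical, none of this exists yet): the contract itself would be tiny, and an assembly descriptor could then name an implementation of it, say per repository.

shared interface VersionComparator {
    "Compare two version strings; return larger if version1 is the newer one."
    shared formal Comparison compare(String version1, String version2);
}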

gavinking commented 10 years ago

Actually I'm coming to the conclusion that we don't need to compare versions at all. Suppose we have:
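
(for the sake of illustration) two modules a and b that export conflicting versions of a logging module, and a module c that imports both:

module a "1.0" {
    shared import logging "1.0";
}

module b "1.0" {
    shared import logging "1.1";
}

module c "1.0" {
    import a "1.0";
    import b "1.0";  // conflict: which version of logging?
}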

Then, if module c simply says import logging "1.1", then it resolves that conflict.

In general, in the absence of circular dependencies (which I don't think are relevant here), the modules form a tree, and in a tree there is a partial order.

There's one wrinkle:

Ignoring this wrinkle, then what's nice about this is that from this arise some rules that we can enforce at compilation time and make sure that all conflicts get explicitly resolved.

jvasileff commented 10 years ago

I don't see how that will work. c may import logging "0.9" which is likely to break a and b.

I also think the wrinkle is a bigger issue. Especially when using existing (maven) modules, there could easily be a dozen or even dozens of version conflicts. We'd basically be explicitly importing all transitive dependencies in a lot of cases, and would need a third party tool to keep everything straight.

gavinking commented 10 years ago

c may import logging "0.9" which is likely to break a and b.

Well that's c's problem if c breaks itself, no?

gavinking commented 10 years ago

Especially when using existing (maven) modules, there could easily be a dozen or even dozens of version conflicts.

Well the only alternative solution is to just magically pick the "latest" version according to its version number. And if that "latest" version breaks something, you're SOL and have no recourse to fix it yourself.

jvasileff commented 10 years ago

Well that's c's problem if c breaks itself, no?

I don't think so. c may have imported 0.9 before adding a and b, so no warning would ever be produced. You would have to inspect the entire dependency tree of every newly imported module. Or, maybe 0.9 was the correct resolution, but it's not now after an update to a. Or maybe, just human error. Machines are better at executing algorithms than we are.

Well the only alternative solution is to just magically pick the "latest" version according to its version number. And if that "latest" version breaks something, you're SOL and have no recourse to fix it yourself.

There is no perfect solution, but in practice, this works pretty well. Most well-behaved modules maintain backwards compatibility, and if one doesn't, 1) it's probably full of bugs anyway, and 2) trying to (intentionally or not) force an older version onto some dependency is almost guaranteed to break things.

Regarding "no recourse", you should be able to run a dependency report, and also specify explicit overrides to automatic dependency resolution. But with gradle, I never do. (Yeah, overrides for exclusions and replacement modules, but not version downgrades.)

gavinking commented 10 years ago

I don't think so. c may have imported 0.9 before adding a and b, so no warning would ever be produced. You would have to inspect the entire dependency tree of every newly imported module. Or, maybe 0.9 was the correct resolution, but it's not now after an update to a. Or maybe, just human error. Machines are better at executing algorithms than we are.

It seems to me that those exact same problems affect the "choose the latest version" approach.

Regarding "no recourse", you should be able to run a dependency report, and also specify explicit overrides to automatic dependency resolution.

All that's well and good if you're using Maven. But in general you're not. We need something that works when there is no Maven involved. Especially for people like me who think Maven is a steaming pile and will do anything in order to avoid having to interact with it.

jvasileff commented 10 years ago

It seems to me that those exact same problems affect the "choose the latest version" approach.

If I specify 0.9, the choose the latest version approach gives me 1.1. Problem solved 99.9% of the time.

All that's well and good if you're using Maven. But in general you're not. We need something that works when there is no Maven involved. Especially for people like me who think Maven is a steaming pile and will do anything in order to avoid having to interact with it.

The fact that Maven is awful has nothing to do with this. There will always be reasons to choose alternate implementations of an API, force an upgrade (or downgrade?), enable an optional dependency, or whatever. If Ceylon is performing dependency resolution, then I think it should also be able to create dependency reports. This helps with reviewing versions, but is also essential for license reviews, etc.

gavinking commented 10 years ago

Problem solved 99.9% of the time.

Except not really, because once you start taking into account alpha, beta, cr1, snapshots, build number, timestamps, etc, etc, each module system has its own notion of "latest". We have at least

to account for here. Hell, JBoss even used to have its own versioning standard though I'm not certain if that is still in force. And we won't even always know the original source of a module, because people can import jars into Herd or into their own repo and we won't know the format of that version number.

Sure, resolving 1.1.0 vs 1.1.1 is pretty easy but that just doesn't account for everything we're going to run into out in the wild.

gavinking commented 10 years ago

There will always be reasons to choose alternate implementations of an API, force an upgrade (or downgrade?), enable an optional dependency, or whatever.

Right, which is exactly why there must be explicit control over this.

FroMage commented 10 years ago

Well, all that sounds like it'd really be nice to be able to define pluggable version comparators.

gavinking commented 10 years ago

Well, all that sounds like it'd really be nice to be able to define pluggable version comparators.

At what level of granularity?

Like I said, we don't, in general, know the source of a module or what its versioning scheme is, if any.

jvasileff commented 10 years ago

Sure, resolving 1.1.0 vs 1.1.1 is pretty easy but that just doesn't account for everything we're going to run into out in the wild.

Even if this is impossible to solve, a 95% solution that works 100% of the time for reasonably named modules is better than allowing 0.9 to be used without warning which will happen quite often and almost always cause problems at runtime.

Right, which is exactly why there must be explicit control over this.

Add "optional", and I agree! I think our difference of opinion on this is largely due to the expected number of overrides involved. If it's one or two, the arguments become less important. If it's 10, 20, or more, manually keeping this straight escalates from dangerous to impossible.

FroMage commented 10 years ago

Like I said, we don't, in general, know the source of a module or what its versioning scheme is, if any

Well, the module itself would know and could describe this. The assembly could too.

gavinking commented 10 years ago

Well, the module itself would know and could describe this.

How? How would you detect that a jar is from Maven and not from Jigsaw?

FroMage commented 10 years ago

Well, ok not for Maven and Jigsaw. But it'd work for OSGi and Ceylon modules. Gotta start somewhere. For Maven we have overrides already.

hwellmann commented 10 years ago

I think this issue is just a symptom caused by the current limitations of Ceylon's module system specification.

Use case: You have an application with a compile-time dependency on the JPA API, and you want to be able to switch between Hibernate and OpenJPA at run-time. There is no official API module javax.persistence, so you have to declare a compile-time dependency either on Hibernate's API or on OpenJPA's API, which are equivalent but conflicting.

Another use case: You have a web application module with a bunch of JSF facelets. Your form beans only depend on CDI and the Servlet API; you don't happen to use any javax.faces classes. So you don't have a compile-time dependency on a JSF implementation module, nor even on a JSF API module. Still, your web module should be able to specify a run-time requirement of a JSF implementation of version 2.1 or higher.

I don't think JBoss Modules satisfies these requirements (hard to tell, given the lack of documentation), and Ceylon's use of JBoss Modules is even more limiting by nailing dependencies to specific versions (in contrast to the main slot used in WildFly by default, somewhat equivalent to latest).

I would recommend having a look at the OSGi specifications, R5 or higher, in particular

All of this is very generic and addresses all the issues I mentioned above. OSGi has taken 10 years or more to reach this level of generality, starting from Require-Bundle and Import-Package, and I believe they did it for good reasons.

So I'd really like to see some reuse of these concepts in Ceylon. (Not necessarily reusing the implementation, but then again, why not, at least for a proof of concept.)

quintesse commented 10 years ago

I believe they did it for good reasons

I'm not always so sure about that :)

I've heard a lot of complaints from people who work(ed) directly on the OSGi code that it is way over-designed because they tried to solve every imaginable use-case, however theoretical.

We started the other way around by knowingly over-simplifying and (hoping that we can) add the necessary support for more complex situations in the future.

(This is not to say you are not right; giving us real-life problems that Ceylon currently can't handle is exactly what will push this change.)

-Tako

jvasileff commented 10 years ago

I've heard a lot of complaints from people who work(ed) directly on the OSGi code that it is way over-designed because they tried to solve every imaginable use-case, however theoretical.

We started the other way around by knowingly over-simplifying and (hoping that we can) add the necessary support for more complex situations in the future.

That may be true, but without first-class support for compiling, running, and deploying using third-party tools for dependency management and application servers with traditional classloader hierarchies, I think Ceylon will be held back while its module system matures.

jvasileff commented 10 years ago

A lot of this discussion has been about exported dependencies, but it would be nice to have the option of synchronizing versions for non-exported dependencies as well. This would help in a few ways:

  1. Avoid bloat. The need for a max-version constraint is the exception, so why unnecessarily increase an application’s memory footprint and startup time by including several versions of a common dependency?
  2. Avoid the use of old, buggy, inefficient, and vulnerable code. Why prefer older versions by default? That’s not to say we’d want to use the very latest version available, but at least standardize on the latest version used within our app, which presumably we trust to be a “good” version.
  3. Minimize the overall impact of exporting vs. not exporting dependencies. Ideally, there shouldn’t be a massive cascading effect if a module provider finds the need to export a previously non-exported dependency.
  4. Avoid apathy towards maintaining backwards compatibility. In the worst case scenario, the ecosystem would come to rely upon the ability to use incompatible old versions of non-exported dependencies, which would have a detrimental effect on interoperability between modules in general.
  5. Avoid duplicate “static” resources and singletons. One example where this matters is logging. If multiple versions of a logging library are loaded into separate classloaders, each must be configured independently and write to separate log files.

An additional note about 2 - a common practice for libraries today is to require the oldest reasonable version of each dependency, in order to preserve flexibility for users of the library. Application developers may not want to be forced into the latest version of a transitive dependency, in order to avoid regressions, perceived risk using an untested version, or to avoid a forced upgrade of some other major dependency, such as the JDK.

Application developers, on the other hand, will often wish to use recent versions of important dependencies. In the gradle world, this turns out to be pretty easy. A dependency can simply be added (once) with the desired minimum version, or, even if not specified, you can at least count on the system to automatically standardize on the latest version used anywhere within the application.

But, a system that caters to the exact version specification of non-exported transitive dependencies breaks this scheme, and instead results in a deployment that is as out-of-date as possible!

luolong commented 10 years ago

I've heard a lot of complaints from people who work(ed) directly on the OSGi code that it is way over-designed because they tried to solve every imaginable use-case, however theoretical.

I've heard a lot of complaints about Hibernate being problematic or Spring being too bloated or Java being too slow or garbage collection being inferior to reference counting. I do not know about reference counting, but the experience I've had with Hibernate and Spring is that when used properly, these are awesome tools that make the life of a developer much easier than any alternatives I've seen cooked up in the dark corners of some IT department basements...

Same goes for OSGi - if used properly, it is a great tool that has evolved over time to incorporate a ton of experience that any module system should at least consider when making their engineering decisions.

Now, I am half sold on the fixed versions of named module dependencies with the possibility of local overrides and assemblies, but this issue is an example of the kinds of problems a modular system needs to be solving one way or another.

I still think though that the way OSGi handles inter-module dependencies is far superior to any other modular dependency manager out there, but given the constraints of the current design, what I think we can do is to incorporate version override syntax that might look something like this:

module foo.bar "3.0 PR1" {
    import org.slf4j "1.7.7";
    // how do I declare that at runtime I need to have one of the SLF4J implementations
    // and its dependencies

    import bar.baz "2.8.3";  // depends on xoo.baz "1.7.3"
    import bar.xoo "7.3.2"; // depends on xoo.baz "2.0.1"

    override import xoo.baz "2.1"; // because this is the newest and I like it
}

The questions that writing this little snippet raised for me are the following:

  1. override import is not really a dependency. What if I change my direct dependencies such that none of them require xoo.baz, should the module loader still download it?
    (my guess is no, and it would be even better if this could be declared an error at compile time)
  2. How do I handle cases like the SLF4J dependency above?
    My code directly depends only on the SLF4J API, and it's needed for compiling the module. But this API needs some kind of implementation library at runtime. I do not really care at this point which one, though.

The second point also needs to be considered when developing Java EE applications, where you develop against javax.javaee-api "6.0", but deploy in JBoss, Glassfish or similar. I'm sure there are more cases like this one -- I develop against a contract, but run against an implementation.

akberc commented 10 years ago

Some very good points all around. In answer to the OP: NO, it should not; that should be left for assemblies. The reasoning is quite empirical, and the following is a rather verbose and boring perspective from a business web app developer:

First off, regarding Maven: It is frustrating at times and not flexible. However, it gave us remote repositories, GAV, dependency management, nested modules, POM, lifecycles and scopes -- many of which are now industry standards. That does not mean that I have not, in a fit of frustration, vowed not to use the beast in another project. I have, but I have returned to it for solid, stable, predictable builds that others will understand. A developer used to Maven takes 2 minutes to understand the build of an existing project she has been assigned: a project that has not been touched for 4 years.

Second: In the business world, it is all about maintenance - application servers are being upgraded, security standards are being brought in, directories and access management are changing - and the application has to be maintained to keep working in a continuously changing landscape. Rarely is a rewrite of a business application funded or warranted.

Thirdly: as a business application maintainer or application architect, I estimate and deliver on estimates and have to answer the question 'will it work with ....?'. I want to quickly know -

Based on these, what I foresee for Ceylon is:

Conclusion: it is only at this last stage of end-application assembly that business developers should feel the need to override individual branches of the dependency tree and/or annotate them as 'provided' so that the resulting deployable can run in a given container. In fact, this is exactly what I do with the Maven dependency tree in the Eclipse m2e plugin to see why something is being included and what I need to exclude or create a server-side custom library for (mark as 'provided').

This is the step which is still mostly manual and requires knowledge of the container, and this is where a future Ceylon runtime (standalone or embedded) can shine, and no J2EE container has flirted this closely with build-time (JBoss is going in the right direction with Modules, and the rise of node.js and Ruby (with Rake) is a testament): the ability to read the assembly of the deployable and have a consistent run-time classpath and linking strategy, even optimising the assembly further, substituting within the assembly's ranges with 'provided' libraries, reducing the memory footprint. This, of course, implies -

luolong commented 9 years ago

It seems that this discussion has moved to #1185

gavinking commented 9 years ago

override import is not really a dependency. What if I change my direct dependencies such that none of them require xoo.baz, should module loader still download it?

Does it actually hurt if it does?

My code directly depends only on the SLF4J API and it's needed for compiling the module. But, this API needs some kind of implementation library at runtime. I do not really care at this point which though.

Isn't this exactly the usecase which assemblies seek to address?

gavinking commented 9 years ago

@jvasileff scanning your post, I ask myself if it isn't the case that some of those problems would be solved if we had a way to just throw some modules on a single shared classloader that is accessible to all other modules. Sorta like an emulated "flat classpath".

jvasileff commented 9 years ago

@gavinking I assume you're referring to my Nov 18th post? I believe that post is full of opinions that would naturally lead more towards a one-version-per-module result and therefore a minimal number of classloaders, but it wasn't a call for a flat classpath per se.

I see the classloader hierarchy as a logical outcome of version resolution. If for some reason we wanted a flat classpath (which would be a separate topic), then I think we'd have to create versioning and import rules that would lead to that result.

If you need to load two classes with the same name, you need two classloaders. If you don't, you don't. (Sure, JBoss modules may throw in a few extra class loaders for isolation, but that's beside the point.)

Maybe you're thinking of a work-around? But so far, I've constrained my thoughts/comments to non-compromised scenarios. No module gets to break the rules and do things like "just throw in some extra classes". So even modules that consider themselves to be applications should be importable by other modules.

LasseBlaauwbroek commented 9 years ago

In my view, the only correct way to manage non-exported dependencies is to always load them with separate classloaders, even if there are multiple non-exported dependencies on the same module and version. This way, you make sure that modules cannot interfere with each other using variable toplevels, as @jvasileff mentions on the mailing list. If you use a logger in your module and want it to be configured by another module, you should make the logger shared.
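
As a sketch (module names invented), that just means exporting the logging import in the descriptor, so that the configuring module ends up talking to the very same classloader and the same toplevel state:

module com.example.service "1.0" {
    shared import some.logging.lib "1.1";
}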

The only downside to this policy I can see is that it can get very bloated, with a lot of copies of the same non-shared modules loaded. A solution to this could be that a module author can indicate that their module does not have any global state and can be loaded only once. I guess this cannot be checked mechanically :-(

sgalles commented 9 years ago

If for some reason we wanted a flat classpath (which would be a separate topic)

  • Yes @jvasileff indeed, a flat classpath feature is a separate topic, but I think that it would be immediately useful. At least the proposal of @gavinking is straightforward and would quickly allow a "flat classpath for dummies" that IMHO would make interop POCs much more pleasant. And actually I believe that even with an advanced classpath scheme, I would still use the "flat classpath" to perform basic interop tests (and then I'll try to switch to the advanced module scheme if necessary).

jvasileff commented 9 years ago

@sgalles My intention was to stress the point that the classloader arrangement is not an independent choice, but is rather dictated by the mix of modules that you need to load. So if for some reason (runtime environment, or whatever) you wanted to have only a single classloader, you would need stricter rules. That is, one version per module, throughout.

But I think I understand your point too - some way to formalize or automate the fat-jar approach? I think all of these issues can be very complicated and easily intermixed. Perhaps a separate issue or thread would make sense to discuss shorter term workarounds?

sgalles commented 9 years ago

But I think I understand your point too - some way to formalize or automate the fat-jar approach

Yes indeed @jvasileff. That's the "flat classpath" proposed by @gavinking. And you're perfectly right, that's probably another issue (I prefer not to create the issue though; @gavinking will see what he wants to do with all this)

luolong commented 9 years ago

override import is not really a dependency. What if I change my direct dependencies such that none of them require xoo.baz, should module loader still download it?

Does it actually hurt if it does?

Well, there are bandwidth and memory consumption concerns...