vavr-io / vavr

vʌvr (formerly called Javaslang) is a non-commercial, non-profit object-functional library that runs with Java 8+. It aims to reduce the lines of code and increase code quality.
https://vavr.io

Guarantee eternal backwards compatibility #1914

Closed skestle closed 6 years ago

skestle commented 7 years ago

I'm looking at using javaslang in a company that has strict API compatibility guidelines that extend to 3rd-party libraries.

Since javaslang adopted Semantic Versioning at v2 and has not yet released v3, there is an opportunity to set a guarantee that javaslang will not break current software.

This can be done by suffixing the javaslang package starting from version 3 (javaslang3, etc.). Apart from ensuring compatibility, it allows a more gradual transition from old to new APIs rather than a hard rewrite of the whole code base at once. For example, a project can start using v3 APIs without modifying code that has already been verified.

Stating this explicitly will allow me to use it without in-company repackaging.

For prior art, see Apache's commons.lang3. Google's Guava is an example of a library that we have to repackage to avoid conflicts.
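The commons.lang3-style scheme can be sketched in miniature. Below, nested classes stand in (purely hypothetically) for versioned packages such as javaslang.* and javaslang3.*; the point is that already-verified code keeps compiling against the old namespace while new code adopts the new one:

```java
// Sketch only: nested classes mimic side-by-side versioned packages.
// In reality these would be top-level packages (javaslang.* / javaslang3.*).
public class Main {
    static class v2 {
        static String api() { return "stable v2 behaviour"; }
    }
    static class v3 {
        static String api() { return "new v3 behaviour"; }
    }

    public static void main(String[] args) {
        // Already-verified code keeps calling v2, while new code adopts v3.
        // Neither namespace can break the other.
        System.out.println(v2.api());
        System.out.println(v3.api());
    }
}
```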

This is related to #1912 #1913

danieldietrich commented 7 years ago

Moving the version number to the name is like suffixing a C++ variable name with its type.

I think there is no need to do that in order to guarantee compatibility. E.g. in Maven or Gradle the version is already part of the coordinates of an artifact.

Having the version as part of the name is uncommon. The repackaging process is specific to your company. We provide a way that is simple and works for everyone.

Anyway - it is just a name. It is not a formal proof that the library is really backward compatible.

l0rinc commented 7 years ago

Package name changes are meant to ensure that multiple versions can coexist on the same classpath. It's a workaround for the lack of proper Java modularization. Java 9 is meant to mitigate that somehow. I also hope there's no need to rename packages, but a backwards compatibility build plugin would be a good idea (I don't know of any such tool for Java, though).
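A backwards-compatibility check in a Maven build might look roughly like the following sketch. japicmp is one tool in this space that compares two jar versions for binary compatibility; the coordinates and configuration here are from memory and should be verified against the plugin's documentation:

```xml
<!-- Sketch only: fail the build on binary-incompatible API changes. -->
<plugin>
  <groupId>com.github.siom79.japicmp</groupId>
  <artifactId>japicmp-maven-plugin</artifactId>
  <executions>
    <execution>
      <phase>verify</phase>
      <goals>
        <goal>cmp</goal>
      </goals>
    </execution>
  </executions>
</plugin>
```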

skestle commented 7 years ago

Yes. If my application uses javaslang v2 and one of my dependencies (or co-deployed applications) upgrades to javaslang v3, then my whole app breaks. There is nothing I can do about it, because (presumably) some methods and classes no longer exist.
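The failure mode described above can be made concrete. The sketch below uses a reflective lookup as a stand-in for the runtime linkage failure (in a real mixed-version classpath this surfaces as a NoSuchMethodError thrown by the JVM); the method name is invented:

```java
// Sketch: code compiled against one major version expects a member that
// no longer exists in the version actually on the classpath.
public class Main {
    public static void main(String[] args) {
        try {
            // Stand-in for a dependency compiled against v2 invoking a
            // method that (hypothetically) was removed in v3:
            String.class.getMethod("hypotheticalRemovedMethod");
            System.out.println("linked fine");
        } catch (NoSuchMethodException e) {
            // With real bytecode linkage this would be a NoSuchMethodError.
            System.out.println("missing member detected");
        }
    }
}
```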

Apache's commons-lang shows that it's not specific to our company - it's just rare to find a library that really cares about compatibility (in a world presumed to have moved entirely to continuous deployment to production).

Besides, even in "modern" development: as soon as I want to upgrade a dependency that relies on javaslang, I have no choice but to invest immediate time refactoring my own usage of javaslang.

You might as well move this source repo to Subversion (by analogy). Or never use git fetch - only ever git pull.

skestle commented 7 years ago

... and I don't mind about a formal proof of compatibility. The Semantic Versioning statement does give us assurance that breaking compatibility is a bug, and we can expect to be able to fix it and contribute back.

nfekete commented 7 years ago

@skestle Let's suppose for a moment that javaslang used a new package name for its upcoming major release. Would it solve your problem? Is javaslang the only library you're using that is going through an evolution that could hurt backward compatibility? That would mean it's probably the only dependency you're using, as most libraries out there do in fact go through non-backward-compatible evolution. If you want to solve this, you could try classloader isolation (OSGi being probably the most mature solution), though doing so carries some up-front cost. Of course, classes loaded by different classloaders will not interoperate - but neither would they after a package rename.
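For illustration, OSGi expresses this kind of isolation declaratively in the bundle manifest. A sketch of an Import-Package header with a semantic version range (the package name is illustrative; exact header syntax per the OSGi specification):

```
Import-Package: javaslang.control;version="[2.0.0,3.0.0)"
```

Each bundle then resolves against a wiring that satisfies its own range, which is what allows two majors to coexist in one VM at the price of non-interoperating class spaces.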

danieldietrich commented 7 years ago

I understand. Hard decision.

To me, evolving the library is as important as keeping it stable. I think eternal backward compatibility does not exist. Even Java is not fully backward compatible. For example, semantics change: some libs falsely relied on HashMap element order, and the algorithm changed in Java 7. It does not matter that this is illegal usage of an API - it runs in production systems and it works, until something changes. Another example: Java 9 will abandon internal APIs that were public. Java 9 will also abandon lambda parameter name reflection, which was non-internal, public API - not intended to be used, but not explicitly hidden. No one can guarantee 100% backward compatibility.
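The HashMap example above in code: iteration order of HashMap is unspecified (and has changed across JDK releases), so code that needs a stable order should say so explicitly, e.g. with LinkedHashMap:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class Main {
    public static void main(String[] args) {
        // LinkedHashMap makes insertion order part of the contract;
        // relying on HashMap's order is exactly the kind of "illegal but
        // working" usage that a JDK upgrade can silently break.
        Map<String, Integer> ordered = new LinkedHashMap<>();
        ordered.put("b", 1);
        ordered.put("a", 2);
        ordered.put("c", 3);
        System.out.println(String.join(",", ordered.keySet()));
    }
}
```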

Co-existing versions of the same transitive dependency are a big problem that every non-trivial Java application faces today. The Java Platform Module System (JPMS) will not solve it.

The key to a stable application is tests - the more, the better.

skestle commented 7 years ago

I completely understand that - I used to feel this sort of tension. But I'm now very much of the position that granting version flexibility to API users is of far more value than keeping a "tidy" package structure.

Regardless of the outcome of this ticket, I'll achieve the outcomes I want (we've made hibernate3 retain full backwards compatibility!); but it's definitely worth my effort to avoid an extra process that somebody could stuff up in production.

Perhaps this scenario would help a decision:

I want to use Try in our company's core services, but we have Transactional annotations on our (clustered) services. These will need to be updated to handle failed Try instances that cross a transaction boundary (rather more complicated than raw Exception handling).
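To make the Try-across-a-transaction-boundary concern concrete: with a Try-style type, failure travels as an ordinary return value instead of unwinding the stack, so an interceptor that watches only for thrown exceptions never notices it. The minimal Try below is a hand-rolled, hypothetical stand-in - not javaslang's actual API:

```java
import java.util.function.Supplier;

public class Main {
    // Hypothetical stand-in for javaslang's Try; the real API differs.
    static final class Try<T> {
        final T value;
        final RuntimeException failure;
        Try(T value, RuntimeException failure) {
            this.value = value;
            this.failure = failure;
        }
        static <T> Try<T> of(Supplier<T> s) {
            try { return new Try<>(s.get(), null); }
            catch (RuntimeException e) { return new Try<>(null, e); }
        }
        boolean isSuccess() { return failure == null; }
    }

    // A "service method": a failure here crosses the (hypothetical)
    // transaction boundary as a normal return value, invisible to an
    // exception-watching @Transactional proxy.
    static Try<Integer> parsePort(String s) {
        return Try.of(() -> Integer.parseInt(s));
    }

    public static void main(String[] args) {
        System.out.println(parsePort("8080").isSuccess());
        System.out.println(parsePort("not-a-port").isSuccess());
    }
}
```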

How comfortable would you be in promoting javaslang to me without major version repackaging?

NB: The answer cannot be "absolutely" in any situation, since bugs will be introduced and our verification is mandatory (you can never guarantee compatibility as you say). But given the chance that somebody could just drop in an updated jar... (I'm thinking in a single server installation, but even in clusters incredible things happen)

danieldietrich commented 7 years ago

Maybe we have a win-win situation here. I've taken a look at RxJava and how they dealt with the backward compatibility in their major version jump:

RxJava 1:
  group-id:    io.reactivex
  artifact-id: rxjava
  version:     1.x.y
  package:     rx.*

RxJava 2:
  group-id:    io.reactivex.rxjava2
  artifact-id: rxjava
  version:     2.x.y
  package:     io.reactivex.*

Javaslang 1 and 2 have a different group-id because we changed our domain name:

Javaslang 1
  group-id:    com.javaslang
  artifact-id: javaslang
  version:     1.x.y
  package:     javaslang.*

Javaslang 2
  group-id:    io.javaslang
  artifact-id: javaslang
  version:     2.x.y
  package:     javaslang.*

In Javaslang 3 we could change the package name to io.javaslang:

Javaslang 3
  group-id:    io.javaslang
  artifact-id: javaslang
  package:     io.javaslang.*

That would solve the problem (for now). I do not expect a major version jump to 4 the next years.

danieldietrich commented 7 years ago

Ok, then let's do it!

skestle commented 7 years ago

Great!

Off Topic (about OSGi and classloading)?: @nfekete - our applications are hosted in an OSGi environment, and it is not the panacea for binary compatibility that it might naively be taken for.

We've disabled version ranges and have gone with simple version minimums - it's hard enough for the average dev to understand how to use and diagnose bundle classloader issues as it is, without introducing variant classes that accidentally leak into service APIs / serialisation etc.

I want something as declarative as Try to be usable all through the APIs - a downstream bundle might need to consume two APIs that use different versions of javaslang.

Also, I'd say the most advanced I've seen is LightBend's Reactive Platform with ConductR - I was interested when a webcast mentioned OSGi https://www.youtube.com/watch?v=JEPHSdHHWnE&feature=youtu.be&t=2776

skestle commented 7 years ago

That would solve the problem (for now)

I've since realised that for me, the solution only entirely solves the problem for version 2. I can't recommend adoption of a stable 3 release unless there's a commitment to repackaging non-binary-compatible changes.

4 years is a short time in enterprise. I'm in the process of upgrading one application's libraries from ~2007 - if it had been in a shared container...

Anyhow, there's time for that discussion quite a bit later. Food for thought; I'll check up when v3 is released ;).

danieldietrich commented 7 years ago

@skestle

Food for thought; I'll check up when v3 is released ;).

Thank you for the inspiration! Looking forward to further discussion.

skestle commented 7 years ago

Well, you didn't have to look forward long 😛 .

Since we're now at 0.9, can I propose io.vavr.v1 etc?

At v2 you could move the entire package, or move only compatibility-broken classes as they change.

Otherwise I think I need to pin to javaslang 2.0 (presuming that bugfixes will stay in the same package).

I'm getting more confused about why this hasn't been implemented. vavr is a "core" library (much like the Apache libs that never change - see org.apache.commons.lang3 as an example).

Consider my application suddenly having a runtime failure because an uncommonly used part of it relies on an API that has changed, right after I'd upgraded. "Forcing" a package change is far preferable to finding that later in dev or testing, as it gives transparency of options (and it isn't really a force - each change can be evaluated separately, since v1 and v2 could live together indefinitely).

@nfekete - as an indirect answer to your basic "how many libraries do you depend on..." question: have you ever developed with scala (which this library is based on)? Particularly around the 0.9 and 0.10 releases (from memory)? There was an inability to consistently use basic libraries until all of them had upgraded to the version you wanted to move to (if they ever did). Perhaps the existence of vavr is testament to the fact that larger organisations can't easily embrace [scala] technology if it's not backwards compatible.

Final thought: vavr is (presumably) intended to be an API that helps developers' lives in the way they expect in Java - that is, the API is stable. How can a functional library publish an impure API? (where the io.vavr package is an argument to my application function[ality])

nfekete commented 7 years ago

I'm trying to understand the intended use pattern here. Let's suppose you have the module structure you mentioned earlier in your post:

  • A depends on vavr 1
  • B depends on vavr 2
  • you use latest A and B in your module C

Here are the hypothetical usage patterns I can think of:

  1. You're using vavr in your A and B modules only internally (vavr not being used in the public API); in that case you should be able to use them without problems via classloader isolation.
  2. You're exposing an API which uses vavr types. Let's say, you're using io.vavr.control.Option. In this case, should the Option class evolve in a binary incompatible way, you will have a version conflict. This could be solved by:
    1. vavr repackaging everything in its major revisions to a new package structure, even if Option itself didn't change in an incompatible way: io.vavr1.control.Option and io.vavr2.control.Option. You still end up with two different classes with the same simple name, and they are not compatible with each other. You'll need to write some kind of translation layer between the two in C, which, at best, will be very cumbersome.
    2. vavr repackaging only those classes that changed in an incompatible way. Let's say Option itself didn't change incompatibly, so it can stay in its base package as io.vavr.control.Option, and you can use it without any further work in your module C. Other vavr classes that evolved incompatibly would need to move to a different package, though, and we would end up with per-class naming (either through package-name or class-name changes) for every major revision of every class. It's easy to see that this leads to an explosion of FQCNs, which would be very confusing for users of the library.

To me, the main conclusion from these points is that you cannot really have a public API in your app based on mixed incompatible versions of a library (this one or any other) without significant plumbing work - a translation layer between the versions (points 1 and 2.i) - or without employing a very confusing versioning scheme on vavr's side (point 2.ii), which wouldn't save you from the plumbing anyway, should the types used in your public API evolve incompatibly.

If I were to solve your versioning problem, I would go in the direction of unifying the vavr versions used throughout your modules so that they all use the same version of the library. If that's not feasible for your project, I would try some form of classloader isolation (OSGi) if you want them to coexist within the same VM, with serialization as the bridge for communication between your modules. That would also be a huge step towards making your app distributable - it's effectively a form of microservice architecture.

I don't think any form of repackaging of vavr between versions would solve your versioning problem by itself, but it would greatly pollute the very nice API vavr currently has.

skestle commented 7 years ago

@nfekete

To me, the main conclusion to these points is, that you cannot really have a public API in your app based on mixed incompatible versions of the library (or any library whatsoever)

That's it in a nutshell. That's why Java never changes (except to create new functionality in new packages), why scala is only adopted in modern enterprises with a fully automated test suite, and why any API that cares about its (perhaps larger/monolithic) users versions its packages (apache, not google).

REST APIs are versioned (if you want them to be long-term and successful).

I'm not sure if you've dealt with OSGi at scale in a large organisation. Version reference ranges lead to large-scale problems across a large deployment stack that overwhelm the average developer. We only use minimums and guarantee backward compatibility. It's better than crazy binary API errors.

Our company has 400+ developers and over 3000 modules, many inactive (much like internet modules). If we change, we need to do manual QA to meet industry regulations. Without this change, upgrading to vavr 2.0 across the organisation would probably cost somewhere in the range of $500,000 (if 10% of our modules used it) - being a dev, I'm probably $1m out.

For everybody else, they get formal notification of the change and the ability to opt in on their own timing. My A, B, C example was not so much about my company's modules as about the normal world of development that (I would have thought) most devs are subject to with maven/ivy/gradle/sbt - seriously, how many devs use OSGi and classpath containers? Well under 10% of the devs we interview have had any exposure to it.

I really don't understand why I'm still having to argue this - there's a reason we have the terms jar/dll/dependency hell. API development is about making users' lives better, and consistently so. It can never be about what we wish the world were like. The API provider should contort their view of the world so that the user gets magic that just works. (Scala taught me that learning it is easy, because the really smart people are on the other side of that API making ridiculously "over-engineered" and complex solutions.)

API versioning is a standard practice in development environments where there's no universal way of separating versions (Project Jigsaw?). When vavr requires Java 9 or 10 or 11 or whatever, then this can be ignored.

nfekete commented 7 years ago

Please note that I'm not a committer or a decision maker in any way for vavr. I'm just a random dude who cares about the future of this project and throws in his unsolicited thoughts. I just fail to understand (yet) what versioning scheme you are proposing and how it would solve your (and possibly other people's) versioning problem. Don't get me wrong, it might very well be my fault, but I just don't see what you recommend for this problem, how it would address the different cases I mentioned earlier, and what the price would be in terms of vavr API design.

skestle commented 7 years ago

Oh - all I mean is that at 1.x.y we have io.vavr.v1, which is always compatible (according to the md documents). At 2.0.0 the package is simply renamed to io.vavr.v2.

Each module can elect the major API version it wishes (or mix the two - most likely when upstream libs use vavr). I guess, to support the dependency resolvers, these would have to be the module identifiers.

Actually, that reminds me of the more annoying problem: I'm using an OSS lib that uses vavr 1, and I'm now stuck at 1 unless I want to go through the process of trying to get a PR into a lib and get it released, or fork or...

We can't consider OSGi a solution IMO - in my experience that (or any other mention of "classloader") excludes over 90% of devs - even seniors these days.

nfekete commented 7 years ago

Okay, then, if I understand correctly, you either don't need to pass around vavr types across module boundaries or you'll need to create glue code which translates between the API versions. I'm seeing code like

class VavrVersionConverters {
    <T> io.vavr.v2.collection.Vector<T> toVavr2(io.vavr.v1.collection.Vector<T> source) {...}
    <T> io.vavr.v1.collection.Vector<T> toVavr1(io.vavr.v2.collection.Vector<T> source) {...}
}

and so on, for every possible type combination you'll need to convert between. With two versions, that's two methods per type; with three versions, it's six (one per ordered pair of versions).

Or use it inline like this:

io.vavr.v2.collection.Vector<SomeType> v2 = io.vavr.v2.collection.Vector.ofAll(v1);

This is too much boilerplate for my taste, but it could work, nevertheless.

danieldietrich commented 7 years ago

@skestle @nfekete I've read and understood both your points of view (v1 and v2). Both are true (v1 ∧ v2) in my opinion, but they lead to contrary results: v1 => a, v2 => ¬a.

We can't have both at the same time a ∧ ¬a.

Maybe it indicates that the versioning problem is not solvable in an adequate way. I think it is no coincidence that the new Java 9 Module System will not solve the versioning problem (see Will There Be Module Hell?).

Trying to solve it will lead to new problems, as @nfekete described.

We should not try to solve the versioning problem in Vavr. It is the job of (build) tools, not the job of a library.

danieldietrich commented 7 years ago

I believe that in a real-world application (with jar hell) the probability is very high that problems will occur (over time) with overlapping transitive dependencies.

It is a general problem that can't be healed by just fixing the namespaces of one library. We need build tools and runtimes that find and solve version problems in general.

skestle commented 7 years ago

@nfekete

you either don't need to pass around vavr types across module boundaries or you'll need to create glue code which translates between the API versions

Or alternatively, you get to automatically pass vavr types across module boundaries, but one of the modules breaks because it hasn't actually upgraded to the incompatible code.

So no change whether we version packages or not.

@danieldietrich

We should not try to solve the versioning problem in Vavr. It is the job of (build) tools, not the job of a library.

Yes, it is the job of build tools, but build tools have not done it for two decades, and still seem years off. Do we leave this problem broken for the 5-10 years we're going to be waiting?

Even the scala compiler has not yet achieved this goal.

It is a general problem that can’t be healed by just fixing the namespaces of one library

I'm not asking you to heal the problem, I'm asking you to not make it worse.

Really, out of the 100 or so OSS bundles we import, only Google's Guava and Hibernate need serious fixing as far as I know. Guava we can namespace; Hibernate has to be patched so that it retains compatibility with older versions. Perhaps it's just because OSS Java effectively was Apache, and they really care about this issue.

We need a build tools and runtimes

So since runtimes are going to be 5++ years away, we're left with build tools. Perhaps it's as simple as doing a dual build: the general one that forces application-wide simultaneous upgrades (at minimum) and causes transitive mayhem, and another that starts with a package rename based on the major version.

I'll look to contribute something in this form in a month or few; for the moment, we'll use Javaslang.

However - I'll disagree with

Trying to solve it will lead to new problems, as @nfekete described.

I did not notice @nfekete raise any new problems, just inconveniences. I think convenience is a brilliant thing to sacrifice if it means eliminating a whole class of bugs and casting away any expectation that major upgrades "will just work".

This problem is solvable for the purposes of software quality, as Apache has demonstrated. The cost is inconvenience, most of which you've subscribed to anyhow with Semantic Versioning - this is just the ultimate step.

nfekete commented 7 years ago

@skestle what would be the difference between renaming the vavr package in major revisions and repackaging vavr with, say, a tool like the Maven Shade Plugin or JarJar? The problem of repackaging seems already solved (unless, of course, there are non-code references to package names, like strings or property files - IDK how the Maven Shade Plugin or JarJar handle those; I don't know of any such references in vavr, though).
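For reference, the relocation approach looks roughly like this in a Maven build; the shaded prefix com.example.shaded is a placeholder:

```xml
<!-- Sketch: rewrite javaslang's package prefix into a private namespace
     at build time, so it cannot clash with another version on the classpath. -->
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <configuration>
    <relocations>
      <relocation>
        <pattern>javaslang</pattern>
        <shadedPattern>com.example.shaded.javaslang</shadedPattern>
      </relocation>
    </relocations>
  </configuration>
</plugin>
```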

So, to make my question clearer: what added benefit would a package rename in vavr bring, compared to repackaging with the Maven Shade Plugin, considering that there's a huge price to pay for it - specifically, all vavr library users would be forced to organize imports throughout their projects when migrating, instead of a simple recompile and occasional source fixes where there might be an incompatibility?

talios commented 7 years ago

On 25 Apr 2017, at 0:46, Nándor Előd Fekete wrote:

I don't think any form of repackaging of vavr between versions would solve your versioning problem by itself, but it would greatly pollute the very nice API vavr currently has.

The only solution I see here would be further, finer-grained separation of packages and release granularity - so that not everything has to change at the same pace. However, due to the nature of how Java the language/runtime is put together, there's only so far one can go with that.

Having a bare-minimum Vavr of io.vavr.control might be useful, but since A LOT of functionality is implemented as default methods on Value, there's a fairly large surface area that has to come along with it. That being said, an io.vavr.base module or the like could be extracted to provide the basics - although, the way things are structured, those base types churn fairly often for performance/optimization reasons, as well as for new functionality.

If only we had proper extension methods and not dumbed down default methods....


danieldietrich commented 7 years ago

@talios yes, I thought about finer-grained packages. But without extension methods we can't separate the conversion methods toXxx() (e.g. toList()) from Values like Option. Also, the new base API interface, which provides shorter factory methods for almost all types, could not be separated from the other packages. Currently, modularisation is not an option for us.

skestle commented 7 years ago

a huge price to pay for it, specifically all vavr library users would be forced to organize imports throughout the project when migrating, instead of a simple recompile and occasional source fixes where there might be incompatibility.

Thank you for clarifying this position. This absolutely baffles me.

"A huge price"? Select all projects, organise imports. 10 seconds? Faster than the simplest incompatibility fix. Even if you don't use an IDE, figuring out the grep command would be about 1 min.

"A simple recompile"? You don't want everybody using vavr? I would if I made this project. Which means recompilation is not simple, because it involves all your dependencies that use vavr.

Someone will believe you and do as you suggest in a corporate environment, release a milestone, and have broken something somewhere else. This can easily waste a whole day if it gets to testing.

One user that hits the problem I describe badly enough could outweigh the entire community (certainly if the community is fewer than 10,000) having to do one minute of work (on top of the minutes they've already spent wanting the upgrade). At our company, we thought Guava was "the new apache commons", and then:

  1. A team decided they wanted v18 stuff
  2. They managed to test their full stack and get through multiple milestones to product release
  3. On a client site (I believe), the product was integrated with one of the options.
  4. Stuff broke - badly.

It must have cost 5 figures to fix it - even with the fast resolution and re-releases we had. This is what your decision can cost.

Besides, you've misrepresented the world I present. My solution forces zero action.

  1. Import vavr 2
  2. Use it in the class you want.
  3. Done - and you can tell your QA dept that only your functionality has changed; no need for system-wide partial regression (as would be required if other modules could be affected)

With versions, you're not forcing any user, or module, or class to take anything they don't subscribe to.

As you say, I have options to repackage. I'm arguing this for the user (ha! the only one that would make this mistake) who doesn't understand the implications.

Declare your API - be proud when you've got something that's different enough that it requires a different name. Make the major vavr API versions pure functions.

Hey, there's nothing stopping you from doing a JarJar or Shade to make an unsafe lib for those who really want it (here be dragons) :p.

nfekete commented 7 years ago

"A huge price"? Select all projects, organise imports. 10 seconds? Faster than the simplest incompatibility fix. Even if you don't use an IDE, figuring out the grep command would be about 1 min.

I think it's not that simple. With type names like Array, BitSet, HashMap, HashSet, Iterator, List, Map, Set, SortedMap, SortedSet, Stream, TreeMap, TreeSet, Vector, organize imports will ask, for each and every one of them, to select the proper type from a list of matching type names - and I've only listed the type names overlapping with JDK types. There may also be cases where someone uses both JDK and vavr types with the same name; if the vavr type happens to be the one that is fully qualified, organize imports won't even touch those references - you would need to go and change them manually.

I agree though that you can do it quite quickly with a shell script, which might be the better idea considering the above.

With all these, I think my point that there is actually a price to pay if we take this route, is still valid.
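The fully-qualified-name problem is easy to reproduce with two JDK classes that share a simple name (java.util.List vs java.awt.List): only one can be imported, the other must stay fully qualified, and automated import cleanup will not rewrite those fully-qualified references:

```java
import java.util.Arrays;
import java.util.List;  // only one "List" can be imported...

public class Main {
    public static void main(String[] args) {
        List<String> names = Arrays.asList("a", "b");
        // ...so the other must remain fully qualified everywhere it is
        // used; "organize imports" cannot migrate these automatically.
        Class<?> other = java.awt.List.class;
        System.out.println(names.size() + " vs " + other.getSimpleName());
    }
}
```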

skestle commented 7 years ago

Thanks @nfekete - I hadn't considered that.

Both issues can be produced by trying to do the "simple thing" after an update - one type of bug through transitive dependencies in the same package, and another through accidental incorrect IDE imports from a different package.

I still think a find/replace is a more "aware" solution, and a better default - especially since zero work needs to be done when adding v2 (as opposed to removing v1 at the same time).

I think that's it for me. I'm happy that my points are understood now. If I haven't been convincing enough then close it; I'll do something when vavr's sufficiently more interesting than javaslang for me to spend time.

danieldietrich commented 6 years ago

This is out-of-scope for Vavr.