Closed tea-dragon closed 10 years ago
But which version of guava?
I had to pull guava (tears) from t-digest because I couldn't figure out an acceptable version.
On Wed, Aug 20, 2014 at 7:26 AM, Ian Barfield notifications@github.com wrote:
the fastutil addition should have never been merged without addressing the enormous and almost entirely untouched mountain of code that it pulls in. there are several possible solutions, but I assert that the burden is on someone who cares more about QDigest to implement one.
conversely, guava is tiny by comparison, has a variety of useful classes, and is widely enough used that most downstream users will
see no increase in dependency count anyway.
You can merge this Pull Request by running
git pull https://github.com/addthis/stream-lib kill-fastutil
Or view, comment on, or merge it at:
https://github.com/addthis/stream-lib/pull/79 Commit Summary
- remove fastutil, add guava, add test-slf4j-simple
File Changes
- M pom.xml https://github.com/addthis/stream-lib/pull/79/files#diff-0 (11)
- M src/main/java/com/clearspring/analytics/stream/quantile/QDigest.java https://github.com/addthis/stream-lib/pull/79/files#diff-1 (31)
Patch Links:
- https://github.com/addthis/stream-lib/pull/79.patch
- https://github.com/addthis/stream-lib/pull/79.diff
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79.
Importing version 16 of Java makes it hard to use streamlib with Hadoop (which still is stuck on version 11).
On Wed, Aug 20, 2014 at 8:19 AM, Ted Dunning ted.dunning@gmail.com wrote:
But which version of guava?
I had to pull guava (tears) from t-digest because I couldn't figure out an acceptable version.
On Wed, Aug 20, 2014 at 7:26 AM, Ian Barfield notifications@github.com wrote:
the fastutil addition should have never been merged without addressing the enormous and almost entirely untouched mountain of code that it pulls in. there are several possible solutions, but I assert that the burden is on someone who cares more about QDigest to implement one.
conversely, guava is tiny by comparison, has a variety of useful classes, and is widely enough used that most downstream users will
see no increase in dependency count anyway.
You can merge this Pull Request by running
git pull https://github.com/addthis/stream-lib kill-fastutil
Or view, comment on, or merge it at:
https://github.com/addthis/stream-lib/pull/79 Commit Summary
- remove fastutil, add guava, add test-slf4j-simple
File Changes
- M pom.xml https://github.com/addthis/stream-lib/pull/79/files#diff-0 (11)
- M src/main/java/com/clearspring/analytics/stream/quantile/QDigest.java https://github.com/addthis/stream-lib/pull/79/files#diff-1 (31)
Patch Links:
- https://github.com/addthis/stream-lib/pull/79.patch
- https://github.com/addthis/stream-lib/pull/79.diff
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79.
not sure if that is hilarious or sad. In any event, they can feel free to continue to use 2.7.0 until they upgrade guava. Apparently using ancient versions of things is their jam so i'm sure they won't mind. There is a good chance that forcing versions via dependency management (in one direction or the other) will 'just work', but there is no chance of me investigating that myself.
On Wed, Aug 20, 2014 at 11:21 AM, Ted Dunning notifications@github.com wrote:
Importing version 16 of Java makes it hard to use streamlib with Hadoop (which still is stuck on version 11).
On Wed, Aug 20, 2014 at 8:19 AM, Ted Dunning ted.dunning@gmail.com wrote:
But which version of guava?
I had to pull guava (tears) from t-digest because I couldn't figure out an acceptable version.
On Wed, Aug 20, 2014 at 7:26 AM, Ian Barfield notifications@github.com
wrote:
the fastutil addition should have never been merged without addressing the enormous and almost entirely untouched mountain of code that it pulls in. there are several possible solutions, but I assert that the burden is on someone who cares more about QDigest to implement one.
conversely, guava is tiny by comparison, has a variety of useful classes, and is widely enough used that most downstream users will
see no increase in dependency count anyway.
You can merge this Pull Request by running
git pull https://github.com/addthis/stream-lib kill-fastutil
Or view, comment on, or merge it at:
https://github.com/addthis/stream-lib/pull/79 Commit Summary
- remove fastutil, add guava, add test-slf4j-simple
File Changes
- M pom.xml https://github.com/addthis/stream-lib/pull/79/files#diff-0 (11)
- M src/main/java/com/clearspring/analytics/stream/quantile/QDigest.java https://github.com/addthis/stream-lib/pull/79/files#diff-1 (31)
Patch Links:
- https://github.com/addthis/stream-lib/pull/79.patch
- https://github.com/addthis/stream-lib/pull/79.diff
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79.
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79#issuecomment-52793754.
I think having a minimal set of depdencies has been good for stream-lib adoption, and tt's gratifying that it's made it's way into the base of a number of depdency sets. If we could kill fastutil without adding guava I think that would be a net improvement. Vendoring a handful of files (like I think we did for one of the digests) is an option.
we could kill fastutil without adding guava, but you would have to be insane not to still consider it a net improvement to do both. I have seen more than a few people or pom file comments complain about the fastutil dependency. I don't believe anyone really cares about guava.
On Wed, Aug 20, 2014 at 11:46 AM, cburroughs notifications@github.com wrote:
I think having a minimal set of depdencies has been good for stream-lib adoption, and tt's gratifying that it's made it's way into the base of a number of depdency sets. If we could kill fastutil without adding guava I think that would be a net improvement. Vendoring a handful of files (like I think we did for one of the digests) is an option.
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79#issuecomment-52797581.
In addition to the comments in this ticket we asked about guava in the mailing list before https://groups.google.com/forum/#!searchin/stream-lib-user/guava/stream-lib-user/HHqaRtYCywU/FRsgNUNGei8J
I don't believe anyone really cares about guava
The issue with fastutil is its weight. It is seldom used, so it is an extra cost for almost every user of stream-lib.
The issue with guava is version management and classpath hell. Guava is so much used that you have to be very careful about the version and API you use: nor too recent, nor deprecated, nor unstable. Even if the Guava team is very conservative, I have see a lot of version conflicts with Guava... Some of them cannot be solved.
I am quite sure you don't want to prevent stream-lib usage from Hadoop or other large projects so you should run integration test with several versions of Guava...
I have no plans to try to delete stream-lib 2.7.0 from maven central. If a large project is already using a wide array of hideously out of date dependencies, then they are one hundred percent free to add stream-lib to that list if it.
"version management and classpath hell" sounds like a hadoop problem more than anything else.
stream-lib is in need of serious improvements in a wide variety of areas. Anything that discourages or slows development should be enemy number one. Otherwise we'll probably end up with just as many options available to hadoop and friends as we have now and would anyway -- 2.7.0 and a small bug fix once every three months.
On Wed, Aug 20, 2014 at 12:09 PM, Clément MATHIEU notifications@github.com wrote:
I don't believe anyone really cares about guava
The issue with fastutil is its weight. It is seldom used, so it is an extra cost for almost every user of stream-lib.
The issue with guava is version management and classpath hell. Guava is so much used that you have to be very careful about the version and API you use: nor too recent, nor deprecated, nor unstable. Even if the Guava team is very conservative, I have see a lot of version conflicts with Guava... Some of them cannot be solved.
I am quite sure you don't want to prevent stream-lib usage from Hadoop or other large projects so you should run integration test with several versions of Guava...
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79#issuecomment-52800892.
We've received great contributions to stream-lib from people working on projects that depend on the open source echo system. I hope for future contributions and I want those people and projects to be able to benefit from future improvements to stream-lib. My suggestion is that we investigate incorporating the minimum set of guava files into this project (as Chris suggested) or consider providing multiple build artifacts of future stream-lib releases that use a variety of tested guava versions. I'd like it to be as easy as possible for people to use stream-lib in their own environments.
Matt
On Wed, Aug 20, 2014 at 12:24 PM, Ian Barfield notifications@github.com wrote:
I have no plans to try to delete stream-lib 2.7.0 from maven central. If a large project is already using a wide array of hideously out of date dependencies, then they are one hundred percent free to add stream-lib to that list if it.
"version management and classpath hell" sounds like a hadoop problem more than anything else.
stream-lib is in need of serious improvements in a wide variety of areas. Anything that discourages or slows development should be enemy number one. Otherwise we'll probably end up with just as many options available to hadoop and friends as we have now and would anyway -- 2.7.0 and a small bug fix once every three months.
On Wed, Aug 20, 2014 at 12:09 PM, Clément MATHIEU < notifications@github.com> wrote:
I don't believe anyone really cares about guava
The issue with fastutil is its weight. It is seldom used, so it is an extra cost for almost every user of stream-lib.
The issue with guava is version management and classpath hell. Guava is so much used that you have to be very careful about the version and API you use: nor too recent, nor deprecated, nor unstable. Even if the Guava team is very conservative, I have see a lot of version conflicts with Guava... Some of them cannot be solved.
I am quite sure you don't want to prevent stream-lib usage from Hadoop or other large projects so you should run integration test with several versions of Guava...
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79#issuecomment-52800892.
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79#issuecomment-52803136.
Classpath hell is not hadoop specific, it is an issue for any large project/ecosystem and for any widely used library.
Fell free to add guava, just be aware that you could end up we lot of complaint from Hadoop users at least. Last time I checked streaming algorithms were about bit manipulation and "main issues" of stream-lib were its design, readability and not so great performance. Guava might help or could motivate people to write a clean implementation of HLL+ from scratch in case of troubles.
AFAIK Spark depends on stream-lib now. Make sure you are compatible with their version of guava.
In all honesty, all I care about is the existence of incrementally more optimal algorithms and implementations thereof. To that end, someone being motivated to write new implementations of anything sounds great to me. I haven't done any real comparisons, but I have no qualms about linking to the existence of this project, for example.
That said, I have clearly overlooked the feelings of some of the stream-lib users. I am content to contribute better programs in less contested ways. I have pushed one last commit that revokes guava's non-test status and addresses a mistakenly leaked test dependency. I will note that the project I linked to earlier also uses fastutil so it is possible that it is worth considering keeping it with plans to expand usage in the future. My personal opinion is that fastutil's implementations are disappointing and that the classes should at least be filtered, but do as you will.
It looks like the scope of the guava dependency in the pull request is "test". I don't understand why there is so much resistance to adding guava. Should we keep the scope as "test" and add optional as true? Will that address the issues?
A test dependency doesn't have nearly the problems of a non-test dependency. Guava just got moved to test scope.
On Wed, Aug 20, 2014 at 10:31 AM, mspiegel notifications@github.com wrote:
It looks like the scope of the guava dependency in the pull request is "test". I don't understand why there is so much resistance to adding guava. Should we keep the scope as "test" and add optional as true? Will that address the issues?
— Reply to this email directly or view it on GitHub https://github.com/addthis/stream-lib/pull/79#issuecomment-52812318.
the fastutil addition should never have been merged without addressing the enormous and almost entirely untouched mountain of code that it pulls in. there are several possible solutions, but I assert that the burden is on someone who cares more about QDigest to implement one.
conversely, guava is tiny by comparison, has a variety of useful classes, and is widely enough used that most downstream users will see no increase in dependency count anyway.