Closed by GoogleCodeExporter 9 years ago
I'm guessing you mean an interner that would use permgen space via
String.intern() (otherwise it's no different from Interners.newWeakInterner()).
I had never considered this idea since we wrote this code in response to the
pitfalls of String.intern().
I have heard fairly negative things about how well String.intern() is
implemented, and I suspect you may even be better off with one of
ours (especially once we rewrite it to use MapMaker).
Original comment by kevinb@google.com
on 8 Sep 2010 at 6:34
I haven't heard about problems with String.intern(), although my ear is not
especially close to the ground.
I would assume that it wouldn't be any worse than a weak map. It's native, and
could always be improved if someone working on Java got a wild hair.
On my project, we intern certain Strings read out of the database or
configuration files, as they are likely to be from a small set of repeated
values. I suppose we could just use an Interner for this, as I am already doing
for various immutable collections: I read many records from the database
containing sets of enum attributes, so it's worthwhile for me to share
references where possible.
We can certainly write our own Interner that calls String.intern(). I just
thought it might be worthwhile to add it to Interners: code that mixes
Interners.newWeakInterner() with String.intern() ends up with two separate
pools of interned Strings, which is not ideal.
Original comment by ray.j.gr...@gmail.com
on 10 Sep 2010 at 10:56
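For illustration, a minimal sketch of the String.intern()-backed interner proposed above, next to a map-backed one, showing how two pools can disagree on the canonical instance. `SimpleInterner`, `JvmStringInterner`, `MapInterner`, and `InternerSketch` are hypothetical names; `SimpleInterner` only mirrors the single-method shape of Guava's `Interner<E>` so the sketch compiles without Guava, and the map-backed class holds strong references, so it is not a stand-in for `Interners.newWeakInterner()`.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical local interface with the same single-method shape as Guava's Interner<E>.
interface SimpleInterner<E> {
    E intern(E sample);
}

// Delegates to the JVM's built-in string pool.
final class JvmStringInterner implements SimpleInterner<String> {
    @Override
    public String intern(String sample) {
        return sample.intern();
    }
}

// Keeps its own pool with strong references (an assumption made for brevity).
final class MapInterner<E> implements SimpleInterner<E> {
    private final ConcurrentMap<E, E> pool = new ConcurrentHashMap<>();

    @Override
    public E intern(E sample) {
        E existing = pool.putIfAbsent(sample, sample);
        return existing == null ? sample : existing;
    }
}

class InternerSketch {
    public static void main(String[] args) {
        SimpleInterner<String> jvm = new JvmStringInterner();
        SimpleInterner<String> map = new MapInterner<>();

        // Two equal but distinct String objects, as if read from a database.
        String a = new String("status:ACTIVE");
        String b = new String("status:ACTIVE");

        // Each pool is consistent with itself.
        System.out.println(jvm.intern(a) == jvm.intern(b));
        System.out.println(map.intern(a) == map.intern(b));

        // The two pools are independent, so their canonical instances can differ.
        System.out.println(jvm.intern(a) == map.intern(a));
    }
}
```

Mixing the two in one code base means a String canonicalized by one pool is not necessarily reference-equal to the same value canonicalized by the other.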
Whether to do anything here depends on getting some good benchmarks of Interner
vs. String.intern() performance. And we're going to reimplement Interner a
bit, so results from after that will be most relevant. Holding open for now,
but not much to do just yet.
Original comment by kevinb@google.com
on 19 Mar 2011 at 3:43
Here's something I hadn't thought of before:
char[] bigchars = new char[1000000];
Arrays.fill(bigchars, 'z');
String big = new String(bigchars);
String small = big.substring(5, 5);
'small' is now an empty String, but with a strong reference to the 'bigchars'
array. If I were interning my strings, I would not want it to become the
canonical empty String.
String.intern() appears to always create a new String if it didn't previously
have a mapping (contradicting the javadoc). If I intern small I get a different
reference back, but the same applies to big. :/
Original comment by ray.j.gr...@gmail.com
on 7 Apr 2011 at 11:49
Edit: 'small' has reference to a copy of the 'bigchars' array...
Original comment by ray.j.gr...@gmail.com
on 8 Apr 2011 at 1:55
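A small sketch for checking the identity behaviour described above on a given JVM. Whether intern() hands back the receiver itself or a different reference has varied across JVM versions, so the sketch prints those results rather than assuming them. `InternIdentityCheck` and `build` are hypothetical names.

```java
import java.util.Arrays;

class InternIdentityCheck {
    // Builds a String at run time so it is not a compile-time constant.
    static String build(char fill, int length) {
        char[] chars = new char[length];
        Arrays.fill(chars, fill);
        return new String(chars);
    }

    public static void main(String[] args) {
        String big = build('z', 1_000_000);
        String small = big.substring(5, 5); // empty String derived from 'big'

        // The literal "" is already in the pool, so interning any empty String
        // yields that pooled instance.
        System.out.println("small.intern() == \"\": " + (small.intern() == ""));

        // JVM-dependent: does intern() return the receiver when there was no
        // previous mapping, or a different reference?
        System.out.println("big.intern() == big: " + (big.intern() == big));
        System.out.println("small.intern() == small: " + (small.intern() == small));
    }
}
```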
FYI, rough benchmarking suggests newWeakInterner is 7x as fast as
String.intern(), though I'm not sure how much of that is accounted for by that
string copy you mention, and of course the real-life consequences are highly
situation-dependent.
Original comment by kevinb@google.com
on 6 May 2011 at 5:11
(and thanks to jim.andreou for that benchmark!)
Original comment by kevinb@google.com
on 6 May 2011 at 5:12
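For anyone wanting a rough comparison of their own, here is a crude timing sketch. It is not the benchmark referred to above: it compares String.intern() with a strong ConcurrentHashMap pool rather than Guava's newWeakInterner, the key set is made up, and a hand-rolled loop like this is easily distorted by JIT and GC effects, so a proper harness such as JMH would be needed for numbers worth quoting. `InternTiming` and its methods are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class InternTiming {
    // Accumulates results so the JIT cannot discard the loops as dead code.
    static volatile long sink;

    // Builds `count` distinct keys at run time (a made-up workload).
    static String[] makeKeys(int count) {
        String[] keys = new String[count];
        for (int i = 0; i < count; i++) {
            keys[i] = "key-" + i;
        }
        return keys;
    }

    // Times `rounds` passes of String.intern() over the keys, in nanoseconds.
    static long timeJvmIntern(String[] keys, int rounds) {
        long checksum = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String key : keys) {
                checksum += key.intern().length();
            }
        }
        long elapsed = System.nanoTime() - start;
        sink += checksum;
        return elapsed;
    }

    // Times the same passes against a fresh, strong ConcurrentHashMap pool.
    static long timeMapIntern(String[] keys, int rounds) {
        ConcurrentMap<String, String> pool = new ConcurrentHashMap<>();
        long checksum = 0;
        long start = System.nanoTime();
        for (int r = 0; r < rounds; r++) {
            for (String key : keys) {
                String existing = pool.putIfAbsent(key, key);
                checksum += (existing == null ? key : existing).length();
            }
        }
        long elapsed = System.nanoTime() - start;
        sink += checksum;
        return elapsed;
    }

    public static void main(String[] args) {
        String[] keys = makeKeys(100_000);

        // Warm-up passes so both paths are likely JIT-compiled before measuring.
        timeJvmIntern(keys, 5);
        timeMapIntern(keys, 5);

        System.out.println("String.intern(): " + timeJvmIntern(keys, 20) / 1_000_000 + " ms");
        System.out.println("map pool:        " + timeMapIntern(keys, 20) / 1_000_000 + " ms");
    }
}
```

Note that after the warm-up the keys are already in the JVM's pool, so the measured String.intern() calls are mostly lookups, whereas the map pool is rebuilt on each call to timeMapIntern.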
Oh come on. You know my numbers aren't citable!
Anyway, *my* conclusion is that it seems safe to call the various Interner
implementations faster than String#intern(). For another uncitable data point,
I noticed that the difference was smaller (like 2x-2.5x) when I tried with
strings with lots of prefix overlap, so there might be some trie hiding under
intern(). Kevin, why don't you ask your officemates about this?
Original comment by jim.andreou
on 6 May 2011 at 11:08
Original comment by kevinb@google.com
on 11 May 2011 at 3:44
This issue has been migrated to GitHub.
It can be found at https://github.com/google/guava/issues/<id>
Original comment by cgdecker@google.com
on 1 Nov 2014 at 4:15
Original comment by cgdecker@google.com
on 3 Nov 2014 at 9:09
Original issue reported on code.google.com by
ray.j.gr...@gmail.com
on 11 Aug 2010 at 11:45