Closed GoogleCodeExporter closed 9 years ago
Thanks very much! Bob has agreed to fix this issue. He's decided not to use
your
patch, and will fix it a slightly different way, but thanks for going to the
extra
effort anyway.
Original comment by kevin...@gmail.com
on 28 Jul 2008 at 7:00
A fix is a fix is a fix... :) There is always more then one way - I just took
the
fastest and least complicated way I could think of.
Thanks.
Original comment by nairb...@gmail.com
on 28 Jul 2008 at 7:43
FYI: I just tested the new version of FinalizableReferenceQueue with Guice on
an OSGi
container, and I can confirm it now allows the classloader to be garbage
collected :)
Original comment by mccu...@gmail.com
on 26 Nov 2008 at 5:24
This isn't solved for me (under Glassfish) but I'm not 100% sure I am
interpreting
the results correctly. See
http://groups.google.com/group/google-guice/msg/ff53f95f422edabf
Original comment by gili.tza...@gmail.com
on 27 Nov 2008 at 2:31
I'd need to know more about the way you're loading Guice into the classloader
hierarchy to comment, and also if you're seeing any warnings or exceptions
relating
to the Finalizer thread. What I can say is that it does unload OK under OSGi,
which
suggests that something else is keeping the classloader alive in your case.
Note that the Finalizer thread won't shutdown until the Guice classloader is
eligible
for GC, so it will appear as a GC root until that happens - you may think it
looks
like a leak, but actually it will clear itself once other references to the
Guice
classloader are cleared. So I'd dig deeper into the heapdump for other causes.
You may also need to force finalization and GC a few times using the System API.
Original comment by mccu...@gmail.com
on 27 Nov 2008 at 3:01
I did some digging and found the problem - when running under JEE it seems
there are
several additional references from the Finalizer thread to the container
classloader
via access control contexts (presumably because of additional protection
domains).
Here's the complete list of references I found and some possible solutions:
1. Finalizer -> Thread.inheritedAccessControlContext -> ... -> Web CL
Set in Thread constructor to 'AccessController.getContext()' ie. the
context of the calling thread, this may refer to the Web classloader.
Solution: wrap creation of Finalizer thread in doPrivileged block
2. URLClassLoader -> URLClassLoader.acc -> ... -> Web CL
As above URLClassLoader constructor sets 'acc' field to current context.
Solution: doPrivileged block doesn't help here, the only way I found to
avoid this reference was to use reflection to null the 'acc' field which
is yucky, anyone know a better option? Perhaps we could grab defineClass
from the system classloader (again with reflection) and use that to load
the Finalizer into the system classloader, but this is also yucky :(
3. Finalizer -> Thread.inheritableThreadLocals -> ... -> Web CL
Each thread inherits an inheritableThreadLocals map from its parent, and
this could include a reference back to the container classloader (depends
on the app) - again no easy fix for this one, had to resort to reflection
to null out the field :(
4. Finalizer -> Thread.contextClassLoader -> Web CL
Yet another setting inherited from the calling thread, but at least here
there's a public API 'setContextClassLoader()' we can use to clear it :)
With these references removed I can completely unload Guice from GlassFish v2.
I've attached a patch in case people are interested - but because of the hacky
use of reflection to clear certain private fields, I'm not too happy with it.
I'm wondering instead whether it would be better to provide a public API people
could use to shutdown the thread, which would then make a lot of this code moot.
Anyway, Happy Thanksgiving!
Original comment by mccu...@gmail.com
on 28 Nov 2008 at 3:14
Attachments:
Nice detective work, mcculls!
Original comment by crazybob...@gmail.com
on 30 Nov 2008 at 2:38
FYI here are the safe fixes that I think should be used in google-collections,
such
as the use of doPrivileged() when creating the classloader and the Finalizer
thread.
The TCCL should also be cleared, just in case.
The two other references (possible inherited thread-local and access control
context)
are best fixed in the web container - certainly the thread-local issue is
fixable in
GlassFish. However, still not sure about the remaining access control context
issue.
Original comment by mccu...@gmail.com
on 2 Dec 2008 at 9:55
Attachments:
We'll address all of this in time for 1.0. I'm in favor of the reflection hacks
over
an explicit shutdown hook.
Original comment by crazybob...@gmail.com
on 17 Mar 2009 at 9:48
Original comment by kevin...@gmail.com
on 18 Mar 2009 at 2:21
Original comment by kevin...@gmail.com
on 17 Sep 2009 at 5:45
Original comment by kevin...@gmail.com
on 17 Sep 2009 at 5:46
Seems this won't be in 1.0 after all.
Original comment by kevin...@gmail.com
on 16 Oct 2009 at 9:01
Could you please reconsider this? Resource leak like this leads to frequent
PermGenErrors and thus to app server restarts, causing developers to need to
wait for
several minutes on each restart. Additionally, there are other redeployment
issues at
least under Windows because of file-locking (jar files can't be deleted or
moved).
Unfortunately there aren't any (usable) workarounds, so it would be really
great if
this could be included in 1.0.
Original comment by syva...@yahoo.com
on 26 Oct 2009 at 2:06
As for workarounds: I've experimented with a ServletContextListener that finds
any
Finalizer Threads and clears all the fields that Stuart mentioned.
http://pastie.org/686765 . I've tested for Guice Finalizer under Jetty and
Tomcat and
it works.
For Servlet environments afaik ServletContextListener is the only place you're
not
discouraged or explicitly forbidden to spawn threads. Therefore I am using a
custom
Finalizer that creates just one Finalizer in SCL and makes sure it is disposed
at the
end.
Original comment by alen_vre...@yahoo.com
on 6 Nov 2009 at 6:13
Will this cause a resource leak merely by including the jar on the classpath?
Or
will it only cause a problem if FinalizableReferenceQueue or classes using it
(e.g.
MapMaker) is used/referenced? If the latter is the case, one could presumably
work
around the issue by making sure that the class isn't referenced.
Some sort of fix or workaround seems like a fairly critical thing to have,
since I
imagine many (most?) people who would want to use the google collections
library are
running under JEE. The PermGen issue won't necessarily occur immediately as
they
may not redeploy right away, so users may not even know why PermGen errors
suddenly
started happening when they redeploy their war files.
Original comment by pseudony...@gmail.com
on 12 Nov 2009 at 12:26
Any news on this issue?
Original comment by terciofi...@gmail.com
on 11 Apr 2010 at 9:49
No. Bob was going to do it but... then he didn't. Bob?
Original comment by kevinb@google.com
on 12 Apr 2010 at 11:43
Today I received a PermGen error from my beloved Tomcat, for wich the
JreMemoryLeakPreventionListener logged the error message "SEVERE: A web
application
appears to have started a thread named [com.google.inject.internal.Finalizer]
but has
failed to stop it. This is very likely to create a memory leak.".
After some research, I figured this has to do with the issue addressed here, so
I'm
wondering, is it about to be fixed, or is there some API to close the thread
started by
Guice?
Original comment by pawjans...@gmail.com
on 29 May 2010 at 9:48
Original comment by kevinb@google.com
on 30 Jul 2010 at 3:53
Original comment by kevinb@google.com
on 30 Jul 2010 at 3:56
Issue 382 has been merged into this issue.
Original comment by kevinb@google.com
on 8 Sep 2010 at 7:00
No news about this issue? Does anybody have a workaround?
Original comment by terciofi...@gmail.com
on 9 Sep 2010 at 8:14
I talked to Bob about it today. We're considering reading a system property --
shocking, I know -- that would tell MapMaker not to start any background thread
(but perform cleanup in the user thread). Using that would make this problem
go away for you.
Original comment by kevinb@google.com
on 10 Sep 2010 at 6:36
I might be missing something, but couldn't you just add a shutdown method?
The problem is most severe in app servers and often you can't add system
properties there, so the workaround wouldn't be very widely applicable.
You could just document the shutdown method as unsupported and not part of
public API. Perhaps even advice that it's existence should be checked by
reflection (and thus also called by reflection).
Original comment by syva...@yahoo.com
on 10 Sep 2010 at 6:57
The suggested solution of using a System property is unfortunately useless for
a vast number of deployment scenarios and does nothing to fix the problem. +1
for just not creating a Finalizer thread in a library and instead piggybacking
expiry on user-level operations every <hopefully customizable n> operations
instead.
Here's another idea that builds on this. How about requiring a user-supplied
Executor in which to run instead, as additional argument to the expiration()
builder method or the public FinalizerQueue class? This decouples enabled
expiry from having to create a physical thread at all, and allows proper
lifecycle control by the caller, instead of forcing the library to fly blind
and make guesses about what to do. The caller then is not only responsible, but
also now actually in a position (!) to do environment-specific lifecycle
control: quiesced waiting, forced shutdown, Bundle.stop(),
ServletContextListener, whatever.
While we're at it, the same Executor could also be used to invoke the listener,
if added by the builder.
One new caveat would be that a user can now share an Executor between multiple
CCHMs. If the Executor had really only one or any other fixed number of
threads, then too many continuously running-and-blocking-without-timeout
Finalizer.run() in their current form might start clogging the Executor. If
reference cleaning were batched per CCHM, the run() loop could terminate after
doing whatever is necessary.
As far as I can tell this solves the Finalizer thread lifecycle problem (caused
really by the nondeterministic behaviour of its ref queue polling and therefore
non-termination) and also the thread-per-CCHM overhead (n CCHMs use n Finalizer
threads for no really good reason).
Original comment by holger.h...@googlemail.com
on 10 Sep 2010 at 9:08
+1 for a shutdown method, this is much better as in my deployment scenario I
already have a shutdown for my stuffs and I think that the ContextDestroyed
could do this job.
Original comment by terciofi...@gmail.com
on 10 Sep 2010 at 1:00
I'm sorry, if it's true that our intended solution is "useless" and "does
nothing to fix the problem" would it be too much trouble to explain why that is?
Original comment by kevinb@google.com
on 10 Sep 2010 at 1:48
Well, I don't think that this solution is useless nor does nothing to fix the
problem, I just think that isn't the best way to solve it, just that.
Maybe people upon can explain their reasons about that...
Original comment by terciofi...@gmail.com
on 10 Sep 2010 at 2:13
Piggybacking on access is not useless (which is what I wrote), System
properties are - for the reasons that other people have written as well. You
don't know whether the System properties are accessible at all (security), up
to date, or have just changed underneath you on the fly. Different applications
or even different client classes in the same VM can have different opinions
which setting is appropriate for them.
Original comment by holger.h...@googlemail.com
on 10 Sep 2010 at 2:19
Comment #30 made a good point. A system property provides less flexibility if
two or more (web) apps are running on the same VM (server). One app might rely
on the background processing wheres another is ok with running in the user
thread.
Original comment by eclipseguru@gmail.com
on 10 Sep 2010 at 5:07
Issue 517 has been merged into this issue.
Original comment by kevinb@google.com
on 12 Jan 2011 at 8:36
Hi everyone, I know it's been a long time getting this resolved, and I'm
committed to dealing with this over the next few months.
One question though: looking at FinalizableReferenceQueue leads me to believe
that this whole problem can be avoided by loading Guava in the system class
loader. Is this an option for you?
Original comment by fry@google.com
on 13 Jan 2011 at 7:39
Original comment by yrfselrahc@gmail.com
on 13 Jan 2011 at 8:11
Not really, each webapp should run in its own ClassLoader. You might end up
with two different webapps depending on different versions of Guice.
I find the history of this bug quite silly. We could have added a shutdown
method back in 2008 and been done with it. If you magically came up with a
cleaner solution at a later time you could have deprecated and removed this
method. Seeing as this method will be invoked at most once in every
application, removing it (when the time comes) should cause minimal disturbance.
Original comment by cowwoc...@gmail.com
on 13 Jan 2011 at 8:23
Agreed that Guava should be loaded at the webapp level and not the system class
loader.
I'm guessing that the scale that Google operates, you may have one process or
web application per system. Smaller deployments will may have many web
applications loaded in a single application server, e.g. Tomcat, so here you
would want each one to have its own copy of Guava. Or even Google Collections,
if they are very old and haven't been worked on to be updated to use Guava.
You wouldn't want to be constrained in one web application which Guava version
to use in another web application.
Original comment by blair-ol...@orcaware.com
on 16 Jan 2011 at 7:28
FWIW I've put together a potential patch for the related Guice Finalizer issue:
http://code.google.com/p/google-guice/issues/detail?id=288#c30
It uses an Executor approach instead of a shutdown method because that provides
the container with much more flexibility in how it schedules the background
work. It also avoids the potential pitfalls of a shutdown method - namely who's
responsible for calling it, what happens if more work comes in after the
shutdown begins, what happens if the shutdown call blocks, etc. (i.e. same
reasons why Thread.stop() is deprecated).
The patch currently uses a system property to configure the executor - while
this is not ideal, it does allow people to experiment with the feature without
(yet) committing to an API (as this is an implementation detail for Guice it's
harder to come up with something as neat as a builder method on MapMaker...).
Experimenting with the various options I think the Executor approach is
definitely the way to go - how this is configured is another question. Using
something like a system property might be a reasonable compromise if people are
loathe to commit to a particular API at the moment.
Original comment by mccu...@gmail.com
on 17 Jan 2011 at 2:04
[deleted comment]
We don't use Guice, but Spring for dependency injection, throughout our
webapps, so as a little background for us, how would the patch you're working
on impact Guava? I saw a mention that Guice includes a copy of portions of
Guava?
Since we use Spring, I wouldn't mind a public shutdown method that I can use to
properly clean everything up in Guava.
Original comment by blair-ol...@orcaware.com
on 19 Jan 2011 at 12:53
As Holger mentioned back in comment #26, Guava could add selection of the
Executor to the MapMaker API which would be better than using System.properties
- the basic implementation behind the scenes would be similar to the proposed
Guice patch and would work regardless of whether you're using Guice, Spring, or
new.
Executors are better than a single static shutdown method for the same reason
that Thread.stop is deprecated.
Original comment by mccu...@gmail.com
on 19 Jan 2011 at 9:38
Original comment by kevinb@google.com
on 27 Jan 2011 at 1:33
Original comment by yrfselrahc@gmail.com
on 27 Jan 2011 at 3:11
Original comment by fry@google.com
on 22 Mar 2011 at 6:27
Original comment by fry@google.com
on 23 Mar 2011 at 1:49
@fry does this mean there isn't a solution in sight?
Original comment by tj.rothw...@gmail.com
on 23 Mar 2011 at 2:53
It means that I'm tired of changing the milestone. :-)
In practice I have a working solution, but it breaks many internal tests so it
will take a while to work through the carnage. I have high hopes for release 10.
Original comment by fry@google.com
on 23 Mar 2011 at 10:38
May I ask that you please share the approach (or the code) with us so that we
can complain in advance, not just after it has been committed? :-)
Original comment by holger.h...@googlemail.com
on 23 Mar 2011 at 10:54
Sure, the idea is to have a ReferenceQueue inside of each MapMaker (instead of
globally), and to drain it on write operations (or on occasional reads in the
absence of writes).
The internal testing issue is that we have tests which assume that simply
allowing time to pass will result in cleanup, which will no longer be the case.
Does this approach also sound problematic in your use cases?
Original comment by fry@google.com
on 23 Mar 2011 at 10:58
Sounds good to me :)
Original comment by cowwoc...@gmail.com
on 23 Mar 2011 at 12:09
Hitching a ride on other operations (just like in lockfree algorithms) should
work as long as the Map is occasionally used, and getting rid of the threads &
shared queues should fix all the stale/cross classloader issues like web app
undepoy leaking. Very nice! Like it.
Original comment by holger.h...@googlemail.com
on 23 Mar 2011 at 12:22
Original issue reported on code.google.com by
nairb...@gmail.com
on 28 Jul 2008 at 5:39Attachments: