okaywit / guava-libraries

Automatically exported from code.google.com/p/guava-libraries
Apache License 2.0
0 stars 0 forks source link

ServiceManager example deadlocks if service fails on startup #1478

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
The ServiceManager example located at 
http://docs.guava-libraries.googlecode.com/git/javadoc/com/google/common/util/co
ncurrent/ServiceManager.html does not work.

The example deadlocks if any Service fails on startup. The call to 
`manager.stopAsync().awaitStopped(5, TimeUnit.SECONDS);` in the shutdown hook 
never returns.

I can make the example work if I swap the Executor given to manager.addListener 
from a `MoreExecutors.sameThreadExecutor()` to a 
`Executors.newSingleThreadExecutor()` (or other alternative)

Given the following code:

    import com.google.common.collect.ImmutableSet;
    import com.google.common.util.concurrent.*;

    import java.util.Set;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;
    import java.util.logging.LogManager;
    import java.util.logging.Logger;

    public class DeadlockTest {

        public static class ShutdownIgnoringLogManager extends LogManager {
            @Override
            public void reset() throws SecurityException {
                // override reset() so logging handlers still work in shutdown hooks
            }
        }

        static {
            System.setProperty("java.util.logging.manager", ShutdownIgnoringLogManager.class.getName());
        }

        static class TestService extends AbstractIdleService {
            @Override
            protected void startUp() throws Exception {
                throw new IllegalStateException("failed");
            }

            @Override
            protected void shutDown() {
            }
        }

        public static void main(String[] args) {
            final Logger logger = Logger.getLogger(DeadlockTest.class.getName());

            Set<Service> services = ImmutableSet.<Service>of(new TestService());
            final ServiceManager manager = new ServiceManager(services);
            manager.addListener(new ServiceManager.Listener() {
                public void stopped() {}
                public void healthy() {
                    // Services have been initialized and are healthy, start accepting requests...
                }
                public void failure(Service service) {
                    // Something failed, at this point we could log it, notify a load balancer, or take
                    // some other action.  For now we will just exit.
                    System.exit(1);
                }
            },
                    MoreExecutors.sameThreadExecutor()
                    //Executors.newSingleThreadExecutor()
            );

            Runtime.getRuntime().addShutdownHook(new Thread() {
                public void run() {
                    // Give the services 5 seconds to stop to ensure that we are responsive to shutdown
                    // requests.
                    try {
                        logger.info("Shutting down services");
                        manager.stopAsync().awaitStopped(5, TimeUnit.SECONDS);
                        logger.info("Successfully shut down services");
                    } catch (TimeoutException timeout) {
                        logger.severe("Shutdown timeout reached");
                    }
                }
            });
            manager.startAsync();  // start all the services asynchronously
        }
    }

Which is near-identical to the example — I've added a bit of logging and an 
actual Service to start.
This happens:

    Jul 17, 2013 5:24:06 PM com.google.common.util.concurrent.ServiceManager$ServiceListener startTimer
    INFO: Starting TestService [NEW]
    Jul 17, 2013 5:24:06 PM com.google.common.util.concurrent.ServiceManager$ServiceListener failed
    SEVERE: Service TestService [FAILED] has failed in the STARTING state.
    java.lang.IllegalStateException: failed
        at DeadlockTest$TestService.startUp(DeadlockTest.java:29)
        at com.google.common.util.concurrent.AbstractIdleService$1$1.run(AbstractIdleService.java:43)
        at java.lang.Thread.run(Thread.java:724)

    Jul 17, 2013 5:24:06 PM com.google.common.util.concurrent.ServiceManager$ServiceListener finishedStarting
    INFO: Started TestService [FAILED] in 33 ms.
    Jul 17, 2013 5:24:06 PM DeadlockTest$2 run
    INFO: Shutting down services

It never completes.

Swapping out the Executor for a different implementation allows the application 
gracefully shutdown.

System:
$ java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

$ uname -a
Darwin Adams-Mac-mini.local 12.4.0 Darwin Kernel Version 12.4.0: Wed May  1 
17:57:12 PDT 2013; root:xnu-2050.24.15~1/RELEASE_X86_64 x86_64

Original issue reported on code.google.com by a...@relational.io on 17 Jul 2013 at 7:26

GoogleCodeExporter commented 9 years ago
Guava version: 14.0.1

Original comment by a...@relational.io on 17 Jul 2013 at 7:31

GoogleCodeExporter commented 9 years ago

Original comment by kak@google.com on 17 Jul 2013 at 7:46

GoogleCodeExporter commented 9 years ago
The issue is that if a listener was configured to use the sameThreadExecutor it 
would hold the internal state lock on the service manager and block other state 
modifying operations.  So if you performed a long running or blocking operation 
you could block state transitions.  So in this case System.exit() never returns 
(by spec) and so the lock is never released and awaitStopped will never get 
called.  In fact an identical issue existed in AbstractService as well.

This was fixed in 
https://code.google.com/p/guava-libraries/source/detail?r=efa4d2b38592c8ab69b560
733eac47d3914e78e7&path=/guava/src/com/google/common/util/concurrent/ServiceMana
ger.java on April 8th by ensuring that no user code (i.e. listeners) are run 
with the lock held.

The workaround of executing the listener in another thread (using the 
singleThreadExecutor() as you suggest) is the best way to work around this in 
guava 14.0.1, sorry for the trouble.

Original comment by lu...@google.com on 2 Aug 2013 at 6:39

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 2 Aug 2013 at 6:49

GoogleCodeExporter commented 9 years ago
This issue has been migrated to GitHub.

It can be found at https://github.com/google/guava/issues/<issue id>

Original comment by cgdecker@google.com on 1 Nov 2014 at 4:12

GoogleCodeExporter commented 9 years ago

Original comment by cgdecker@google.com on 3 Nov 2014 at 9:08