tomasbilek / gwteventservice

Automatically exported from code.google.com/p/gwteventservice

Reconnect after server-restart #46

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?

We use the following properties:
eventservice.time.waiting.max=200001
eventservice.time.waiting.min=0
eventservice.time.timeout=300001
eventservice.reconnect.attempt.count=10
eventservice.connection.id.generator=de.novanic.eventservice.service.connection.id.SessionConnectionIdGenerator
eventservice.connection.strategy.server.connector=de.novanic.eventservice.service.connection.strategy.connector.longpolling.LongPollingServerConnector

We have a Server-Servlet which sends the events with:
    addEvent(THE_DOMAIN, notification);

The Client receives events with:
    Domain domain = DomainFactory.getDomain(THE_DOMAIN);
    RemoteEventService remoteService = RemoteEventServiceFactory.getInstance().getRemoteEventService();
    remoteService.addListener(domain, new RemoteEventListener() {
      @Override
      public void apply(Event event) {
        if (event instanceof NotificationGwt) {
          GwtNotificationPublisher.getInstance().publish((NotificationGwt) event);
        }
      }
    });

When we start the server and connect the client to it, everything works as 
expected: events are received by the client.
When we kill the Tomcat server and restart it, the client tries to reconnect. 
Once the server has fully (and successfully) restarted, however, the client 
does not reconnect and stops receiving events.

What is the expected output? What do you see instead?

We would expect the client to reconnect after the server restart, and events 
to still be sent by the server and received by the client.

GWT-Development-Mode-Console Output:
[INFO]  - Module eeee has been loaded
[INFO]  - Client: Activate RemoteEventConnector for domain "the_domain".
[INFO]  - Client: RemoteEventConnector activated.
=> receiving a notification from server:
[INFO]  - Publishing change notification to subscribers.
=> killing the server and restarting it:
[ERROR] - Client-Error: Error on processing event!
[INFO]  - Client: Reconnecting after error...
[ERROR] - Client-Error: Error on processing event!
[INFO]  - Client: Reconnecting after error...
=> server successfully restarted:
[INFO]  - Client: RemoteEventConnector deactivated.
=> now, events are not received anymore.

What version of the product are you using? On what operating system?

Tomcat v7.0, IE9, gwtEventService-1.2.0
IE9 and Tomcat run for tests on same host.

Please provide any additional information below.

The server takes approx. 60 seconds to start up.

Original issue reported on code.google.com by fuellema...@gmail.com on 27 Apr 2012 at 2:29

GoogleCodeExporter commented 8 years ago
In my case we're not using custom properties but depend on the default 
configuration. Hence, GWTEventService should make two reconnect attempts (as 
per 
http://code.google.com/p/gwteventservice/source/browse/trunk/conf/eventservice.properties). 
However, the HTTP monitor in Firefox shows that no (reconnect) requests are 
sent once a "regular" request fails because the server went down.

Original comment by marcel@frightanic.com on 12 Jul 2012 at 3:12

GoogleCodeExporter commented 8 years ago
This functionality is not implemented. The current auto-reconnect feature 
reconnects the client when there is a short connection problem, but not when 
the server is completely restarted or down for several minutes or hours.

While it is down, the server also cannot receive or deliver any events. How 
should the user be informed about the server restart? The events of the 
client/user aren't processed, and the user should be informed through the UI 
of your application. This cannot be done by GWTEventService.

Original comment by sven.strohschein@googlemail.com on 16 Jul 2012 at 5:25

GoogleCodeExporter commented 8 years ago

Original comment by sven.strohschein@googlemail.com on 16 Jul 2012 at 5:35

GoogleCodeExporter commented 8 years ago
I disagree, this can and should be implemented.

It's true, if we have an interactive application it has to handle 
server-not-available situations anyway. Example: server goes down while user is 
filling in a form, he pushes submit -> bang, error.
But what about kiosk-style apps with no or very little user interaction? The 
front-end may be running unattended for hours but still receive events from an 
event service. Then the server goes down for maintenance or software upgrade 
and comes back 10min later. An event service client does detect connection 
failures anyway. Why can't it be more graceful and give the server some slack 
instead of just leaving the game? It could either try to reconnect n times for 
the next p minutes with fixed periods or increase the delay until the next 
reconnect attempt with every failed attempt (eg. 1min, 2min, 4min, 8min 
periods).
Or think about SSO-enabled apps. The server goes down while the user is taking 
a 5min coffee break. When he gets back and continues working he might not even 
notice that the app was down in the meantime. Sure, his old session is gone 
but a new one was created behind the scenes (remember, it's SSO). Of course, it 
depends on the app whether session state is persisted or whether there is a 
server-side session state at all. This is a particularly bad situation because 
the user doesn't realize he doesn't get any events since everything else works 
just fine.
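
The two retry policies suggested above (fixed periods, or a delay that doubles after every failed attempt) can be sketched in plain Java. This is only an illustration: ReconnectSchedule and its methods are hypothetical names, not part of GWTEventService, and the minute values are the ones from the example above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of GWTEventService. */
final class ReconnectSchedule {

  private ReconnectSchedule() {
  }

  /** Fixed periods: every attempt waits the same number of minutes. */
  static List<Long> fixed(int attempts, long periodMinutes) {
    final List<Long> delays = new ArrayList<>();
    for (int i = 0; i < attempts; i++) {
      delays.add(periodMinutes);
    }
    return delays;
  }

  /** Doubling: the delay doubles after every failed attempt. */
  static List<Long> doubling(int attempts, long firstDelayMinutes) {
    final List<Long> delays = new ArrayList<>();
    long delay = firstDelayMinutes;
    for (int i = 0; i < attempts; i++) {
      delays.add(delay);
      delay *= 2;
    }
    return delays;
  }
}
```

ReconnectSchedule.doubling(4, 1) yields the 1min, 2min, 4min, 8min periods mentioned above, which together cover a 15-minute outage with only four requests.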

Original comment by marcel@frightanic.com on 16 Jul 2012 at 6:52

GoogleCodeExporter commented 8 years ago
You are right that it would make sense to support auto-reconnect even after a 
server restart. I have changed the issue to enhancement instead of discarding 
the issue.

As I said before the current implementation handles the case that the client 
loses the connection. That is working as intended. When the server is 
restarted, all information about the registered clients is lost. Therefore it 
is another situation which isn't covered by the current implementation of 
auto-reconnect. That is why I have changed the issue from bug to enhancement.

Original comment by sven.strohschein@googlemail.com on 17 Jul 2012 at 7:35

GoogleCodeExporter commented 8 years ago
> I have changed the issue to enhancement instead of discarding the issue.

Ok, for now that's all I was asking for. I didn't know about the subtle 
differences with a "reconnect" feature. Thanks for the clarifications.
Now, if you have a release road map for the next few months (v1.2.1? v1.3?), 
expected release dates and an idea into which release this enhancement will 
make it I'd be even happier.

Original comment by marcel@frightanic.com on 17 Jul 2012 at 7:42

GoogleCodeExporter commented 8 years ago
Planned for version 1.3.

Original comment by sven.strohschein@googlemail.com on 7 Aug 2012 at 1:49

GoogleCodeExporter commented 8 years ago
In the meantime we built a primitive reconnector ourselves. I'll dump the code 
here with some explanations and a few words about its limits.

Here's one of the two hooks:

GWT.setUncaughtExceptionHandler(new UncaughtExceptionHandler() {

  @Override
  public void onUncaughtException(Throwable e) {
    if (isGwtEventServiceInitException(e)) {
      reconnector.reconnect();
    } else {
      ...
    }
  }
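});

// True if the throwable is a RemoteEventServiceRuntimeException whose message
// starts with the activation/initialization error prefix.
private static boolean isGwtEventServiceInitException(final Throwable t) {
  return t instanceof RemoteEventServiceRuntimeException && t.getMessage() != null
      && t.getMessage().startsWith("Error on activating / initializing");
}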

Now to the Reconnector itself:

/**
 * Attempts to reconnect the GWTEventService to the server. You may invoke this call as often as
 * you want. If the number of calls exceeds the object's internal limit (max reconnect attempts)
 * it will display an error message instead. Hence, whenever a connection failure is detected
 * somewhere in the application the reconnector should be called to do its magic.
 */
public class Reconnector {

  private static final int MAX_RECONNECT_ATTEMPTS = 3;
  private static final int RECONNECT_DELAY_BASIS_MILLIS = 1000;
  private int reconnectAttemptsCounter = 0;

  /**
   * C'tor.
   */
  public Reconnector() {
    addUnlistenListener();
  }

  private void addUnlistenListener() {
    getRemoteService().addUnlistenListener(Scope.LOCAL, new UnlistenEventListenerAdapter() {

      @Override
      public void onUnlisten(UnlistenEvent unlistenEvent) {
        GWT.log("UnlistenEvent occurred: isTimeout=" + unlistenEvent.isTimeout() + " isLocal="
            + unlistenEvent.isLocal() + " userId="
            + (unlistenEvent.getUserId() == null ? "" : unlistenEvent.getUserId()) + " domains="
            + (unlistenEvent.getDomains() == null ? "" : unlistenEvent.getDomains()));
        reconnect();
      }
    }, null);
  }

  /**
   * Attempts to re-establish the standing connection to the server for the event listener.
   */
  public void reconnect() {
    reconnectAttemptsCounter++;
    if (maxReconnectAttemptsReached()) {
      GWT.log("Number of reconnect attempts " + reconnectAttemptsCounter
          + " above max retry limit (" + MAX_RECONNECT_ATTEMPTS
          + ") -> giving up and displaying error message.");
      displayNoConnectionError();
    } else {
      GWT.log("Number of reconnect attempts " + reconnectAttemptsCounter
          + " below or equal the max retry attempts (" + MAX_RECONNECT_ATTEMPTS
          + ") -> reconnecting.");
      attemptReconnect();
    }
  }

  private void attemptReconnect() {
    final Timer timer = new Timer() {

      @Override
      public void run() {
        initGwtEventService();
      }
    };

    timer.schedule(calculateNextReconnectDelay());
  }

  private void initGwtEventService() {
    final RemoteEventService remoteService = getRemoteService();
    remoteService.removeListeners();
    remoteService.removeUnlistenListeners(null);
    ...
    more stuff
    ...
    addUnlistenListener();
  }

  private int calculateNextReconnectDelay() {
    return ((int) Math.pow(reconnectAttemptsCounter, 2.0)) * RECONNECT_DELAY_BASIS_MILLIS;
  }

  private boolean maxReconnectAttemptsReached() {
    return reconnectAttemptsCounter > MAX_RECONNECT_ATTEMPTS;
  }

  private RemoteEventService getRemoteService() {
    return RemoteEventServiceFactory.getInstance().getRemoteEventService();
  }

  private void displayNoConnectionError() {
    ...
  }
}
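
To make the timing of the Reconnector concrete, here is a minimal GWT-free sketch that copies only its delay arithmetic and its give-up check. ReconnectDelayDemo and describe() are hypothetical names introduced for illustration; the two constants and the formula are taken from the class above.

```java
/** Hypothetical stand-alone copy of the Reconnector's delay arithmetic (no GWT needed). */
final class ReconnectDelayDemo {

  static final int MAX_RECONNECT_ATTEMPTS = 3;
  static final int RECONNECT_DELAY_BASIS_MILLIS = 1000;

  private ReconnectDelayDemo() {
  }

  /** Same formula as calculateNextReconnectDelay(): attempt squared times the basis. */
  static int delayMillis(int attempt) {
    return ((int) Math.pow(attempt, 2.0)) * RECONNECT_DELAY_BASIS_MILLIS;
  }

  /** Same check as maxReconnectAttemptsReached(), with the counter passed in. */
  static boolean maxReconnectAttemptsReached(int attempt) {
    return attempt > MAX_RECONNECT_ATTEMPTS;
  }

  /** Describes what the reconnector would do on the given call to reconnect(). */
  static String describe(int attempt) {
    return maxReconnectAttemptsReached(attempt)
        ? "attempt " + attempt + ": give up"
        : "attempt " + attempt + ": retry in " + delayMillis(attempt) + " ms";
  }
}
```

With these constants the delays are 1000 ms, 4000 ms and 9000 ms for attempts 1 to 3, and the fourth call gives up. The scheduled delays alone therefore add up to 14 seconds, so MAX_RECONNECT_ATTEMPTS and RECONNECT_DELAY_BASIS_MILLIS probably need to be raised for servers that take longer to come back.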

The reconnector often works, but not in all cases. It usually works if the 
application runs in a standalone Servlet container (we use Tomcat).
For production, however, the Servlet container often runs behind Apache or IIS. 
In this case the web server returns 503 (service unavailable) rather quickly as 
soon as the connector to the Servlet container detects that the container is 
down. This error is not handled anywhere in the application. There's no 
unlisten event and GWT doesn't invoke the UncaughtExceptionHandler. Hence, it 
goes unnoticed and no reconnection attempts are triggered.
For such cases the remedy might be something like the following (in addition to 
the reconnector or instead):
final Timer timer = new Timer() {

  @Override
  public void run() {
    final RemoteEventService remoteEventService = RemoteEventServiceFactory.getInstance()
        .getRemoteEventService();
    if (remoteEventService.isActive()) {
      schedule(10000);
    } else {
      ...
      // only display error message if used in combination with the reconnector
      ...
      // or attempt reconnecting if used instead of the unlisten events in the reconnector
      ...
    }
  }
};
timer.schedule(10000);

Original comment by marcel@frightanic.com on 8 Nov 2012 at 9:09