OpenTSDB / opentsdb

A scalable, distributed Time Series Database.
http://opentsdb.net
GNU Lesser General Public License v2.1
5k stars 1.25k forks source link

Skipping exceptions in RegionClient.exceptionCaught #2293

Open vrajesh1989 opened 11 months ago

vrajesh1989 commented 11 months ago

Hello,

We are using Opentsdb 2.4.1 and are hitting CallbackOverflowErrors and StackOverflowErrors in RegionClient.exceptionCaught. We noticed there recent patches available in master branch and are planning to cherry pick. https://github.com/OpenTSDB/opentsdb/issues/839 https://github.com/OpenTSDB/opentsdb/issues/334 But right now we are ignoring these errors in code as follows, to avoid cleaning up of inflight RPCs to the same region server handled by the RegionClient. Could you help commenting if this is a good stop gap solution? Do you see any side effects?

public void exceptionCaught(final ChannelHandlerContext ctx,
                              final ExceptionEvent event) {
    final Throwable e = event.getCause();
    final Channel c = event.getChannel();

    if (e instanceof RejectedExecutionException) {
      LOG.warn("RPC rejected by the executor,"
               + " ignore this if we're shutting down", e);
    } else {
      LOG.error("Unexpected exception from downstream on " + c, e);
    }
    updateExceptionCounter(e.getClass());

    if(e.toString().contains("StackOverflowError") || e.toString().contains("CallbackOverflowError")){
      LOG.error("StackOverflowError, CallbackOverflowError thus returning error without closing channel" + e);
      return;
    }

    if (c.isOpen()) {
      Channels.close(c);  // Will trigger channelClosed(), which will cleanup()
    } else {              // else: presumably a connection timeout.
      cleanup(c);         // => need to cleanup() from here directly.
    }
  }

We are also planning to ignore ClosedChannelExceptions that we get in this method. From https://github.com/netty/netty/issues/724, learnt that Netty 4.0.0 no longer throws these exceptions in exceptionCaught, so we believe ignoring them would be equivalent, instead of an upgrade to Netty 4.0.0. Any thoughts on this? Thanks.

vrajesh1989 commented 11 months ago

@manolama @tsuna Could you please help suggest? Thanks in advance.