tomaswolf / gerrit-gitblit-plugin

Integrates standard GitBlit (currently 1.7.1) as a repository viewer in Gerrit
Apache License 2.0
22 stars 5 forks source link

Infinite loop in wicket #28

Open tsuna opened 7 years ago

tsuna commented 7 years ago

Since we upgraded to Gerrit 2.13.5, we regularly have HTTP threads stuck in an infinite loop somewhere in wicket. We're using 2.13.171.2 (the last release at this time).

The stack trace is always exactly the same:

"HTTP-2910" #2910 prio=5 os_prio=0 tid=0x00007f131c01a000 nid=0x20c runnable [0x00007f1503bf8000]
   java.lang.Thread.State: RUNNABLE
    at org.apache.wicket.MarkupContainer.getId(MarkupContainer.java:1134)
    at org.apache.wicket.MarkupContainer.children_indexOf(MarkupContainer.java:1234)
    at org.apache.wicket.MarkupContainer.put(MarkupContainer.java:1400)
    at org.apache.wicket.MarkupContainer.add(MarkupContainer.java:143)
    at org.apache.wicket.markup.repeater.RefreshingView.addItems(RefreshingView.java:191)
    at org.apache.wicket.markup.repeater.RefreshingView.onPopulate(RefreshingView.java:98)
    at org.apache.wicket.markup.repeater.AbstractRepeater.onBeforeRender(AbstractRepeater.java:131)
    at org.apache.wicket.markup.repeater.AbstractPageableView.onBeforeRender(AbstractPageableView.java:121)
    at org.apache.wicket.Component.internalBeforeRender(Component.java:1069)
    at org.apache.wicket.Component.beforeRender(Component.java:1103)
    at org.apache.wicket.MarkupContainer.onBeforeRenderChildren(MarkupContainer.java:1777)
    at org.apache.wicket.Component.onBeforeRender(Component.java:4001)
    at org.apache.wicket.Page.onBeforeRender(Page.java:1555)
    at com.gitblit.wicket.pages.BasePage.onBeforeRender(BasePage.java:154)
    at com.gitblit.wicket.pages.RepositoryPage.onBeforeRender(RepositoryPage.java:680)
    at org.apache.wicket.Component.internalBeforeRender(Component.java:1069)
    at org.apache.wicket.Component.beforeRender(Component.java:1103)
    at org.apache.wicket.Component.prepareForRender(Component.java:2297)
    at org.apache.wicket.Page.prepareForRender(Page.java:1545)
    at org.apache.wicket.Component.prepareForRender(Component.java:2329)
    at org.apache.wicket.Page.renderPage(Page.java:912)
    at org.apache.wicket.request.target.component.BookmarkablePageRequestTarget.respond(BookmarkablePageRequestTarget.java:261)
    at org.apache.wicket.request.AbstractRequestCycleProcessor.respond(AbstractRequestCycleProcessor.java:105)
    at org.apache.wicket.RequestCycle.processEventsAndRespond(RequestCycle.java:1287)
    at org.apache.wicket.RequestCycle.step(RequestCycle.java:1358)
    at org.apache.wicket.RequestCycle.steps(RequestCycle.java:1465)
    at org.apache.wicket.RequestCycle.request(RequestCycle.java:545)
    at org.apache.wicket.protocol.http.WicketFilter.doGet(WicketFilter.java:486)
    at org.apache.wicket.protocol.http.WicketFilter.doFilter(WicketFilter.java:319)
    at com.googlesource.gerrit.plugins.gitblit.GerritWicketFilter.doFilter(GerritWicketFilter.java:112)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
    at com.google.gerrit.httpd.plugins.HttpPluginServlet.service(HttpPluginServlet.java:239)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:729)
    at com.google.inject.servlet.ServletDefinition.doServiceImpl(ServletDefinition.java:286)
    at com.google.inject.servlet.ServletDefinition.doService(ServletDefinition.java:276)
    at com.google.inject.servlet.ServletDefinition.service(ServletDefinition.java:181)
    at com.google.inject.servlet.ManagedServletPipeline.service(ManagedServletPipeline.java:91)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:85)
    at com.google.gerrit.httpd.GetUserFilter.doFilter(GetUserFilter.java:82)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.gwtexpui.server.CacheControlFilter.doFilter(CacheControlFilter.java:73)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.gerrit.httpd.RunAsFilter.doFilter(RunAsFilter.java:122)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.gerrit.httpd.RequestMetricsFilter.doFilter(RequestMetricsFilter.java:60)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.gerrit.httpd.AllRequestFilter$FilterProxy$1.doFilter(AllRequestFilter.java:136)
    at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:201)
    at net.bull.javamelody.MonitoringFilter.doFilter(MonitoringFilter.java:178)
    at com.googlesource.gerrit.plugins.javamelody.GerritMonitoringFilter.doFilter(GerritMonitoringFilter.java:65)
    at com.google.gerrit.httpd.AllRequestFilter$FilterProxy$1.doFilter(AllRequestFilter.java:132)
    at com.google.gerrit.httpd.AllRequestFilter$FilterProxy.doFilter(AllRequestFilter.java:105)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.gerrit.httpd.RequestContextFilter.doFilter(RequestContextFilter.java:75)
    at com.google.inject.servlet.FilterChainInvocation.doFilter(FilterChainInvocation.java:82)
    at com.google.inject.servlet.ManagedFilterPipeline.dispatch(ManagedFilterPipeline.java:120)
    at com.google.inject.servlet.GuiceFilter.doFilter(GuiceFilter.java:135)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1652)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:585)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.handler.RequestLogHandler.handle(RequestLogHandler.java:95)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
    at org.eclipse.jetty.server.Server.handle(Server.java:499)
    at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:311)
    at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
    at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
    at java.lang.Thread.run(Thread.java:745)

   Locked ownable synchronizers:
    - <0x0000000115a1c060> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
    - <0x0000000115a1c2c0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
tomaswolf commented 7 years ago

I cannot reproduce this. Can you give precise steps on how to reproduce, which page(s) are affected? Is there anything in the HTTP logs?

tsuna commented 7 years ago

Sorry I haven't had time to collect meaningful debug info. We disabled gitblit for 9 days and the problem didn't reoccur. I'm 90% certain the problem comes from this plugin. We have a Google Search Appliance that crawls and indexes our internal network, and when it crawls gitblit that basically takes down Gerrit. It's not easy for me to track it down to a particular request. We've re-enabled gitblit, I'm going to wait for the problem to happen again and monitor the situation to try to find a clue that will steer us in the right direction.

tsuna commented 7 years ago

Sure enough, a few hours after re-enabling the plugin, the GSA crawled a whole bunch of gitblit URIs and this took Gerrit down again. I'm going through the logs now and will try to replay some queries to see if I can reproduce the issue manually.

tsuna commented 7 years ago

Hmm, we may be chasing two different problems. I added a rule in our haproxy instance that fronts Gerrit to detect and block requests from the GSA, and Gerrit immediately went back to normal. When I filed this bug originally, the situation was different: there were 2 threads that were stuck in an infinite loop. I confirmed this by running jstack on Gerrit repeatedly and seeing that the same two threads were always running with always exactly the same call stack. In today's case, I did not see this particular call stack, and merely blocking the GSA made Gerrit recover immediately.

We've had the GSA for a while, so it's not anything new. Typically when it's crawling gitblit it's issuing around 1 request per second to 1 request every 5-6 second, although trawling through the request logs a bit, I've seen spikes up to 92 requests per second. It's nothing really crazy anyway for a 32-core server with 256GB of RAM and 2-HDD RAID1.

tomaswolf commented 7 years ago

So it's not necessarily an endless loop but just that the webcrawler overloads Gitblit. The only recent change was about those StackOverflow errors... (issue #21 and then #26). Possibly the fix for #21 is not efficient enough... what URLs were those 92 requests? If I knew which pages were accessed, I might try to run some timing experiments locally, and if I find evidence for poor performance, I could try to improve that.

tsuna commented 7 years ago

There was definitely an issue with an endless loop. As I mentioned in my previous comment, I feel like we're tracking two separate issues. One is the crawler overloading Gerrit (and I don't know why this is a problem all of a sudden), the other is the endless loop I first mentioned when I opened this issue. I'm pretty certain it was an endless loop because Gerrit was running at 200% CPU constantly and jstack was showing the same 2 threads running all the time with the exact same stack trace mentioned above.