graylog-labs / graylog2-web-interface

[DEPRECATED]
https://www.graylog.org/

Loading quick values is crashing #1489

Closed nilroy closed 9 years ago

nilroy commented 9 years ago

When I try to load quick values for a field over a larger time range (5 days or more), graylog-web throws a 500 server error and then crashes. I have to restart the graylog-server and graylog-web services to recover. I never faced this issue in Graylog v1.0. I am running graylog-server and graylog-web version 1.1.2-4 on Ubuntu 14.04.

edmundoa commented 9 years ago

Hi,

Could you please provide more information about this issue? We need the Graylog server and web interface logs when the issue happens.

Thank you!

nilroy commented 9 years ago

Hi,

I shall try to get some logs when it happens.

Thanks

nilroy commented 9 years ago

Hi, I faced this issue again. It typically occurs when I search for logs from the last 30 days and, once the messages are loaded, try to load quick values for any field. There is nothing in the graylog-server log. In the graylog-web log I can see the errors below:

at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4$$anonfun$apply$5.apply(Action.scala:130) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at play.utils.Threads$.withContextClassLoader(Threads.scala:21) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:129) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at play.api.mvc.Action$$anonfun$apply$1$$anonfun$apply$4.apply(Action.scala:128) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at scala.Option.map(Option.scala:145) [org.scala-lang.scala-library-2.10.4.jar:na]
 at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:128) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at play.api.mvc.Action$$anonfun$apply$1.apply(Action.scala:121) [com.typesafe.play.play_2.10-2.3.9.jar:2.3.9]
 at play.api.libs.iteratee.Iteratee$$anonfun$mapM$1.apply(Iteratee.scala:483) [com.typesafe.play.play-iteratees_2.10-2.3.9.jar:2.3.9]
 at play.api.libs.iteratee.Iteratee$$anonfun$mapM$1.apply(Iteratee.scala:483) [com.typesafe.play.play-iteratees_2.10-2.3.9.jar:2.3.9]
 at play.api.libs.iteratee.Iteratee$$anonfun$flatMapM$1.apply(Iteratee.scala:519) [com.typesafe.play.play-iteratees_2.10-2.3.9.jar:2.3.9]
 at play.api.libs.iteratee.Iteratee$$anonfun$flatMapM$1.apply(Iteratee.scala:519) [com.typesafe.play.play-iteratees_2.10-2.3.9.jar:2.3.9]
 at play.api.libs.iteratee.Iteratee$$anonfun$flatMap$1$$anonfun$apply$14.apply(Iteratee.scala:496) [com.typesafe.play.play-iteratees_2.10-2.3.9.jar:2.3.9]
 at play.api.libs.iteratee.Iteratee$$anonfun$flatMap$1$$anonfun$apply$14.apply(Iteratee.scala:496) [com.typesafe.play.play-iteratees_2.10-2.3.9.jar:2.3.9]
 at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24) [org.scala-lang.scala-library-2.10.4.jar:na]
 at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24) [org.scala-lang.scala-library-2.10.4.jar:na]
 at akka.dispatch.TaskInvocation.run(AbstractDispatcher.scala:41) [com.typesafe.akka.akka-actor_2.10-2.3.5.jar:na]
 at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) [com.typesafe.akka.akka-actor_2.10-2.3.5.jar:na]
 at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) [org.scala-lang.scala-library-2.10.4.jar:na]
 at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) [org.scala-lang.scala-library-2.10.4.jar:na]
 at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) [org.scala-lang.scala-library-2.10.4.jar:na]
 at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) [org.scala-lang.scala-library-2.10.4.jar:na]

 2015-06-17T11:51:18.130Z - [ERROR] - from org.graylog2.restclient.lib.ApiClient in play-akka.actor.default-dispatcher-2
REST call failed
java.util.concurrent.TimeoutException: Idle connection timeout to /<x.x.x.x>:12900 of 60000 ms
    at com.ning.http.client.providers.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43) ~[com.ning.async-http-client-1.8.14.jar:na]
    at com.ning.http.client.providers.netty.timeout.IdleConnectionTimeoutTimerTask.run(IdleConnectionTimeoutTimerTask.java:54) ~[com.ning.async-http-client-1.8.14.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:556) ~[io.netty.netty-3.9.8.Final.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:632) ~[io.netty.netty-3.9.8.Final.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:369) ~[io.netty.netty-3.9.8.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[io.netty.netty-3.9.8.Final.jar:na]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_75]

nilroy commented 9 years ago

I was also surprised to see this error in the graylog-web log:

2015-06-17T12:48:45.076Z - [ERROR] - from org.graylog2.restclient.models.UserService in play-akka.actor.default-dispatcher-6
Unauthorized to load user antti
org.graylog2.restclient.lib.APIException: API call failed GET http://@x.x.x.x:12900/users/antti returned 401 Unauthorized body:

The odd thing is that I am not logged in as that user, and the actual user is not trying to access Graylog either.

nilroy commented 9 years ago

Another major problem is that if I run a very large query and then try to stop it, that is not possible. Once the query is issued it goes to Elasticsearch, and even after I dismiss the quick values the query keeps running in the background, errors appear in the web interface, and I can see timeout exceptions in the graylog-web log:


2015-06-18T06:55:24.787Z - [ERROR] - from org.graylog2.restclient.lib.ApiClient in play-akka.actor.default-dispatcher-34
REST call failed
java.util.concurrent.TimeoutException: Idle connection timeout to /x.x.x.x:12900 of 60000 ms
    at com.ning.http.client.providers.netty.timeout.TimeoutTimerTask.expire(TimeoutTimerTask.java:43) ~[com.ning.async-http-client-1.8.14.jar:na]
    at com.ning.http.client.providers.netty.timeout.IdleConnectionTimeoutTimerTask.run(IdleConnectionTimeoutTimerTask.java:54) ~[com.ning.async-http-client-1.8.14.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:556) ~[io.netty.netty-3.9.8.Final.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$HashedWheelBucket.expireTimeouts(HashedWheelTimer.java:632) ~[io.netty.netty-3.9.8.Final.jar:na]
    at org.jboss.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:369) ~[io.netty.netty-3.9.8.Final.jar:na]
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) ~[io.netty.netty-3.9.8.Final.jar:na]
    at java.lang.Thread.run(Thread.java:745) ~[na:1.7.0_75]


2015-06-18T07:00:36.291Z - [ERROR] - from org.graylog2.restclient.lib.ApiClient in pool-71-thread-1
API call failed to execute.
java.util.concurrent.ExecutionException: java.util.concurrent.TimeoutException: No response received after 5000
    at com.ning.http.client.providers.netty.NettyResponseFuture.get(NettyResponseFuture.java:266) ~[com.ning.async-http-client-1.8.14.jar:na]
    at org.graylog2.restclient.lib.ApiClientImpl$ApiRequestBuilder.executeOnAll(ApiClientImpl.java:605) ~[org.graylog2.graylog2-rest-client--1.1.2-1.1.2.jar:na]
    at controllers.api.MetricsController$PollingJob.run(MetricsController.java:117) [graylog-web-interface.graylog-web-interface-1.1.2.jar:1.1.2]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) [na:1.7.0_75]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304) [na:1.7.0_75]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178) [na:1.7.0_75]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293) [na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) [na:1.7.0_75]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) [na:1.7.0_75]
    at java.lang.Thread.run(Thread.java:745) [na:1.7.0_75]
Caused by: java.util.concurrent.TimeoutException: No response received after 5000
    at com.ning.http.client.providers.netty.NettyResponseFuture.get(NettyResponseFuture.java:260) ~[com.ning.async-http-client-1.8.14.jar:na]
    ... 9 common frames omitted


Any idea?

kroepke commented 9 years ago

This indicates that your Elasticsearch cluster is not fast enough, so the requests time out (60 seconds is the timeout for searches). Please check that your hardware is sized adequately.