gearpump / gearpump

Lightweight real-time big data streaming engine over Akka
https://gearpump.github.io/gearpump/
Apache License 2.0

Large application jar causes master to exit #1204

Closed kkasravi closed 9 years ago

kkasravi commented 9 years ago

The FileServer exhausts the Java heap space because it attempts to buffer the entire application jar into a byte array.

kkasravi commented 9 years ago

From the log:

2015-07-24 01:37:10,132 INFO AppManager: Application Manager started. Ready for application submission...
2015-07-24 01:37:10,840 INFO FileServer: FileServer bound on port: 36336
2015-07-24 01:37:10,840 INFO HttpListener: Bound to ip-10-10-10-217.us-west-2.compute.internal/10.10.10.217:0
2015-07-24 01:37:11,106 INFO ClusterSingletonProxy: Singleton identified: akka://master/user/singleton/masterwatcher/master
2015-07-24 01:37:11,109 INFO PriorityScheduler: Worker 887953828 added to the scheduler
2015-07-24 01:37:11,110 INFO Master: Register Worker 887953828 from ip-10-10-10-130.us-west-2.compute.internal ....
2015-07-24 01:37:11,110 INFO Master: Register Worker 804379173 from ip-10-10-10-37.us-west-2.compute.internal ....
2015-07-24 01:37:11,110 INFO PriorityScheduler: Worker 804379173 added to the scheduler
2015-07-24 01:37:11,132 INFO PriorityScheduler: ResourceUpdate(Actor[akka.tcp://27cb1fd7-d0a7-48db-8f7b-e85f7ac41caf@ip-10-10-10-37.us-west-2.compute.internal:58971/user/Worker27cb1fd7-d0a7-48db-8f7b-e85f7ac41caf#2060502574],804379173,Resource(1000))...
2015-07-24 01:37:11,163 INFO PriorityScheduler: ResourceUpdate(Actor[akka.tcp://708e8a1d-3f3c-469c-be84-9a18b0156332@ip-10-10-10-130.us-west-2.compute.internal:46323/user/Worker708e8a1d-3f3c-469c-be84-9a18b0156332#-1355333931],887953828,Resource(1000))...
2015-07-24 01:40:24,291 INFO Master$: Triggering shutdown hook....
2015-07-24 01:40:24,291 ERROR ActorSystemImpl: Uncaught error from thread [master-akka.actor.default-dispatcher-3] shutting down JVM since 'akka.jvm-exit-on-fatal-error' is enabled
java.lang.OutOfMemoryError: Java heap space
    at spray.http.HttpData$NonEmpty.toByteArray(HttpData.scala:209)
    at org.apache.gearpump.util.FileServer$$anonfun$listen$1$$anonfun$3.apply(FileServer.scala:91)
    at org.apache.gearpump.util.FileServer$$anonfun$listen$1$$anonfun$3.apply(FileServer.scala:89)
    at scala.util.Try$.apply(Try.scala:191)
    at org.apache.gearpump.util.FileServer$$anonfun$listen$1.applyOrElse(FileServer.scala:89)
    at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
    at org.apache.gearpump.util.FileServer.aroundReceive(FileServer.scala:40)
    at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
    at akka.actor.ActorCell.invoke(ActorCell.scala:487)
    at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
    at akka.dispatch.Mailbox.run(Mailbox.scala:220)
    at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
    at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
    at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
    at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
    at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
2015-07-24 01:40:24,296 INFO Cluster(akka://master): Cluster Node [akka.tcp://master@ip-10-10-10-217.us-west-2.compute.internal:3000] - Marked address [akka.tcp://master@ip-10-10-10-217.us-west-2.compute.internal:3000] as [Leaving]
2015-07-24 01:40:24,297 INFO Cluster(akka://master): Cluster Node [akka.tcp://master@ip-10-10-10-217.us-west-2.compute.internal:3000] - Marking node [akka.tcp://master@ip-10-10-10-217.us-west-2.compute.internal:3000] as [Down]
2015-07-24 01:40:27,306 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
2015-07-24 01:40:27,308 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
2015-07-24 01:40:27,345 INFO RemoteActorRefProvider$RemotingTerminator: Remoting shut down.

kkasravi commented 9 years ago

See this Stack Overflow question on streaming large files with spray: http://stackoverflow.com/questions/28157189/sending-large-files-with-spray
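
To illustrate the streaming idea in general terms, here is a minimal sketch that writes an incoming stream to disk in fixed-size chunks instead of collecting it into one byte array. It uses plain `java.io`/`java.nio` as a stand-in and is not the spray API or the actual FileServer code; the class name `ChunkedCopy`, the method `copyInChunks`, and the chunk size are assumptions made for the example.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper, not part of Gearpump or spray.
public class ChunkedCopy {

    /**
     * Copies everything from `in` to `target`, holding at most
     * `chunkSize` bytes in memory at any time.
     * Returns the total number of bytes written.
     */
    public static long copyInChunks(InputStream in, Path target, int chunkSize)
            throws IOException {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        try (OutputStream out = Files.newOutputStream(target)) {
            int read;
            // read() returns -1 once the stream is exhausted
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

With this shape, heap usage is bounded by the chunk size rather than by the size of the jar, so a 60M upload would need no more memory than a 1M one.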

kkasravi commented 9 years ago

Note: the hdfs streaming jar has grown to about 60M because it now needs to include external_kafka. The prior size was around 32M.

kkasravi commented 9 years ago

The FileServer portion of the master should run in a separate process so that its memory issues cannot affect Master stability.
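
As a rough sketch of what process isolation could look like, the snippet below builds a command line for a child JVM with its own heap limit and starts it with `ProcessBuilder`. Everything here is hypothetical: the class `FileServerLauncher`, the entry point name `example.FileServerMain`, and the heap value are placeholders, not existing Gearpump classes or settings.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher, not part of Gearpump.
public class FileServerLauncher {

    /** Builds the command for a child JVM with a separate -Xmx limit. */
    public static List<String> buildCommand(String mainClass, String maxHeap) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> cmd = new ArrayList<>();
        cmd.add(javaBin);
        cmd.add("-Xmx" + maxHeap);                      // heap cap for the child only
        cmd.add("-cp");
        cmd.add(System.getProperty("java.class.path")); // reuse the parent's classpath
        cmd.add(mainClass);                             // assumed entry point
        return cmd;
    }

    /** Starts the child process; an OutOfMemoryError there would not take down the parent JVM. */
    public static Process launch(String mainClass, String maxHeap) throws IOException {
        return new ProcessBuilder(buildCommand(mainClass, maxHeap))
                .inheritIO()
                .start();
    }
}
```

A call such as `FileServerLauncher.launch("example.FileServerMain", "256m")` would run the file server in its own JVM, so exhausting that heap would end only the child process.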

clockfly commented 9 years ago

Duplicate of #1127