LucaCanali / sparkMeasure

This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination of Spark metrics, making it a practical choice for both developers and data engineers.
Apache License 2.0

Dropping SparkListenerEvent because no remaining room in event queue #7

Closed by bteeuwen 6 years ago

bteeuwen commented 6 years ago

I launched sparkMeasure in a large job. Immediately I got:

18/02/07 08:21:56 ERROR org.apache.spark.scheduler.LiveListenerBus: Dropping SparkListenerEvent because no remaining room in event queue. This likely means one of the SparkListeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler.
18/02/07 08:21:56 WARN org.apache.spark.scheduler.LiveListenerBus: Dropped 1 SparkListenerEvents since Thu Jan 01 00:00:00 UTC 1970
18/02/07 08:22:56 WARN org.apache.spark.scheduler.LiveListenerBus: Dropped 13971 SparkListenerEvents since Wed Feb 07 08:21:56 UTC 2018
18/02/07 08:23:51 ERROR org.apache.spark.network.server.TransportRequestHandler: Error opening block StreamChunkId{streamId=1999850815777, chunkIndex=0} for request from /10.205.151.192:37514
org.apache.spark.storage.BlockNotFoundException: Block broadcast_32_piece0 not found

The job does continue, but sparkMeasure seems to be overloading the listener bus. I'll try --conf spark.scheduler.listenerbus.eventqueue.size=100000.

Did you already encounter this somewhere?
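For readers who want to try the same workaround, here is a minimal sketch of building the spark-submit argument mentioned above. The helper name `listener_queue_conf` is hypothetical. The version cutoff is an assumption based on the Spark configuration docs as I recall them: `spark.scheduler.listenerbus.eventqueue.size` before Spark 2.3, and `spark.scheduler.listenerbus.eventqueue.capacity` from 2.3 onward. Please check the documentation for the Spark version you actually run.

```python
from typing import List


def listener_queue_conf(spark_version: str, capacity: int) -> List[str]:
    """Return spark-submit arguments that enlarge the listener bus event queue.

    Hypothetical helper. The property name per version is an assumption
    taken from the Spark configuration docs, not from this thread.
    """
    major, minor = (int(part) for part in spark_version.split(".")[:2])
    if (major, minor) < (2, 3):
        # Name used in the original report above.
        key = "spark.scheduler.listenerbus.eventqueue.size"
    else:
        # Assumed replacement name for Spark 2.3 and later.
        key = "spark.scheduler.listenerbus.eventqueue.capacity"
    return ["--conf", f"{key}={capacity}"]


if __name__ == "__main__":
    # Build the argument pair for a pre-2.3 Spark version.
    print(" ".join(listener_queue_conf("2.2.1", 100000)))
```

A larger queue only buffers more events on the driver. It costs driver memory and does not make a slow listener any faster, so it is a mitigation rather than a fix.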

LucaCanali commented 6 years ago

Thanks for reporting this. I have not encountered this issue so far. I'd be interested to know more details if you drill down further on this.

sgupta-tech commented 2 years ago

22/09/05 08:44:49 INFO$: writing output...
22/09/05 08:46:03 ERROR AsyncEventQueue: Dropping event from queue appStatus. This likely means one of the listeners is too slow and cannot keep up with the rate at which tasks are being started by the scheduler.
22/09/05 08:46:03 WARN AsyncEventQueue: Dropped 1 events from appStatus since Thu Jan 01 00:00:00 UTC 1970.
22/09/05 08:46:06 ERROR TaskSetManager: Total size of serialized results of 14548 tasks (1024.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
22/09/05 08:46:06 ERROR TaskSetManager: Total size of serialized results of 14549 tasks (1024.1 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
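To put numbers on how badly a listener is falling behind, the "Dropped N ... since ..." warnings in both log excerpts in this thread can be turned into a rough drop rate. The following is a sketch only. The function name `drop_reports` and the regular expression are my own, written against the two message formats quoted here. It assumes the log timestamp and the "since" timestamp are in the same time zone, and that English month and day names are in use.

```python
import re
from datetime import datetime

# Matches both message shapes quoted in this thread:
#   "... WARN ...LiveListenerBus: Dropped N SparkListenerEvents since <date>"
#   "... WARN AsyncEventQueue: Dropped N events from <queue> since <date>"
DROP_RE = re.compile(
    r"(?P<ts>\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) WARN "
    r"\S*?(?:LiveListenerBus|AsyncEventQueue): "
    r"Dropped (?P<count>\d+) (?:SparkListenerEvents|events from (?P<queue>\w+)) "
    r"since (?P<since>\w{3} \w{3} \d{2} \d{2}:\d{2}:\d{2}) \w+ (?P<year>\d{4})"
)


def drop_reports(log_text):
    """Extract dropped-event reports and an approximate drop rate from driver logs."""
    reports = []
    for m in DROP_RE.finditer(log_text):
        logged_at = datetime.strptime(m["ts"], "%y/%m/%d %H:%M:%S")
        since = datetime.strptime(
            f"{m['since']} {m['year']}", "%a %b %d %H:%M:%S %Y"
        )
        count = int(m["count"])
        # A "since" date in 1970 is the epoch, i.e. no earlier report exists,
        # so there is no meaningful window to compute a rate over.
        window = None if since.year == 1970 else (logged_at - since).total_seconds()
        rate = count / window if window else None
        reports.append(
            {
                "queue": m["queue"],  # None for the older LiveListenerBus format
                "count": count,
                "window_s": window,
                "events_per_s": rate,
            }
        )
    return reports


if __name__ == "__main__":
    sample = (
        "18/02/07 08:22:56 WARN org.apache.spark.scheduler.LiveListenerBus: "
        "Dropped 13971 SparkListenerEvents since Wed Feb 07 08:21:56 UTC 2018"
    )
    for report in drop_reports(sample):
        print(report)
```

Applied to the first report in this thread, 13971 events were dropped in the 60 seconds between 08:21:56 and 08:22:56, which is roughly 233 events per second.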