Closed: athulj closed this issue 4 years ago.
Thanks, @athulj! Well, after a deeper look, I think the core of the problem is that sparkSession.stop() is never called.
If we remove the System.exit(success), we would have to make sure that all errors are signaled using exceptions. Spark uses exceptions exclusively, but the option parsing code relies on return values. From my experience, about half of the errors are simple configuration mistakes, and those must also be caught somehow, e.g. when running a series of jobs from a shell script or a workflow framework.
To address the Spark session not being stopped, see 3a71762. I'll test this change by tomorrow. I'll also think about handling errors during option parsing differently, to make the System.exit() obsolete.
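To illustrate the pattern described above, here is a minimal sketch (hypothetical class and method names, not the actual code from 3a71762): the session is always closed in a finally block, runtime failures are left to propagate as exceptions, and System.exit() is used only for command-line option errors detected before the session is created.

```java
import org.apache.spark.sql.SparkSession;

public class CCIndexExportSketch {

    public static void main(String[] args) {
        // Exit early only for bad command-line options, before any Spark session exists.
        if (args.length < 2) {
            System.err.println("Usage: CCIndexExportSketch <input> <output>");
            System.exit(1);
        }

        SparkSession session = SparkSession.builder()
                .appName("CCIndexExportSketch")
                .getOrCreate();
        try {
            runJob(session, args[0], args[1]);
        } finally {
            // Always stop the session so the application unregisters cleanly on YARN.
            session.stop();
        }
        // No System.exit() after the job: returning normally signals success,
        // and failures propagate as exceptions through Spark's error handling.
    }

    private static void runJob(SparkSession session, String input, String output) {
        // Placeholder for the actual export logic; errors are thrown,
        // not converted into return codes.
        session.read().parquet(input).write().parquet(output);
    }
}
```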
Hi @athulj, I've successfully tested both CCIndexExport and CCIndexWarcExport using Spark on YARN. The potential errors should be fixed now: the Spark session is shut down properly, and System.exit(...) is only called when there are errors in the command-line options. Thanks for your contribution!
@sebastian-nagel Thanks for taking the time to fix this issue! Much appreciated.
I built the Spark app and tried to run it on AWS EMR. However, I encountered a strange error where all the WARC files were generated twice.
Upon investigating, I found that Spark does not play nice with System.exit() and will mark the job as FAILED after retrying it once.
https://github.com/apache/spark/blob/a829234df35c87c169425f2c79fd1963b5420888/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L238
Thus, it is better to remove the System.exit() call at the end.
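For illustration, a hedged sketch of the suggested change (placeholder names, not the actual cc-index-table code): instead of ending the driver with System.exit(), return normally on success and throw an exception on failure, so the application terminates through Spark's regular shutdown path and YARN does not retry the attempt.

```java
public final class ExitHandlingSketch {

    static void finishJob(boolean success) {
        // Before: System.exit(success ? 0 : 1);
        // An explicit exit inside the driver can pre-empt the ApplicationMaster's
        // normal unregistration, so YARN may mark the attempt as failed and
        // retry it, re-running the job and writing the output a second time.
        if (!success) {
            // Signal failure via an exception; it is reflected in the final
            // application status that scripts and workflow engines can check.
            throw new RuntimeException("Export job failed, see logs for details");
        }
        // On success, simply return and let the SparkSession shut down cleanly.
    }
}
```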