AbsaOSS / hyperdrive

Extensible streaming ingestion pipeline on top of Apache Spark
Apache License 2.0

Refactor SparkIngestor to a configurable component #110

Closed (kevinwallimann closed this 4 years ago)

kevinwallimann commented 4 years ago

Currently, the SparkIngestor doesn't accept any configuration. Configuration is needed to support long-running jobs, so that the ingestor can choose between StreamingQuery.awaitTermination and StreamingQuery.processAllAvailable.
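To make the request concrete, here is a minimal sketch of how a termination method could be selected from configuration. The config key, the `TerminationMethod` type, the `IngestorSketch` object, and the default are all hypothetical and not taken from the hyperdrive code base; only the `StreamingQuery` calls are actual Spark API. It assumes spark-sql is on the classpath.

```scala
import org.apache.spark.sql.streaming.StreamingQuery

// Hypothetical termination modes; the names are illustrative only.
sealed trait TerminationMethod
case object AwaitTermination extends TerminationMethod
case object ProcessAllAvailable extends TerminationMethod

object TerminationMethod {
  // Hypothetical config key, assumed for this sketch.
  val Key = "ingestor.spark.termination.method"

  // Reads the termination method from a flat key/value config.
  // Falling back to ProcessAllAvailable when the key is absent is an assumption.
  def fromConfig(config: Map[String, String]): TerminationMethod =
    config.get(Key).map(_.trim.toLowerCase) match {
      case Some("awaittermination")           => AwaitTermination
      case Some("processallavailable") | None => ProcessAllAvailable
      case Some(other) =>
        throw new IllegalArgumentException(s"Unknown value '$other' for $Key")
    }
}

object IngestorSketch {
  // Waits on an already started query according to the chosen method.
  def finish(query: StreamingQuery, method: TerminationMethod): Unit = method match {
    case AwaitTermination =>
      // Blocks until the query is stopped or fails; suits long-running jobs.
      query.awaitTermination()
    case ProcessAllAvailable =>
      // Blocks until all data available at call time is processed, then stops.
      query.processAllAvailable()
      query.stop()
  }
}
```

A caller would start the streaming query as before and then invoke `IngestorSketch.finish(query, TerminationMethod.fromConfig(config))`.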

Additionally, it's currently not possible to pass configuration options to the SparkSession via the config file or command line.
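One possible shape for this, sketched under assumptions: entries in the job configuration that start with a dedicated prefix are forwarded to the `SparkSession` builder. The prefix `ingestor.spark.conf.` and the `SparkSessionSketch` object are hypothetical; `SparkSession.builder().config(key, value)` is the regular Spark API. The sketch does not address how such options interact with options passed to spark-submit.

```scala
import org.apache.spark.sql.SparkSession

object SparkSessionSketch {
  // Hypothetical prefix marking entries that should be handed to Spark.
  val ExtraConfPrefix = "ingestor.spark.conf."

  // Selects the prefixed entries and strips the prefix, so that
  // "ingestor.spark.conf.spark.sql.shuffle.partitions" becomes
  // "spark.sql.shuffle.partitions".
  def extraSparkConf(config: Map[String, String]): Map[String, String] =
    config.collect {
      case (key, value)
          if key.startsWith(ExtraConfPrefix) && key.length > ExtraConfPrefix.length =>
        key.stripPrefix(ExtraConfPrefix) -> value
    }

  // Applies every extracted option to the builder before creating the session.
  def buildSession(appName: String, config: Map[String, String]): SparkSession =
    extraSparkConf(config)
      .foldLeft(SparkSession.builder().appName(appName)) {
        case (builder, (key, value)) => builder.config(key, value)
      }
      .getOrCreate()
}
```

The same flat map could be filled from a properties file or from command-line arguments, so both input paths mentioned above would share one code path.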

Tasks

kevinwallimann commented 4 years ago

Closing the PR because extra options for Spark would be overwritten by options passed to spark-submit.