apache / incubator-livy

Apache Livy is an open source REST interface for interacting with Apache Spark from anywhere.
https://livy.apache.org/
Apache License 2.0

The SparkStreaming operator fails to execute #413

Closed zenvzenv closed 1 month ago

zenvzenv commented 1 year ago

Kafka as both the Spark Streaming input and output

  1. Use spark.readStream.format("kafka") to read data from Kafka and decode the binary data to strings.
  2. Use df.map(_.toSeq.foldLeft("")(_ + separator + _)).writeStream.format("kafka") to write the data back to Kafka.
  3. If the write to Kafka fails once, the streaming job keeps failing afterwards no matter how I change the Kafka topic, reporting ArrayIndexOutOfBoundsException: 1. If I write only to the console, there is no error.
  4. If I run the same code snippet directly in spark-shell, without Livy, the result is the same as in step 3.
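
For readers trying to reproduce this, the steps above can be sketched roughly as follows. This is a minimal sketch, not the reporter's actual code. The broker address, topic names, checkpoint path, separator value, and object name are all assumptions made for illustration. It also assumes the spark-sql-kafka-0-10 connector is on the classpath.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical reproduction sketch; names and addresses below are placeholders.
object KafkaRoundTrip {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("kafka-round-trip").getOrCreate()
    import spark.implicits._ // provides the Encoder[String] needed by map

    val separator = "," // assumed value; the issue does not say what it is

    // Step 1: read from Kafka and decode the binary value column to a string.
    val input = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092") // assumed broker
      .option("subscribe", "input-topic")                   // assumed topic
      .load()
      .selectExpr("CAST(value AS STRING) AS value")

    // Step 2: join every column of each row into one string.
    // mkString is used here in place of the foldLeft from the issue;
    // unlike foldLeft(""), it does not emit a leading separator.
    val joined = input.map(row => row.toSeq.map(v => String.valueOf(v)).mkString(separator))

    // A Dataset[String] has a single column named "value",
    // which is the column the Kafka sink writes as the record value.
    val query = joined.writeStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")                 // assumed broker
      .option("topic", "output-topic")                                     // assumed topic
      .option("checkpointLocation", "/tmp/checkpoints/kafka-round-trip")   // assumed path
      .start()

    query.awaitTermination()
  }
}
```

To compare with the console case from step 3, replace the Kafka sink options with .format("console") and keep the rest unchanged.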