StreamState / terraform-k8s-configuration


fix persist to gcs and write to kafka #78

Closed danielhstahl closed 3 years ago

danielhstahl commented 3 years ago

Writing to an existing GCS folder (e.g., when restarting a streaming app) fails with pyspark.sql.utils.AnalysisException: path [] already exists. Writing to Kafka fails with pyspark.sql.utils.AnalysisException: Required attribute 'value' not found.

The second issue is resolved here: https://stackoverflow.com/a/46454104/5673118
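
For reference, a minimal sketch of how both errors might be avoided. It assumes the writes are batch-style DataFrame writes (for example inside a foreachBatch handler) and that the GCS output is Parquet; neither is confirmed here. The function names and parameters (persist_batch, publish_batch, gcs_path, brokers, topic) are hypothetical, not names from this repo.

```python
def to_kafka_value(df):
    """Pack every column of `df` into a single JSON string column named `value`."""
    # Spark's Kafka sink looks for a `value` column (string or binary);
    # without one it raises "Required attribute 'value' not found".
    return df.selectExpr("to_json(struct(*)) AS value")


def persist_batch(df, gcs_path):
    """Append a batch to `gcs_path` rather than failing if the path exists."""
    # The default save mode is "errorifexists"; "append" adds new files
    # next to whatever is already in the folder.
    # Assumption: Parquet output. Swap in the format actually used.
    df.write.mode("append").parquet(gcs_path)


def publish_batch(df, brokers, topic):
    """Reshape a batch for Kafka and write it to `topic`."""
    (
        to_kafka_value(df)
        .write.format("kafka")
        .option("kafka.bootstrap.servers", brokers)
        .option("topic", topic)
        .save()
    )
```

If the existing files should be replaced rather than kept on restart, mode("overwrite") would be the alternative to mode("append").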

danielhstahl commented 3 years ago

Now getting java.lang.ClassNotFoundException: org.apache.spark.sql.kafka010.KafkaBatchInputPartition when a topic is produced.