knockdata / spark-highcharts

Support Highcharts in Apache Zeppelin
Apache License 2.0

spark-highcharts not plotting with structured streaming DataFrame #29

Open alzuabi opened 7 years ago

alzuabi commented 7 years ago

Hi, I'm trying to use spark-highcharts with a Structured Streaming DataFrame as shown below (Spark 2.1.0, Zeppelin 0.7):

import com.knockdata.spark.highcharts._
import com.knockdata.spark.highcharts.model._
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._  // expr, sum, col

val spark = SparkSession
  .builder()
  .appName("Spark structured streaming Kafka example")
  .master("yarn")
  .getOrCreate()

// Read the "st" topic from Kafka as a streaming DataFrame
val inputstream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "n11.hdp.com:6667,n12.hdp.com:6667,n13.hdp.com:6667,n10.hdp.com:6667,n9.hdp.com:6667")
  .option("subscribe", "st")
  .load()

spark.conf.set("spark.sql.streaming.checkpointLocation", "checkpoint")

// Split the comma-separated message value into columns, then aggregate per GSM
val ValueString = inputstream.selectExpr("CAST(value AS STRING)").as[String]
  .select(
    expr("(split(value, ','))[1]").cast("string").as("GSM"),
    expr("(split(value, ','))[7]").cast("double").as("Duration"),
    expr("(split(value, ','))[10]").cast("double").as("DataUpLink1"),
    expr("(split(value, ','))[11]").cast("double").as("DataDownLink1")
  )
  .filter("GSM is not null and DataUpLink1 is not null and DataDownLink1 is not null and Duration is not null")
  .groupBy("GSM")
  .agg(sum("DataUpLink1") as "upload", sum("DataDownLink1") as "download", sum("Duration") as "duration")

val query = highcharts(
  ValueString.seriesCol("GSM")
    .series("y" -> "download", "x" -> "duration")
    .orderBy(col("GSM")), z, "complete")
StreamingChart(z)
query.awaitTermination()

There are no errors, but nothing is plotted. The output is:

import com.knockdata.spark.highcharts._
import com.knockdata.spark.highcharts.model._
import org.apache.spark.sql.SparkSession
spark: org.apache.spark.sql.SparkSession = org.apache.spark.sql.SparkSession@67944ade
inputstream: org.apache.spark.sql.DataFrame = [key: binary, value: binary ... 5 more fields]
ValueString: org.apache.spark.sql.DataFrame = [GSM: string, upload: double ... 2 more fields]
query: org.apache.spark.sql.streaming.StreamingQuery = org.apache.spark.sql.execution.streaming.StreamingQueryWrapper@249386bd
null
batchId: 0, chartId: 0db9bc26_675d_4a0a_9e41_5760277ec480, chartParagraphId: 20170623-210348_306834112
run 20170623-210348_306834112
batchId: 1, chartId: 0db9bc26_675d_4a0a_9e41_5760277ec480, chartParagraphId: 20170623-210348_306834112
run 20170623-210348_306834112

rockie-yang commented 7 years ago

Did you put StreamingChart(z) in the next paragraph?
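
For illustration only, a rough sketch of what that split might look like in a Zeppelin note. It reuses just the calls and names from the question (ValueString, highcharts, StreamingChart, z) and assumes the earlier paragraph that defines ValueString has already run. query.awaitTermination() is left out here on the assumption that a blocking call would keep the first paragraph from finishing; the thread does not confirm whether that matters.

```scala
// --- Zeppelin paragraph 1: start the streaming query ---
// ValueString is the aggregated streaming DataFrame defined in the question.
val query = highcharts(
  ValueString.seriesCol("GSM")
    .series("y" -> "download", "x" -> "duration")
    .orderBy(col("GSM")), z, "complete")

// --- Zeppelin paragraph 2: a separate paragraph that only renders the chart ---
StreamingChart(z)
```

The comment lines mark paragraph boundaries: each part would be pasted into its own paragraph and run in order, rather than run as a single block.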