Ronitspark / spark_issues

A repository to list all Spark issues

Spark and Kafka integration issue - Help wanted #1

Open Ronitspark opened 5 years ago

Ronitspark commented 5 years ago

Hi,

Whenever I write this code in Eclipse, I get the error shown below. I am using Scala 2.11, and my Maven POM has the following entries:

<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.spark.stream</groupId>
  <artifactId>maven-demo</artifactId>
  <version>0.0.1-SNAPSHOT</version>

  <dependencies>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive_2.12</artifactId>
      <version>2.4.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-hive-thriftserver_2.12</artifactId>
      <version>2.4.0</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.12</artifactId>
      <version>2.4.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.11</artifactId>
      <version>2.1.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming-kafka-0-10_2.12</artifactId>
      <version>2.4.3</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-sql_2.12</artifactId>
      <version>2.4.0</version>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.8.0</version>
        <configuration>
          <source>${java.version}</source>
          <target>${java.version}</target>
        </configuration>
      </plugin>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>build-helper-maven-plugin</artifactId>
        <version>3.0.0</version>
        <executions>
          <execution>
            <id>add-source</id>
            <phase>generate-sources</phase>
            <goals><goal>add-source</goal></goals>
            <configuration>
              <sources><source>src/main/scala</source></sources>
            </configuration>
          </execution>
          <execution>
            <id>add-test-source</id>
            <phase>generate-test-sources</phase>
            <goals><goal>add-test-source</goal></goals>
            <configuration>
              <sources><source>src/test/scala</source></sources>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <artifactId>maven-assembly-plugin</artifactId>
        <configuration>
          <archive>
            <manifest>
              <mainClass>sample_package.sampleapp</mainClass>
            </manifest>
          </archive>
          <descriptorRefs>
            <descriptorRef>jar-with-dependencies</descriptorRef>
          </descriptorRefs>
        </configuration>
        <executions>
          <execution>
            <id>make-assembly</id>
            <phase>package</phase>
            <goals><goal>single</goal></goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>
</project>
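One thing worth flagging: these dependencies mix Scala binary versions (spark-streaming_2.11 at 2.1.1 next to _2.12 artifacts at 2.4.x) while the project itself compiles with Scala 2.11. Spark artifacts built for different Scala binary versions cannot share a classpath, and this mismatch alone typically breaks the build in Eclipse. A consistent set, assuming the project stays on Scala 2.11 and Spark 2.4.0 (both versions are just one possible choice), would look like:

<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.11</artifactId>
  <version>2.4.0</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-sql_2.11</artifactId>
  <version>2.4.0</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>2.4.0</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming-kafka-0-10_2.11</artifactId>
  <version>2.4.0</version>
</dependency>

The same applies to the spark-hive artifacts if they are actually used: every _2.12 suffix would need to become _2.11 (or the whole project would need to move to Scala 2.12).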

Below is my code:

package com.spark.stream

// scalastyle:off println

import org.apache.kafka.clients.consumer.{ConsumerConfig, ConsumerRecord}
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.sql.SparkSession
import org.apache.spark.streaming._
import org.apache.spark.streaming.kafka010._
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

object wordstream {
  def main(args: Array[String]): Unit = {
    if (args.length < 3) {
      System.err.println(
        """
          |Usage: DirectKafkaWordCount <brokers> <groupId> <topics>
          |  <brokers> is a list of one or more Kafka brokers
          |  <groupId> is a consumer group name to consume from topics
          |  <topics> is a list of one or more kafka topics to consume from
          |""".stripMargin)
      System.exit(1)
    }

    // StreamingExamples.setStreamingLogLevels()

    val Array(brokers, groupId, topics) = args

    // Create context with 2 second batch interval. The StreamingContext is
    // built from the SparkSession's own SparkContext: constructing it from a
    // fresh SparkConf while a session is already active would try to start a
    // second SparkContext, which Spark does not allow.
    val spark = SparkSession
      .builder
      .appName("wd4")
      .master("local")
      .getOrCreate()

    val ssc = new StreamingContext(spark.sparkContext, Seconds(2))

    // Create direct kafka stream with brokers and topics
    val topicsSet = topics.split(",").toSet

    val kafkaParams = Map[String, Object](
      ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG -> brokers,
      ConsumerConfig.GROUP_ID_CONFIG -> groupId,
      ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer],
      ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG -> classOf[StringDeserializer])

    val messages = KafkaUtils.createDirectStream[String, String](
      ssc,
      LocationStrategies.PreferConsistent,
      ConsumerStrategies.Subscribe[String, String](topicsSet, kafkaParams))

    /* Equivalent form from the Spark + Kafka integration guide:
    val messages = KafkaUtils.createDirectStream[String, String](
      streamingContext,
      PreferConsistent,
      Subscribe[String, String](topics, kafkaParams)
    ) */

    // Get the lines, split them into words, count the words and print
    val lines = messages.map(_.value)
    val words = lines.flatMap(_.split(" "))
    val wordCounts = words.map(x => (x, 1L)).reduceByKey(_ + _)
    wordCounts.print()

    // Start the computation
    ssc.start()
    ssc.awaitTermination()
  }
}

// scalastyle:on println
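(For context, I launch it with three program arguments, e.g. localhost:9092 test-group test-topic; the broker address, group id, and topic name here are just placeholders.)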

Below is a printout of the error I am getting:

[screenshot of the error]

Ronitspark commented 5 years ago

@Re1tReddy

Ronitspark commented 5 years ago

@hussainasghar @Akanshgoswami