Lamonkey / streamProcessing

Big data analysis p1

Spark Streaming's Kafka libraries not found in class path. Try one of the following. #7

Open Lamonkey opened 3 years ago

Lamonkey commented 3 years ago

Spark Streaming's Kafka libraries not found in class path. Try one of the following.

  1. Include the Kafka library and its dependencies with in the spark-submit command as

    $ bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8:2.4.4 ...

  2. Download the JAR of the artifact from Maven Central http://search.maven.org/, Group Id = org.apache.spark, Artifact Id = spark-streaming-kafka-0-8-assembly, Version = 2.4.4. Then, include the jar in the spark-submit command as

    $ bin/spark-submit --jars ...

Tony-Hsieh commented 3 years ago

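I downgraded Spark from 3.0.0 to 2.4.4 in a Jupyter notebook with

    !pip install pyspark==2.4.4

and then ran

    !spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.4 test.py

where test.py is the script that runs Spark with Kafka. Note that the package name here carries the Scala version suffix (_2.11), which the coordinate printed in the error message leaves out. The script still has some problems of its own, but you can try this and see whether it works for you.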
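
If you would rather create the SparkContext inside the notebook instead of shelling out to spark-submit, one option that may work is to pass the same package through the PYSPARK_SUBMIT_ARGS environment variable before PySpark starts its JVM. The snippet below is only a sketch: it assumes pyspark 2.4.4 with the Scala 2.11 build (the same versions as in the command above), and the helper names kafka_package and submit_args are made up for illustration.

```python
import os


def kafka_package(spark_version="2.4.4", scala_version="2.11"):
    """Build the Maven coordinate for the Kafka 0.8 streaming connector.

    The artifact id needs the Scala version suffix (e.g. _2.11);
    both default versions are assumptions taken from this thread.
    """
    return "org.apache.spark:spark-streaming-kafka-0-8_{}:{}".format(
        scala_version, spark_version
    )


def submit_args(package):
    """Format the value for PYSPARK_SUBMIT_ARGS.

    The trailing "pyspark-shell" token is expected at the end when this
    variable is used to launch PySpark from a Python process.
    """
    return "--packages {} pyspark-shell".format(package)


# Set this before any SparkContext is created in the notebook session;
# once the JVM is running, changing the variable has no effect.
os.environ["PYSPARK_SUBMIT_ARGS"] = submit_args(kafka_package())
print(os.environ["PYSPARK_SUBMIT_ARGS"])
# prints: --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.4 pyspark-shell
```

Run this in the first cell, then import pyspark and build the context as usual. If a SparkContext already exists in the kernel, restart the kernel first, otherwise the package will not be picked up.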