Open Lamonkey opened 3 years ago
I downgraded Spark from 3.0.0 to 2.4.4 in a Jupyter notebook with
!pip install pyspark==2.4.4
and then ran the command below:
!spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8_2.11:2.4.4 test.py
test.py is the script that runs Spark with Kafka.
It still has some problems, but you can try it and see whether it works for you.
Spark Streaming's Kafka libraries not found in class path. Try one of the following.
Include the Kafka library and its dependencies with in the spark-submit command as
$ bin/spark-submit --packages org.apache.spark:spark-streaming-kafka-0-8:2.4.4 ...
Download the JAR of the artifact from Maven Central http://search.maven.org/, Group Id = org.apache.spark, Artifact Id = spark-streaming-kafka-0-8-assembly, Version = 2.4.4. Then, include the jar in the spark-submit command as
$ bin/spark-submit --jars ...
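For what it's worth, the command I ran uses the artifact name with a Scala suffix (_2.11), while the hint in the message above leaves it out. Here is a rough sketch of how the --packages coordinate could be put together so the suffix and the Spark version stay in sync. The helper names kafka_package and submit_command are made up for illustration, and the default Scala version of 2.11 is an assumption that should match the Scala build of your Spark install.

```python
# Hypothetical helpers (not part of Spark or pyspark) for building the
# spark-submit arguments used in this thread.

def kafka_package(spark_version, scala_version="2.11"):
    """Return the Maven coordinate group:artifact_scalaVersion:sparkVersion
    for the Kafka 0.8 streaming connector."""
    artifact = "spark-streaming-kafka-0-8_{}".format(scala_version)
    return "org.apache.spark:{}:{}".format(artifact, spark_version)


def submit_command(script, spark_version, scala_version="2.11"):
    """Return the spark-submit argument list for the given script."""
    return [
        "spark-submit",
        "--packages",
        kafka_package(spark_version, scala_version),
        script,
    ]


if __name__ == "__main__":
    # Prints the same command line as the one I ran above.
    print(" ".join(submit_command("test.py", "2.4.4")))
```

In a notebook you could probably pass pyspark.__version__ as spark_version instead of hard-coding "2.4.4", so the package version follows whatever pip installed.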