Open ecormaksin opened 1 year ago
100% this, I just did the same thing. Upvote!
I'd even take it further and provide a single copy-paste example, or two: one for bash, the other for Python.
pip install pyspark==3.4.1 delta-spark==2.4.0
pyspark --packages io.delta:delta-core_2.12:2.4.0 --conf "spark.sql.extensions=io.delta.sql.DeltaSparkSessionExtension" --conf "spark.sql.catalog.spark_catalog=org.apache.spark.sql.delta.catalog.DeltaCatalog" << EOF
import pyspark
from delta import *
builder = pyspark.sql.SparkSession.builder.appName("MyApp") \
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension") \
    .config("spark.sql.catalog.spark_catalog", "org.apache.spark.sql.delta.catalog.DeltaCatalog")
spark = configure_spark_with_delta_pip(builder).getOrCreate()
spark.sql("CREATE OR REPLACE TABLE mytable_delta(id BIGINT) USING DELTA")
spark.range(5).write.format('delta').mode('append').saveAsTable("mytable_delta")
spark.read.table("mytable_delta").show()
spark.sql("DESCRIBE TABLE mytable_delta").show()
EOF
I tried the PySpark shell.
I executed `pip install pyspark==3.4.1`. However, I missed the `pip install delta-spark==2.4.0` instruction, which is described later. Without delta-spark, I encountered the following error.
This is why I think the instruction to install delta-spark should be placed next to the one for pyspark.
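Until the docs change, a small pre-flight check might catch the missing package before the shell is started. This is only a sketch: the pinned versions are the ones quoted in this thread, and the names `REQUIRED`, `installed_version`, and `check_requirements` are hypothetical helpers, not part of pyspark or delta-spark.

```python
from importlib import metadata

# Versions quoted in this thread; adjust them if you pin a different pair.
REQUIRED = {"pyspark": "3.4.1", "delta-spark": "2.4.0"}


def installed_version(package, lookup=metadata.version):
    """Return the installed version string, or None if the package is absent."""
    try:
        return lookup(package)
    except metadata.PackageNotFoundError:
        return None


def check_requirements(required=REQUIRED, lookup=metadata.version):
    """Return a list of problems; an empty list means every pin is satisfied."""
    problems = []
    for package, wanted in required.items():
        found = installed_version(package, lookup)
        if found is None:
            problems.append(
                f"{package} is not installed: pip install {package}=={wanted}"
            )
        elif found != wanted:
            problems.append(f"{package} {found} is installed, expected {wanted}")
    return problems


if __name__ == "__main__":
    # Print each problem, or a short confirmation when nothing is wrong.
    for line in check_requirements() or ["pyspark and delta-spark match the pins"]:
        print(line)
```

Running it after only `pip install pyspark==3.4.1` should report that delta-spark is missing, together with the install command to run.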