knadigatla closed 3 years ago
You can either add the Deequ JAR to the Spark JAR path or use Maven to download it as part of your Glue code (see one of the examples in the tutorial folder):
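import pydeequ
from pyspark.sql import SparkSession

# classpath: path(s) to any extra JARs the driver needs, defined elsewhere
spark = (SparkSession
    .builder
    .config("spark.driver.extraClassPath", classpath)
    .config("spark.jars.packages", pydeequ.deequ_maven_coord)
    .config("spark.jars.excludes", pydeequ.f2j_maven_coord)
    .getOrCreate())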
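For the first option (adding the JAR to the Spark JAR path), a Glue job would typically take the JAR through its job parameters rather than through the session builder. The sketch below only builds that parameter mapping. The helper name `glue_default_arguments` and the S3 URI are hypothetical, and it assumes you have already uploaded a Deequ JAR matching your Glue/Spark version to S3. `--additional-python-modules` and `--extra-jars` are standard Glue job parameters.

```python
def glue_default_arguments(deequ_jar_s3_uri, pydeequ_version="0.1.5"):
    """Build a job-parameter mapping for a Glue job that uses PyDeequ.

    deequ_jar_s3_uri is a placeholder for wherever you uploaded the Deequ JAR.
    """
    if not deequ_jar_s3_uri.startswith("s3://"):
        raise ValueError("--extra-jars should point at an S3 path")
    return {
        # Installs the Python wrapper with pip when the job starts.
        "--additional-python-modules": f"pydeequ=={pydeequ_version}",
        # Puts the Deequ JAR on the job's classpath.
        "--extra-jars": deequ_jar_s3_uri,
    }


# Hypothetical bucket and key, for illustration only.
args = glue_default_arguments("s3://my-example-bucket/jars/deequ.jar")
```

The resulting dictionary could then be passed as the job's default arguments, for example in the Glue console's job parameters or as `DefaultArguments` when creating the job through boto3.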
Apologies for the delay, but you can also follow this blog post we recently published on running PyDeequ on AWS Glue!
We are trying to use python-deequ in a Glue Spark job with
--additional-python-modules pydeequ==0.1.5
and the code I'm trying to execute is below.
ERROR LOGS:
We are using Glue version 2.0.