When we install pydeequ on the Databricks runtime, it pulls in `pyspark`, and this breaks the environment.
It would be useful to avoid a hard dependency on `pyspark`; instead, the `findspark` package could be used to locate the already-installed Spark. You can look here for possible implementations.
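A minimal sketch of the fallback idea: try importing `pyspark` directly (as on Databricks, where it is preinstalled), and only fall back to `findspark.init()` to locate a local Spark installation when the direct import fails. The `import_spark` helper name is hypothetical; the module names are parameterized purely for illustration.

```python
import importlib


def import_spark(module="pyspark", finder="findspark"):
    """Import pyspark if it is already on sys.path (e.g. Databricks runtime);
    otherwise use findspark to locate SPARK_HOME and retry the import."""
    try:
        # Direct import works when pyspark is preinstalled on the runtime.
        return importlib.import_module(module)
    except ImportError:
        # Fallback: findspark.init() adds the pyspark from SPARK_HOME to sys.path.
        fs = importlib.import_module(finder)
        fs.init()
        return importlib.import_module(module)
```

With this pattern, pydeequ would not need `pyspark` as a hard install-time dependency; `findspark` would only be consulted when Spark is not already importable.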