Open bruce32118 opened 3 years ago
I am experiencing the same issue. Once the deequ run is complete and the results are gathered, pydeequ seems to keep something running in the background. Closing the Spark session doesn't help.
EDIT:
spark.sparkContext._gateway.close()
spark.stop()
does the job.
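For context, that workaround can be wrapped in a small end-of-script helper. This is a sketch, not PyDeequ API: `stop_spark` is an illustrative name, and `_gateway` is a private Py4J handle on the SparkContext, so it may change between versions.

```python
def stop_spark(spark):
    """Close the Py4J gateway, then stop the Spark session.

    PyDeequ's checks start callback machinery on the gateway; if the
    gateway is not closed, its non-daemon threads keep the Python/JVM
    bridge alive and spark-submit never exits. (Sketch only; `_gateway`
    is a private attribute, not a public PySpark API.)
    """
    try:
        spark.sparkContext._gateway.close()  # shut down the Py4J bridge
    finally:
        spark.stop()  # stop the session even if closing the gateway fails
```

Call it once, after all verification results have been collected, as the last step of the job.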
Thanks for the suggestion! We will add this feature into the next release :)
@gucciwang @jaoanan1126 , any update here? Running hasSize checks from within AWS Glue does not work for me without adding spark.sparkContext._gateway.close() manually. I would appreciate a cleaner solution here. I'm not sure whether the spark.stop() is also required.
For whoever is reading this issue, I think it's worth mentioning this doc: https://pydeequ.readthedocs.io/en/latest/README.html#wrapping-up
spark.sparkContext._gateway.close()
spark.stop()
Not working for me
Is your feature request related to a problem? Please describe.
The spark-submit job does not exit until I hit Ctrl+C if I use the Check function.
Describe the solution you'd like
The Check function starts a gateway server via self._spark_session.sparkContext._gateway. Even after my code completes, that server is still running, which prevents the Spark job from stopping normally.
Adding a close function to class Check
def close(self):
    self._spark_session.sparkContext._gateway.close()
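The proposal above could be sketched as follows. Note this is hypothetical: PyDeequ's Check class does not currently expose a close() method, and the class body here is a minimal stand-in, not the real pydeequ.checks.Check.

```python
class Check:
    """Minimal stand-in for pydeequ.checks.Check, showing only the
    proposed close() method (hypothetical API, not current PyDeequ)."""

    def __init__(self, spark_session):
        self._spark_session = spark_session

    def close(self):
        """Shut down the Py4J gateway used by this check so that
        spark-submit can exit once verification is done."""
        self._spark_session.sparkContext._gateway.close()
```

With such a method, a script could call check.close() after gathering its VerificationResult instead of reaching into the private _gateway attribute itself.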