awslabs / python-deequ

Python API for Deequ
Apache License 2.0

requirements.txt: pyspark>=2.4.7 installs pyspark 3.0, which breaks my pyspark 2.x job #6

Closed: cfregly closed this issue 3 years ago

cfregly commented 3 years ago

Symptom: I couldn't write to S3 from within a SageMaker Processing Job.

I had to switch to pip install --no-deps pyspark-deequ to avoid upgrading pyspark to 3.0.1.
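Because --no-deps skips dependency resolution, nothing checks afterwards that the pyspark already on the machine is still a 2.x release. A minimal sketch of a startup guard follows. The helper names `is_spark_2x` and `installed_pyspark_version` are hypothetical and not part of pyspark-deequ, and `importlib.metadata` assumes Python 3.8 or later.

```python
from importlib import metadata


def is_spark_2x(version: str) -> bool:
    """Return True when a version string such as '2.4.7' has major version 2."""
    return version.split(".")[0] == "2"


def installed_pyspark_version():
    """Return the installed pyspark version string, or None if pyspark is absent."""
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pyspark_version()
    if found is None:
        print("pyspark is not installed")
    elif is_spark_2x(found):
        print(f"pyspark {found} is a 2.x release")
    else:
        print(f"pyspark {found} is not 2.x; a 2.x job may break")
```

Running this at the top of the job would surface an unintended upgrade before any S3 write is attempted.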

Perhaps cap the pyspark version in requirements.txt below 3.0 (for example, pyspark>=2.4.7,<3.0)?
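To illustrate what such a cap would accept and reject, here is a rough sketch that mimics a range like pyspark>=2.4.7,<3.0 with plain tuple comparison. The bounds are an assumption about how the cap could be written, not necessarily what the project adopted, and `parse_version` and `within_cap` are hypothetical helpers that only handle plain dotted integers, not the full version grammar pip understands.

```python
def parse_version(version: str) -> tuple:
    """Turn '3.0.1' into (3, 0, 1); raises ValueError on suffixes like 'rc1'."""
    return tuple(int(part) for part in version.split("."))


def within_cap(version: str, lower: str = "2.4.7", upper: str = "3.0") -> bool:
    """True when lower <= version < upper, approximating '>=2.4.7,<3.0'."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper)


if __name__ == "__main__":
    # Check the versions mentioned in this issue against the assumed cap.
    for candidate in ("2.4.7", "3.0.1"):
        status = "allowed" if within_cap(candidate) else "rejected"
        print(f"pyspark {candidate}: {status}")
```

With a cap like this, a 2.4.x release passes while 3.0.1, the version that was being pulled in, is rejected.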

Note: I saw this issue on 1.0.2 but upgraded to 1.0.5 while testing. I'm not sure whether that matters; the pyspark version seems to have been the cause.

raphaelauv commented 3 years ago

It's fixed in requirements.txt:

https://github.com/awslabs/python-deequ/blob/d2e6e891528ab6dfe3cc0185c974dacc5ec78c97/requirements.txt#L1

@cfregly, you can close this issue.

gucciwang commented 3 years ago

Thanks!