FRosner / drunken-data-quality

Spark package for checking data quality
Apache License 2.0

error while running simple example #111

Closed imalkov82 closed 7 years ago

imalkov82 commented 7 years ago

Hi

I am using Jupyter running on top of PySpark. First I installed pyddq with: pip2.7 install pyddq==3.2.1 (Python 2.7)

Then I executed a simple example, which ended with an error (see attached picture). What am I missing?

Best regards, Igor

[image: pyddq_error]

Gerrrr commented 7 years ago

Hi @imalkov82 !

Did you pass the jar to --driver-class-path? https://github.com/FRosner/drunken-data-quality#python-api-1
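In a Jupyter setup there is usually no `pyspark` command line to add the flag to, so one option is to pass it through the `PYSPARK_SUBMIT_ARGS` environment variable before the SparkContext is created. The following is only a minimal sketch under assumptions: the helper name `build_submit_args` and the jar location are hypothetical, and adding `--jars` alongside `--driver-class-path` is an assumption that may not be needed in every setup.

```python
import os


def build_submit_args(jar_path):
    """Return a PYSPARK_SUBMIT_ARGS value that puts the jar on the driver class path.

    Hypothetical helper for illustration; it is not part of pyddq.
    """
    # --driver-class-path makes the jar's classes visible to the driver JVM.
    # --jars additionally ships the jar to the executors (assumed to be useful here).
    # The trailing "pyspark-shell" token is required when the options are
    # supplied through the PYSPARK_SUBMIT_ARGS environment variable.
    return "--driver-class-path {0} --jars {0} pyspark-shell".format(jar_path)


# Hypothetical location; replace it with the path of the downloaded jar.
JAR_PATH = "/path/to/drunken-data-quality_2.10-x.y.z.jar"

# This must run before the SparkContext (and therefore the JVM) is started.
os.environ["PYSPARK_SUBMIT_ARGS"] = build_submit_args(JAR_PATH)
```

If a SparkContext already exists in the notebook, the variable has no effect until the kernel is restarted, because the JVM reads its class path only at startup.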

imalkov82 commented 7 years ago

Thanks for your response!

No ... Can you give me the location where I can download the drunken-data-quality_2.10-x.y.z.jar?

Igor

imalkov82 commented 7 years ago

It is probably here: https://github.com/FRosner/drunken-data-quality/releases

Thanks!

Gerrrr commented 7 years ago

You can download the jar from the release section.

Thanks for the feedback, I'll create a patch to check if the jar is available in the class path.
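To illustrate the idea of such a check (this is not the actual patch, only a rough sketch), one could test whether any entry of a class path string looks like the drunken-data-quality jar. The function name `jar_on_class_path` and the file name pattern are assumptions, and how the class path string is obtained from the running JVM is left open here.

```python
import fnmatch
import os


def jar_on_class_path(class_path, pattern="drunken-data-quality_*.jar"):
    """Return True if any class path entry's file name matches the pattern.

    Hypothetical helper for illustration; the name and the pattern are assumptions.
    """
    # Class path entries are separated by os.pathsep (":" on POSIX, ";" on Windows).
    entries = class_path.split(os.pathsep)
    # Compare only the file name of each non-empty entry against the pattern.
    return any(
        fnmatch.fnmatch(os.path.basename(entry), pattern)
        for entry in entries
        if entry
    )


# Example class path with a hypothetical jar location.
example = os.pathsep.join(
    ["/opt/other.jar", "/opt/drunken-data-quality_2.10-x.y.z.jar"]
)
if not jar_on_class_path(example):
    raise RuntimeError("drunken-data-quality jar not found on the class path")
```

A check along these lines lets the Python side fail early with a readable message instead of an opaque error coming from the JVM bridge.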

FRosner commented 7 years ago

How difficult is it to provide this check, @Gerrrr? Shall we put it into the 4.1.0 release?

Gerrrr commented 7 years ago

It is almost done; I only have a couple of tests to fix. Yep, let's put it into 4.1.0.