FRosner / drunken-data-quality

Spark package for checking data quality
Apache License 2.0
222 stars, 69 forks

PR: PySpark API (issue/91-2) #103

Closed: FRosner closed this 8 years ago

FRosner commented 8 years ago

Fixes #91. Depends on #102.

codecov-io commented 8 years ago

Current coverage is 100% (diff: 100%)

Merging #103 into master will not change coverage

@@            master   #103   diff @@
=====================================
  Files           24     24          
  Lines          437    437          
  Methods        429    421     -8   
  Messages         0      0          
  Branches         8     16     +8   
=====================================
  Hits           437    437          
  Misses           0      0          
  Partials         0      0          

Powered by Codecov. Last update e2510c3...e1a98bc

FRosner commented 8 years ago

@Gerrrr, when I run your example with --packages or --jars, I get the following error:

>>> check = Check(df)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyddq/core.py", line 33, in __init__
    self._jvm_display_name,
  File "pyddq/core.py", line 46, in _jvm_display_name
    "apply$default$2"
  File "/Users/frosner/Documents/projects/spark-1.5.2-bin-hadoop2.6/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 726, in __getattr__
py4j.protocol.Py4JError: Trying to call a package.

It works if I use --driver-class-path instead, which suggests the JAR does not reach the driver's classpath with the other two options. Maybe a problem in Spark?
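
One way to narrow this down is to check what py4j resolves the class name to before calling it. py4j turns a dotted name it cannot find on the JVM side into a JavaPackage object, and calling that object raises "Trying to call a package." The sketch below is only a diagnostic under assumptions: the helper names are made up for illustration, and the class name in the usage note is assumed rather than taken from this PR.

```python
def looks_like_unloaded_class(jvm_attr):
    """Return True if a py4j attribute resolved to a package instead of a class."""
    # Compare by type name so this helper does not need to import py4j itself.
    return type(jvm_attr).__name__ == "JavaPackage"


def check_class_visible(jvm, dotted_name):
    """Walk a dotted name on a py4j JVM view and report whether it ends in a class.

    `jvm` is expected to be a py4j JVM view; `dotted_name` is a fully
    qualified class name supplied by the caller.
    """
    obj = jvm
    for part in dotted_name.split("."):
        obj = getattr(obj, part)
    return not looks_like_unloaded_class(obj)
```

In a PySpark shell this could be called as check_class_visible(sc._jvm, "de.frosner.ddq.core.Check"), where sc._jvm is PySpark's internal gateway handle and the class name is an assumption. A False result with --packages or --jars and True with --driver-class-path would point at the classpath rather than at pyddq.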