dvgodoy / handyspark

HandySpark - bringing pandas-like capabilities to Spark dataframes
MIT License

SyntaxError: only named arguments may follow *expression when using Python 2 #11

Closed (mattshma closed this issue 5 years ago)

mattshma commented 5 years ago

When I use Python 2, I get the following error on import:

$ pyspark
Python 2.7.5 (default, Aug  4 2017, 00:39:18)
// comment: some pyspark output is omitted
Using Python version 2.7.5 (default, Aug  4 2017 00:39:18)
SparkSession available as 'spark'.
>>> import handyspark
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/site-packages/handyspark/__init__.py", line 1, in <module>
    from handyspark.extensions.evaluation import BinaryClassificationMetrics
  File "/usr/lib/python2.7/site-packages/handyspark/extensions/__init__.py", line 2, in <module>
    from handyspark.extensions.evaluation import BinaryClassificationMetrics
  File "/usr/lib/python2.7/site-packages/handyspark/extensions/evaluation.py", line 3, in <module>
    from handyspark.plot import roc_curve, pr_curve
  File "/usr/lib/python2.7/site-packages/handyspark/plot.py", line 53
    splits = np.linspace(*sdf.agg(F.min(col), F.max(col)).rdd.map(tuple).collect()[0], n + 1)
SyntaxError: only named arguments may follow *expression

Installed package versions:

$ pip list |grep -E "spark"
handyspark (0.2.1a1)
pyspark (2.4.0)

As PEP 3132 says:

Only allow a starred expression as the last item in the exprlist. This would simplify the unpacking code a bit and allow for the starred expression to be assigned an iterator. This behavior was rejected because it would be too surprising.

This error only appears in Python 2.
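For context, passing positional arguments after a `*expression` in a function call is accepted from Python 3.5 onward (PEP 448), so older interpreters, including Python 2.7, reject line 53 of `plot.py` at parse time. Below is a minimal sketch of two ways that call could be rewritten so that it parses on Python 2. It is not the maintainer's fix. The `row` values are made up, and the `linspace` helper is a hypothetical pure-Python stand-in for `np.linspace`, used only so the snippet runs without NumPy or Spark.

```python
# Stand-in for sdf.agg(F.min(col), F.max(col)).rdd.map(tuple).collect()[0],
# which yields a (min, max) tuple. The values here are made up.
row = (0.0, 10.0)
n = 4


def linspace(start, stop, num):
    """Hypothetical stand-in for np.linspace (endpoint included)."""
    step = (stop - start) / float(num - 1)
    return [start + i * step for i in range(num)]


# Rejected by Python 2 and by Python 3 before 3.5:
#     linspace(*row, n + 1)

# Option 1: unpack into names first, then pass everything positionally.
lo, hi = row
splits = linspace(lo, hi, n + 1)

# Option 2: keep the star-unpacking and pass the trailing argument by keyword,
# since named arguments are allowed after *expression.
splits_kw = linspace(*row, num=n + 1)

print(splits)
```

With the real `np.linspace`, whose third parameter is also named `num`, either form should carry over unchanged. Other parts of the package may still rely on Python 3 only features, so this change alone is not guaranteed to make the import succeed on Python 2.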

miguel2488 commented 3 years ago

Hi, did you solve this? If so, how did you do it? I'm trying to use this package with Python 2.7 and Spark 2.4.0.