Closed by ninjapapa 5 years ago
The issue is actually caused on the Scala side, and is probably rooted in the fact that the SMV jar was built against the vanilla Spark 1.6.2 jar; when it runs on CDH Spark 1.6.0, the py4j-related API doesn't match. We need to find the CDH jars and try building SMV against them.
Created a new release, 1.6.2.4-p4, to fix this.
Got the following error when running the ls() command in the shell:
>>> ls()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/users/zhangb32/SMV_1.6.2.4.p4/src/main/python/smv/smvshell.py", line 141, in ls
print(_jvmShellCmd().ls())
File "/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line 1257, in __call__
File "/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib/spark/python/pyspark/sql/utils.py", line 45, in deco
return f(*a, **kw)
File "/opt/cloudera/parcels/CDH-5.15.2-1.cdh5.15.2.p0.3/lib/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line 328, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.tresamigos.smv.shell.ShellCmd.ls.
: java.io.IOException: Authentication with callback server unsuccessful.
at py4j.NetworkUtil.authToServer(NetworkUtil.java:151)
at py4j.CallbackConnection.start(CallbackConnection.java:234)
at py4j.CallbackClient.getConnection(CallbackClient.java:238)
at py4j.CallbackClient.getConnectionLock(CallbackClient.java:250)
at py4j.CallbackClient.sendCommand(CallbackClient.java:377)
at py4j.CallbackClient.sendCommand(CallbackClient.java:356)
at py4j.reflection.PythonProxyHandler.invoke(PythonProxyHandler.java:106)
at com.sun.proxy.$Proxy32.getCreateRepo(Unknown Source)
at org.tresamigos.smv.DataSetRepoFactoryPython.createRepo(DataSetRepo.scala:68)
at org.tresamigos.smv.DataSetRepoFactoryPython.createRepo(DataSetRepo.scala:65)
at org.tresamigos.smv.TX$$anonfun$1.apply(Transaction.scala:29)
at org.tresamigos.smv.TX$$anonfun$1.apply(Transaction.scala:29)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
at scala.collection.immutable.List.foreach(List.scala:318)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
at scala.collection.AbstractTraversable.map(Traversable.scala:105)
at org.tresamigos.smv.TX.<init>(Transaction.scala:29)
at org.tresamigos.smv.DataSetMgr.withTX(DataSetMgr.scala:43)
...
After some digging into py4j's source code, it appears that when the callback server is started, a non-None auth token is passed in, which tells the Java-side client of the callback server that it must provide an auth token.
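To make the failure mode concrete, here is a toy model of the handshake. It is only a sketch: `ToyCallbackServer` and `connect` are hypothetical names, not py4j classes, and the real py4j protocol is more involved. It only shows why a client that sends no token gets rejected once the server has one configured.

```python
class ToyCallbackServer:
    """Stands in for the Python-side callback server (NOT py4j's real class)."""

    def __init__(self, auth_token=None):
        # None means "no authentication required".
        self.auth_token = auth_token

    def accept(self, presented_token=None):
        # Without a configured token, any client is accepted.
        if self.auth_token is None:
            return True
        # With a configured token, the client must present the same one.
        return presented_token == self.auth_token


def connect(server, client_token=None):
    """Stands in for the Java-side callback client opening a connection."""
    if not server.accept(client_token):
        # Message copied from the stack trace in this issue.
        raise IOError("Authentication with callback server unsuccessful.")
    return "connected"
```

In this model, a client built without token support (`client_token=None`) connects fine to `ToyCallbackServer()` but fails against `ToyCallbackServer("secret")`, which seems to match what happens when the older jar talks to the newer Python side.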
Since Spark 2.1 had already moved to py4j 0.10.x, we most likely solved this problem in some SMV 2.1 version.
Found the commit which fixed this problem in the past: 63a2746394e2ee4d76b98e4913f4d69ead1b50fc
Since that commit also contains unrelated code, instead of cherry-picking it I will simply copy the relevant code into a new commit.
Fixed in the p5 release.
Running SMV-1.6.2.4-p4 on cdh5.15.2's Spark 1.6.0 runs into the following error:
Noticed that CDH ships py4j 0.10, while Spark 1.6.2 ships py4j 0.9.
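A quick way to confirm which py4j the Python side is using is to read the version out of the source zip path that shows up in the traceback. The helpers below are a minimal sketch; `py4j_version_from_path` and `same_minor_line` are hypothetical names, and the regex assumes the `py4j-<version>-src.zip` naming seen in the paths above.

```python
import re


def py4j_version_from_path(path):
    """Return the py4j version embedded in a path like
    '.../py4j-0.10.7-src.zip/py4j/java_gateway.py', or None if absent."""
    match = re.search(r"py4j-(\d+(?:\.\d+)*)-src\.zip", path)
    return match.group(1) if match else None


def same_minor_line(version_a, version_b):
    """Compare only the first two version components (e.g. '0.10' vs '0.9')."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]
```

In the shell, `import py4j; print(py4j.__file__)` gives a path that can be passed to `py4j_version_from_path`; the result can then be compared with the py4j version the SMV jar was built against.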