Closed aeneaswiener closed 7 years ago
@aeneaswiener Does this error manifest when running without the es-hadoop libraries? Curator is a dependency of Spark. ES-Hadoop makes no calls to it explicitly.
@jbaiera the error only appears once I request that ES-Hadoop be installed by passing it via the --packages flag.
I have no issues running Spark jobs unless I need to use the ES-Hadoop library.
One workaround I have found is just copying another jar into the location of the missing jar. The Pyspark shell starts up fine after that and ES-Hadoop is usable for everything I have tried so far. It is a temporary ugly workaround but obviously not a fix.
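For anyone wanting to script that workaround, here is a rough sketch of what it could look like. The thread does not say which jar was copied or where, so the directory (`~/.ivy2/jars`, which I am assuming is the Ivy jar directory Spark uses by default) and the donor jar name are placeholders, and `fill_missing_jar` is a hypothetical helper, not part of Spark or ES-Hadoop.

```python
import shutil
from pathlib import Path


def fill_missing_jar(jar_dir, missing_name, donor_name):
    """Copy a donor jar to the missing jar's file name.

    Returns True if a copy was made, False if the target already exists.
    This only silences the "missing jar" error; the copied file is NOT
    the real artifact, so it is a stopgap rather than a fix.
    """
    jar_dir = Path(jar_dir)
    missing = jar_dir / missing_name
    donor = jar_dir / donor_name

    if missing.exists():
        # Nothing to do: the jar is already present.
        return False
    if not donor.is_file():
        raise FileNotFoundError(f"donor jar not found: {donor}")

    shutil.copy2(donor, missing)
    return True


# Example call (assumed paths; the donor jar name is a placeholder):
# fill_missing_jar(
#     Path.home() / ".ivy2" / "jars",
#     "org.apache.curator_apache-curator-2.6.0.jar",
#     "some-other-resolved.jar",
# )
```

The helper refuses to overwrite an existing file, so it is safe to run again once the real jar shows up.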
This feels more like an issue with Spark, to be honest. The logs show that it successfully found the curator artifact in Maven Central and resolved it. Its disappearance from the local Ivy repository seems to be the root of the problem. It's possible that there's a problem with Spark's package deployment. Closing this for now.
Issue description
When installing ES-Hadoop as a Spark package I am getting a warning because of a missing jar file (org.apache.curator_apache-curator-2.6.0.jar). The PySpark driver output contains an error mentioning a missing jar. This prevents the PySpark shell from starting up successfully.
Steps to reproduce
(org.apache.curator_apache-curator-2.6.0.jar):
Version Info
Full stack trace: