Hey, I am trying to use this library to train a binary classifier over a Spark DataFrame. While doing so, the job keeps failing with a worker-node error saying no module named sparktorch was found, even though I have successfully installed the sparktorch library with pip. This is the error I receive:
Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Could not recover from a failed barrier ResultStage. Most recent failure reason: Stage failed because barrier task ResultTask(1, 19) finished unsuccessfully.
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 586, in main
func, profiler, deserializer, serializer = read_command(pickleSer, infile)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/worker.py", line 71, in read_command
command = serializer.loads(command.value)
File "/usr/lib/spark/python/lib/pyspark.zip/pyspark/serializers.py", line 430, in loads
return pickle.loads(obj, encoding=encoding)
ModuleNotFoundError: No module named 'sparktorch'
If anyone can help me out, please respond.
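For context, this kind of ModuleNotFoundError on a worker usually means the package is installed in the driver's Python environment but not in the executors' environments. Below is a small diagnostic sketch (my own, not from sparktorch) that checks whether a module is importable inside a Spark task, i.e. on an executor's Python rather than the driver's; the fallback branch just runs the same check locally if pyspark is unavailable:

```python
def module_visible(name):
    """Return True if `name` can be imported in the current interpreter."""
    import importlib.util
    return importlib.util.find_spec(name) is not None

try:
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("module-check").getOrCreate()
    # Run the check inside a task so it executes in an executor's Python,
    # which may be a different environment from the driver's.
    flags = (
        spark.sparkContext.parallelize(range(2), 2)
        .map(lambda _: module_visible("sparktorch"))
        .collect()
    )
    print("sparktorch visible on executors:", flags)
except ImportError:
    # pyspark not installed here; the same check still works on the driver.
    print("sparktorch visible on driver:", module_visible("sparktorch"))
```

If the executors report False, installing sparktorch with pip on every worker node (or shipping it to the cluster, e.g. via `spark-submit --py-files` with a zipped copy of the package) would be the usual direction to look.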