[2024-09-04T05:45:42.922Z] FAILED tests/test_approximate_nearest_neighbors.py::test_cagra[float32-(10000, 50)-cagra-array-10000-algo_params0-sqeuclidean] - pyspark.errors.exceptions.captured.PythonException:
[2024-09-04T05:45:42.922Z] An exception was thrown from the Python worker. Please see the stack trace below.
[2024-09-04T05:45:42.922Z] Traceback (most recent call last):
[2024-09-04T05:45:42.922Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 830, in main
[2024-09-04T05:45:42.922Z] process()
[2024-09-04T05:45:42.922Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 822, in process
[2024-09-04T05:45:42.922Z] serializer.dump_stream(out_iter, outfile)
[2024-09-04T05:45:42.922Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 345, in dump_stream
[2024-09-04T05:45:42.922Z] return ArrowStreamSerializer.dump_stream(self, init_stream_yield_batches(), stream)
[2024-09-04T05:45:42.922Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 86, in dump_stream
[2024-09-04T05:45:42.922Z] for batch in iterator:
[2024-09-04T05:45:42.922Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 338, in init_stream_yield_batches
[2024-09-04T05:45:42.922Z] for series in iterator:
[2024-09-04T05:45:42.922Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 519, in func
[2024-09-04T05:45:42.922Z] for result_batch, result_type in result_iter:
[2024-09-04T05:45:42.922Z] File "/home/jenkins/agent/workspace/jenkins-spark-rapids-ml_nightly-483/python/src/spark_rapids_ml/core.py", line 1388, in _transform_udf
[2024-09-04T05:45:42.922Z] cuml_instance = construct_cuml_object_func()
[2024-09-04T05:45:42.922Z] File "/home/jenkins/agent/workspace/jenkins-spark-rapids-ml_nightly-483/python/src/spark_rapids_ml/knn.py", line 1402, in _construct_sgnn
[2024-09-04T05:45:42.922Z] from cuvs.neighbors import cagra
[2024-09-04T05:45:42.922Z] ModuleNotFoundError: No module named 'cuvs'
[2024-09-04T05:45:42.922Z]
[2024-09-04T05:45:42.922Z]
[2024-09-04T05:45:42.922Z] JVM stacktrace:
[2024-09-04T05:45:42.923Z] org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 267.0 failed 1 times, most recent failure: Lost task 0.0 in stage 267.0 (TID 153) (cuml-build-jenkins-spark-rapids-ml-nightly-483-30jw6-mkcrf executor driver): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
[2024-09-04T05:45:42.923Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 830, in main
[2024-09-04T05:45:42.923Z] process()
[2024-09-04T05:45:42.923Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 822, in process
[2024-09-04T05:45:42.923Z] serializer.dump_stream(out_iter, outfile)
[2024-09-04T05:45:42.923Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 345, in dump_stream
[2024-09-04T05:45:42.923Z] return ArrowStreamSerializer.dump_stream(self, init_stream_yield_batches(), stream)
[2024-09-04T05:45:42.923Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 86, in dump_stream
[2024-09-04T05:45:42.923Z] for batch in iterator:
[2024-09-04T05:45:42.923Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/sql/pandas/serializers.py", line 338, in init_stream_yield_batches
[2024-09-04T05:45:42.923Z] for series in iterator:
[2024-09-04T05:45:42.923Z] File "/root/miniconda3/lib/python3.9/site-packages/pyspark/python/lib/pyspark.zip/pyspark/worker.py", line 519, in func
[2024-09-04T05:45:42.923Z] for result_batch, result_type in result_iter:
[2024-09-04T05:45:42.923Z] File "/home/jenkins/agent/workspace/jenkins-spark-rapids-ml_nightly-483/python/src/spark_rapids_ml/core.py", line 1388, in _transform_udf
[2024-09-04T05:45:42.923Z] cuml_instance = construct_cuml_object_func()
[2024-09-04T05:45:42.923Z] File "/home/jenkins/agent/workspace/jenkins-spark-rapids-ml_nightly-483/python/src/spark_rapids_ml/knn.py", line 1402, in _construct_sgnn
[2024-09-04T05:45:42.923Z] from cuvs.neighbors import cagra
[2024-09-04T05:45:42.923Z] ModuleNotFoundError: No module named 'cuvs'
[2024-09-04T05:45:42.923Z]
[2024-09-04T05:45:42.923Z] at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:561)
[2024-09-04T05:45:42.923Z] at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:118)
[2024-09-04T05:45:42.923Z] at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:514)
[2024-09-04T05:45:42.923Z] at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
[2024-09-04T05:45:42.923Z] at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:491)
[2024-09-04T05:45:42.923Z] at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
[2024-09-04T05:45:42.923Z] at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.sort_addToSorter_0$(Unknown Source)
[2024-09-04T05:45:42.923Z] at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.processNext(Unknown Source)
[2024-09-04T05:45:42.923Z] at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
[2024-09-04T05:45:42.923Z] at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:760)
[2024-09-04T05:45:42.923Z] at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:388)
[2024-09-04T05:45:42.923Z] at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:891)
[2024-09-04T05:45:42.923Z] at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:891)
[2024-09-04T05:45:42.923Z] at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
[2024-09-04T05:45:42.923Z] at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:367)
[2024-09-04T05:45:42.923Z] at org.apache.spark.rdd.RDD.iterator(RDD.scala:331)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:92)
[2024-09-04T05:45:42.923Z] at org.apache.spark.TaskContext.runTaskWithListeners(TaskContext.scala:161)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.Task.run(Task.scala:139)
[2024-09-04T05:45:42.923Z] at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:554)
[2024-09-04T05:45:42.923Z] at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1529)
[2024-09-04T05:45:42.923Z] at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:557)
[2024-09-04T05:45:42.923Z] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
[2024-09-04T05:45:42.923Z] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
[2024-09-04T05:45:42.923Z] at java.lang.Thread.run(Thread.java:750)
[2024-09-04T05:45:42.923Z]
[2024-09-04T05:45:42.923Z] Driver stacktrace:
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2790)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2726)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2725)
[2024-09-04T05:45:42.923Z] at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
[2024-09-04T05:45:42.923Z] at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
[2024-09-04T05:45:42.923Z] at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:2725)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1211)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1211)
[2024-09-04T05:45:42.923Z] at scala.Option.foreach(Option.scala:407)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1211)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2989)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2928)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2917)
[2024-09-04T05:45:42.923Z] at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
[2024-09-04T05:45:42.923Z] at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:976)
[2024-09-04T05:45:42.924Z] at org.apache.spark.SparkContext.runJob(SparkContext.scala:2263)
[2024-09-04T05:45:42.924Z] at org.apache.spark.SparkContext.runJob(SparkContext.scala:2284)
[2024-09-04T05:45:42.924Z] at org.apache.spark.SparkContext.runJob(SparkContext.scala:2303)
[2024-09-04T05:45:42.924Z] at org.apache.spark.SparkContext.runJob(SparkContext.scala:2328)
[2024-09-04T05:45:42.924Z] at org.apache.spark.rdd.RDD.$anonfun$collect$1(RDD.scala:1022)
[2024-09-04T05:45:42.924Z] at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
[2024-09-04T05:45:42.924Z] at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
[2024-09-04T05:45:42.924Z] at org.apache.spark.rdd.RDD.withScope(RDD.scala:408)
[2024-09-04T05:45:42.924Z] at org.apache.spark.rdd.RDD.collect(RDD.scala:1021)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:448)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$executeCollect$1(AdaptiveSparkPlanExec.scala:360)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:388)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:360)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.Dataset.$anonfun$collectToPython$1(Dataset.scala:4038)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.Dataset.$anonfun$withAction$2(Dataset.scala:4208)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:526)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:4206)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:118)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:195)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:103)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:827)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.Dataset.withAction(Dataset.scala:4206)
[2024-09-04T05:45:42.924Z] at org.apache.spark.sql.Dataset.collectToPython(Dataset.scala:4035)
[2024-09-04T05:45:42.924Z] at sun.reflect.GeneratedMethodAccessor104.invoke(Unknown Source)
[2024-09-04T05:45:42.924Z] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
[2024-09-04T05:45:42.924Z] at java.lang.reflect.Method.invoke(Method.java:498)
[2024-09-04T05:45:42.924Z] at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
[2024-09-04T05:45:42.924Z] at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
[2024-09-04T05:45:42.924Z] at py4j.Gateway.invoke(Gateway.java:282)
[2024-09-04T05:45:42.924Z] at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
[2024-09-04T05:45:42.924Z] at py4j.commands.CallCommand.execute(CallCommand.java:79)
[2024-09-04T05:45:42.924Z] at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
[2024-09-04T05:45:42.924Z] at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
[2024-09-04T05:45:42.924Z] at java.lang.Thread.run(Thread.java:750)
Failed job: spark-rapids-ml_nightly/483
Failed case:
Errors:
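The root cause buried in the traces above is a missing optional dependency: the Spark Python worker fails at `from cuvs.neighbors import cagra` with `ModuleNotFoundError: No module named 'cuvs'`, meaning the `cuvs` package was not installed in the worker environment for this nightly build. The sketch below is a minimal, hypothetical guard around that import; the helper names (`check_cuvs_available`, `construct_cagra_or_raise`) and the suggested install command are assumptions for illustration, not the actual spark-rapids-ml code.

```python
def check_cuvs_available() -> bool:
    """Return True if the optional `cuvs` package can be imported."""
    try:
        from cuvs.neighbors import cagra  # noqa: F401  # import probe only
        return True
    except ModuleNotFoundError:
        return False


def construct_cagra_or_raise():
    """Fail fast with an actionable message instead of a deep worker traceback.

    Hypothetical wrapper: raises ImportError with install guidance when
    `cuvs` is absent, otherwise returns the `cagra` module.
    """
    if not check_cuvs_available():
        raise ImportError(
            "The 'cuvs' package is required for the CAGRA algorithm. "
            "Install a wheel matching your CUDA version, e.g. "
            "`pip install cuvs-cu12` (exact package name depends on the "
            "CUDA toolkit in the worker environment)."
        )
    from cuvs.neighbors import cagra
    return cagra
```

Probing the import on the driver before launching the job would surface this as a single clear error rather than a failed stage with three copies of the same traceback.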