-
**Is your feature request related to a problem? Please describe.**
I'm using RAFT via rapids.cmake. When CMake-related errors occur, they are hard to debug; usually I need to remove all build…
-
**Describe the bug**
[QualificationAppInfo searches potential issues in SQL Plan](https://github.com/NVIDIA/spark-rapids/blob/6eae4c16284ed5b9bf7ab923831d8b602d126916/tools/src/main/scala/org/apache/…
-
Currently, we use a patch to remove `nvidia-nccl-cu12` from the dependencies when building the CPU wheel:
https://github.com/dmlc/xgboost/blob/f52f11e1d7c3e2c5b065f8fca6defc818089cebc/tests/buildkite…
-
**Describe the bug**
Connecting `Client(asynchronous=False)` to a `LocalCUDACluster(asynchronous=True)` hangs. This means an async `LocalCUDACluster` cannot be used with RAPIDS libraries that expect a…
-
**Describe the bug**
Observed the following error while running [mig.sh](https://github.com/GoogleCloudDataproc/initialization-actions/blob/master/gpu/mig.sh) on dataproc-2.1-ubuntu20 with runtime version…
-
**Is your feature request related to a problem? Please describe.**
(A). In the current spark-rapids (0.1 and 0.2), `spark.rapids.sql.concurrentGpuTasks` and `spark.rapids.sql.batchSizeBytes` are s…
-
## Description
When trying to adapt the custom callback as described at https://ts.gluon.ai/master/tutorials/mxnet_models/trainer_callbacks.html to `DeepAREstimator`, I get a `RuntimeError`.
I gue…
-
**Is your feature request related to a problem? Please describe.**
https://github.com/NVIDIA/spark-rapids-tools/pull/1275 adds an option to filter event logs by a maximum size. This doesn't currentl…
-
**Describe the bug**
When using TPOT with cuML, the GPU crashes after a few hours. It ran for ~5.5 hours before crashing as I attempted to reproduce this example provided by a DGX A100 customer. The specific…
-
**Describe the bug**
While working on the method for safe un-shimming in #6555 I noticed that jdeps reveals we have public->spark3xx-common dependencies that may lead to classloading bugs
```Bash…