-
Amazon returns ICE (Insufficient Capacity Error) errors when you try to use on-demand GPUs, while Civo has GPUs sitting idle. Let's see if our users can consume Civo GPUs.
-
I would like to confirm whether Milvus Operator can deploy a GPU-enabled Milvus cluster. My use case requires accelerated vector search and indexing using GPUs. Below are my questions:
- Does Milvu…
-
I was trying to figure out why the [wrk-99 node in nerc-ocp-prod](https://grafana.apps.obs.nerc.mghpcc.org/explore?orgId=1&left=%5B%22now-1h%22,%22now%22,%22observability-metrics%22,%7B%22exemplar%22:…
-
Hi, thanks for your great work.
I am following the instructions to install and run the test scripts.
I tried two systems, one with 4xA100 40G, the other with 4xA100 80G.
I use the following…
-
**Describe the bug**
Following the official examples [here](https://docs.nvidia.com/spark-rapids/user-guide/latest/examples.html#ref-sec-profcmd-cli-samples), I cannot profile event logs from Dataproc …
-
I’ve been working on adding 3 new GPU servers to the Magic Castle cluster, but unfortunately, I’ve been facing multiple issues with the setup, and I’m at a bit of a standstill.
Issues Encountered:…
-
As of commit d17b626a8cfe7459be8ccb0a9d0c80ea29a3bb5c, it is possible to trigger a GitHub Action that runs a script, stores the logs in a GitHub artifact, and then posts the artifact results to stdout.
```
(discord) ➜ disc…
```
-
We had an unplanned service disruption on the production cluster this Tuesday, 11/19. I wanted to document what happened because I think there are several important lessons we can learn from this inci…
-
Hello. I'm building software that needs a scalable linear equation solver on a cluster with multiple GPUs. The documentation on `linalg.solve` (https://docs.nvidia.com/cupynumeric/latest/api/generated/cupynumeri…