-
### URL
https://grafana.com/docs/alloy/latest/reference/components/mimir/mimir.rules.kubernetes/
### Feedback
alloy-infra-nrg9d alloy ts=2024-08-12T07:09:20.593144112Z level=error msg="starti…
-
When I try to run data parallel on a single machine with 2 GPUs, the following error occurs.
```
NCCL version 2.7.8+cuda11.0
xxxxx:2573:2612 [1] graph/xml.cc:332 NCCL WARN Could not find real pat…
```
-
**Detail message:**
Traceback (most recent call last):
File "/user_work_path/miniconda3/envs/lwm/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_global…
-
This is a long-lived issue that tracks limitations of the mini-assistant implementation.
Generally speaking, `mini-assistant` is an all-in-one, single-node jukebox that mimics OpenAI's Assistant API. It's no…
-
Currently the work queue logic means that if you have multiple groups you wish to fade, they will do so one group at a time. Instead, the work should be distributed evenly.
-
While working on #4034, we considered cases where a COPY command might nest calls to other COPY commands (or basically any other command). I traced this back to 9.2 and can reproduce it as far back as that version.
…
-
Production KB will always involve several instances, and today distributed synchronization is achieved through the use of a [distributed lock from the SQL engine](https://github.com/killbill/killbill-…
-
### 🚀 The feature, motivation and pitch
Currently, FX graph tracing (such as the one used in `aot_module`) does not seem to support collective functions such as `allgather`.
We may face some err…
-
Gateway works with multiple underlying (implementing) services. To analyze its performance it would be nice to support OpenTracing (https://medium.com/opentracing/towards-turnkey-distributed-tracing-5…
-
**Describe the issue**:
When shutting down a UCX cluster with GIL contention monitoring enabled (i.e. `gilknocker` is installed and `distributed.admin.system-monitor.gil.enabled=true`), we get so…