-
Hi,
When running `python/llm/example/GPU/HuggingFace/LLM/codeshell/server.py`:
```
python server.py --checkpoint-path /home/user/Qwen2-7B-Instruct --device xpu --multi-turn --max-context 1024
```
40+ pyt…
-
## Describe your environment
Using the following Dockerfile to simplify testing:
```dockerfile
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim
RUN pip install --no-cache-dir \
opentelemetry…
-
**Describe the issue**:
The worker profile has a limited span and older data seems to be lost. For example, with the minimal example below, the total CPU time is 24 hours, but the profile never conta…
-
### Describe the solution you'd like
At this time, all public APIs that accept `DataConverter`s require the payload converter and failure converter to be specified as paths. That requirement originally…
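For context, "specified as a path" means the converter is referenced by an import string that gets resolved at load time rather than passed as a live object. A minimal sketch of that resolution pattern (the `load_from_path` helper and the `module:attribute` format are illustrative assumptions, not the SDK's actual API):

```python
import importlib


def load_from_path(path: str):
    """Resolve a "module.sub:attribute" string to the object it names.

    Illustrative only: the exact string format a given SDK accepts
    may differ.
    """
    module_name, _, attr = path.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attr) if attr else module


# Example: resolve a stdlib function purely from its path string.
dumps = load_from_path("json:dumps")
print(dumps({"a": 1}))
```

The upside of path-based specification is that the converter can be re-imported in another process or sandbox; the downside, as this request points out, is that already-constructed converter instances cannot be passed directly.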
-
The configuration builder always calculates the same values shown below, even if you increase the CPU count input up to 64 cores:
```
max_worker_processes = 8
max_parallel_workers_per_gather = 2
max_paral…
```
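For comparison, the kind of scaling one would expect is sketched below. This is a hypothetical formula loosely based on common PostgreSQL tuning advice (workers track core count, each Gather gets roughly half the cores), not the builder's actual algorithm:

```python
def parallel_settings(cpu_count: int) -> dict:
    # Hypothetical scaling for illustration only; NOT the
    # configuration builder's real formula.
    return {
        "max_worker_processes": cpu_count,
        "max_parallel_workers_per_gather": max(2, cpu_count // 2),
        "max_parallel_workers": cpu_count,
    }


print(parallel_settings(8))
print(parallel_settings(64))  # should grow with cores, not stay fixed
```

With 64 cores the settings should clearly differ from the 8-core output, which is the behavior the builder currently fails to show.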
-
## Summary
This RFC proposes a change in Ray's terminology to distinguish between "worker nodes" (hardware resources) and "worker processes" (individual processes executing tasks/actors), in an effor…
-
When publishing via POST requests, it seems that some channel IDs work fine but others do not when the following conditions are met:
* `worker_processes 2;` (or more)
* Publisher uses **both** `ncha…
-
I need the parent thread to transform whatever the worker logs to the console, e.g.
```ts
// parent.ts
console.log("hey there, I'm the parent.");
// worker.ts
console.log("hey there, I'm a chi…
-
this will fail
```
from distributed import LocalCluster, Worker

cluster = LocalCluster(processes=False, worker_class=Worker)
```
while this will pass
```
from distributed import LocalCluster, Worker
cluster = Loc…
-
This thing needs a system to run deferred tasks in the background. It can be very simple at first, e.g. just an event responder that can run async in the same process as the web server? Maybe better to s…
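A minimal sketch of the in-process variant, assuming an asyncio-based web server: handlers enqueue callables and a single background consumer drains the queue. All names here (`DeferredRunner`, `defer`) are illustrative:

```python
import asyncio


class DeferredRunner:
    """Very simple in-process deferred-task runner: callables are
    queued and executed one at a time by a background consumer."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()
        self._worker = None

    async def start(self) -> None:
        self._worker = asyncio.create_task(self._drain())

    async def _drain(self) -> None:
        while True:
            func, args = await self._queue.get()
            try:
                await func(*args)
            finally:
                self._queue.task_done()

    def defer(self, func, *args) -> None:
        # Called from request handlers; returns immediately.
        self._queue.put_nowait((func, args))

    async def stop(self) -> None:
        # Let queued work finish, then cancel the consumer.
        await self._queue.join()
        if self._worker:
            self._worker.cancel()


async def main() -> None:
    results = []

    async def record(x):
        results.append(x)

    runner = DeferredRunner()
    await runner.start()
    runner.defer(record, 1)
    runner.defer(record, 2)
    await runner.stop()
    print(results)  # [1, 2]


asyncio.run(main())
```

Running it in the web server's process keeps things simple but ties task lifetime to the server; moving to a separate worker process later would mostly mean swapping the queue for a durable one.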