-
### 🐛 Describe the bug
I read the test_transformer_training example in [pytorch/test/distributed/tensor/parallel/test_tp_examples.py](https://github.com/pytorch/pytorch/blob/main/test/distributed/…
-
### Bug Description
I found I had to use the `col as "col?"` trick to force nullability at runtime; otherwise `sqlx::query_as!()` produces "unexpected null; try decoding as an `Option` when multiple …
-
### Problem
Currently, BlockWALService persists data blocks in parallel, responding directly to the upper layer with success as soon as any data block is persisted, even if the previous data block ha…
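A minimal Python sketch of the ordering guarantee being asked for: data blocks may still be persisted in parallel, but block N is acknowledged to the upper layer only once every block up to N has been persisted. All names here are hypothetical illustrations, not the actual BlockWALService API.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

class InOrderAcker:
    """Track parallel persist completions and release acknowledgments
    strictly in sequence order (hypothetical sketch, not the real API)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = set()      # sequence numbers persisted so far
        self._next_ack = 0      # next sequence number eligible for ack
        self.acked = []         # acknowledgment order, for inspection

    def on_persisted(self, seq):
        with self._lock:
            self._done.add(seq)
            # Advance the ack frontier only over a contiguous prefix:
            # block N is acked only when 0..N have all completed.
            while self._next_ack in self._done:
                self.acked.append(self._next_ack)
                self._next_ack += 1

def persist(acker, seq):
    # Simulate out-of-order completion: later blocks finish first.
    time.sleep(0.01 * (5 - seq))
    acker.on_persisted(seq)

acker = InOrderAcker()
with ThreadPoolExecutor(max_workers=5) as ex:
    for seq in range(5):
        ex.submit(persist, acker, seq)
print(acker.acked)  # always [0, 1, 2, 3, 4] despite reversed completion
```

Even though block 4 physically completes first here, its acknowledgment is held back until blocks 0–3 have landed, which is the behavior the current parallel-ack path lacks.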
-
### Bug Summary
A race condition in join requests can allow too many players into a game: the game does not yet appear full when multiple concurrent join requests are processed. W…
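The usual fix for this kind of check-then-act race is to make the capacity check and the insertion one atomic step. A minimal, hypothetical Python sketch follows; in a real service this would more likely be a database transaction or a row-count/unique constraint than an in-process lock.

```python
import threading

MAX_PLAYERS = 4  # hypothetical capacity

class Game:
    def __init__(self):
        self._lock = threading.Lock()
        self.players = []

    def join(self, player):
        # Check capacity and insert under one lock, so two concurrent
        # requests cannot both observe a non-full game and both join.
        with self._lock:
            if len(self.players) >= MAX_PLAYERS:
                return False
            self.players.append(player)
            return True

game = Game()
threads = [threading.Thread(target=game.join, args=(f"p{i}",))
           for i in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(game.players))  # never exceeds MAX_PLAYERS
```

Without the lock, ten simultaneous requests could each see `len(self.players) < MAX_PLAYERS` before any append happens, which is exactly the overfill described above.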
-
When fighting on a widget or clicking to inspect a widget, a lot of requests are sent to the service in order to populate the card. This causes a lot of delay in our processes that display that inform…
-
tftp-enum.nse checks for a long list of files, and often has to wait for a timeout for not-found files. Using coroutines to request many files in parallel could speed it up considerably. We should pro…
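The parallel-probing idea could look roughly like this, sketched in Python with asyncio rather than NSE's Lua coroutines. `probe` is a hypothetical stand-in for a TFTP read request; the point is that the not-found timeouts overlap instead of being paid one after another.

```python
import asyncio

async def probe(filename):
    # Stand-in for a single TFTP read request; a ".missing" file is
    # modeled as a request that never answers and must time out.
    if filename.endswith(".missing"):
        await asyncio.sleep(10)
    else:
        await asyncio.sleep(0.01)
    return filename

async def enum_files(names, parallelism=8, timeout=0.05):
    # Bound concurrency so we don't flood the target, and cap each
    # request with its own timeout; timeouts run concurrently.
    sem = asyncio.Semaphore(parallelism)

    async def one(name):
        async with sem:
            try:
                return await asyncio.wait_for(probe(name), timeout)
            except asyncio.TimeoutError:
                return None

    results = await asyncio.gather(*(one(n) for n in names))
    return [r for r in results if r is not None]

names = ["a.cfg", "b.missing", "c.cfg", "d.missing", "e.cfg"]
found = asyncio.run(enum_files(names))
print(found)  # → ['a.cfg', 'c.cfg', 'e.cfg']
```

With a parallelism of 8, the two simulated not-found files time out simultaneously, so the whole scan costs roughly one timeout instead of one per missing file.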
-
As per our [Slack discussion](https://seldondev.slack.com/archives/C03DQFTFXMX/p1667786578108119) with @adriangonz there is a performance overhead on MLServer in terms of received latency compared to …
-
### How would you like to use vllm
I want to run Phi-3-vision with vLLM to support parallel calls with high throughput. In my setup (OpenAI-compatible vLLM 0.5.4 server on HuggingFace Inference End…
-
- OS: **Ubuntu 22.04**
- GPUs: **2x 4090** (2x 24GB)
- CUDA: **11.8**
- CPU: **Ryzen 3800X**
- RAM: **64GB**
- vLLM build: **main** `400b8289`
Started the API server with this command:
```sh
…
```
-
(graphrag-ollama-local) root@autodl-container-49d843b6cc-10e9e2a3:~/graphrag-local-ollama# python -m graphrag.query --root ./ragtest --method global "What is machinelearning?"
INFO: Reading setti…