-
We noticed that `lm_eval --model vllm` does not work when `data_parallel_size > 1`; Ray fails with `Error: No available node types can fulfill resource request`. After some research, I believe when `tenso…
-
I was experimenting with the Graphormer model, specifically for graph classification using the virtual node for global pooling (`graph_pooling: graph_token`).
## Problem
I noticed that the mode…
-
### System Info
```shell
Collecting environment information...
WARNING 11-10 14:19:08 _custom_ops.py:14] Failed to import from vllm._C with ImportError('/mnt/bbuf/vllm-backup/vllm/_C.abi3.so: undef…
-
**Describe the bug**
When using `hivemind.moe.Server` to host experts in a background thread, bootstrapping fails repeatedly, leading to a complete deadlock. I am forced to …
-
From conda-docs created by [davidmakovoz](https://github.com/davidmakovoz): conda/conda-docs#677
I installed the latest version of Anaconda Distribution (`Anaconda3-2019.03-Windows-x86_64.exe`) today o…
-
### 🐛 Describe the bug
# Bug program
```
import torch
# CPU
input = torch.randn(3, 4, 5)
running_mean = torch.randn(4)
output, save_mean, save_var, reserve, _ = torch._batch_norm_impl_index…
-
**Tasks**
- [x] Virtualization (It was done by Yvonnick and Hugo)
- [ ] Pagination remains to be done (it has been deferred so far so that an index can easily be found from its id/url)
- [ ] Use [offset pagination…
-
### 🐛 Describe the bug
`torch.linalg.eigh` crashes when the model is compiled into an AOTInductor model and used from the C++ side. Example Python code:
```python
import os…
-
Large uploads cause the server to freeze, perhaps due to read-lock issues.
## Current Behavior
When uploading large collections using the [python qdrant client](https://github.com/qdrant/qdrant-cl…
-
### 🐛 Describe the bug
I have a two-layer network. The input is a 2D array of token ids; the first layer is an embedding layer that replaces each pixel with its respective embedding, and the second layer does a c…