-
### System Info
It's using the versions downloaded by `pip install` during the llama stack build.
I have an NVIDIA GPU.
### Information
- [X] The official example scripts
- [ ] My own modified…
-
Thank you for the great work and the pre-print! I have a question about running the code; I would appreciate it if you could answer it.
For installation, I followed the standard steps, as in:
```
doc…
```
-
## ❓ General Questions
Based on https://llm.mlc.ai/docs/deploy/rest.html#id5, we can use more than one additional model when using speculative decoding mode.
But when getting a response via re…
-
I have some questions about the structure of custom mask for lookahead and verify branches [as described in the blog](https://lmsys.org/blog/2023-11-21-lookahead-decoding/#lookahead-and-verify-in-the…
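For intuition, here is a minimal, hypothetical sketch of a branch-isolated attention mask of the general kind used for verify branches (this is a simplified illustration, not the exact mask from the blog): every branch token attends to the full prefix and causally to earlier tokens in its own branch, but never to tokens in other branches.

```python
import numpy as np

def branch_attention_mask(prefix_len, branch_lens):
    """Boolean mask: mask[i, j] is True iff position i may attend to position j.

    Toy model of branch-style masking: the prefix is ordinary causal
    attention; each branch sees the prefix plus itself (causally) only.
    """
    total = prefix_len + sum(branch_lens)
    mask = np.zeros((total, total), dtype=bool)

    # Prefix tokens: standard causal attention among themselves.
    for i in range(prefix_len):
        mask[i, : i + 1] = True

    # Branch tokens: full prefix + causal within their own branch.
    start = prefix_len
    for blen in branch_lens:
        for j in range(blen):
            pos = start + j
            mask[pos, :prefix_len] = True      # see the whole prefix
            mask[pos, start : pos + 1] = True  # causal inside this branch only
        start += blen
    return mask
```

With `branch_attention_mask(2, [2, 2])`, positions 2–3 and 4–5 form two candidate branches: position 4 can attend to positions 0 and 1 (the prefix) but not to positions 2 or 3, which is what keeps the branches independent during verification.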
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I installed LlamaIndex with the command `pip install llama-index` and installed t…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
WARNING 09-27 15:24:15 _custom_ops.py:15] Failed to import from vllm._C with Mo…
```
-
Hello,
I want to express my gratitude for your outstanding work. The powerful lm-evaluation-harness and your continuous maintenance have made LLM evaluation much more convenient.
However, I hav…
-
### 🚀 The feature, motivation and pitch
[Parallel/Jacobi decoding](https://arxiv.org/abs/2305.10427) improves inference efficiency by breaking the sequential nature of conventional auto-regressive …
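As a toy illustration of the fixed-point iteration behind Jacobi decoding (using a stand-in deterministic "model" rather than a real LM): all guessed positions are updated in parallel from the previous guess, and iteration stops once the guess no longer changes.

```python
def toy_next_token(seq):
    # Stand-in deterministic "LM": next token is (last token + 1) mod 10.
    return (seq[-1] + 1) % 10

def jacobi_decode(prefix, n):
    """Jacobi-style parallel decoding of n tokens after `prefix`.

    Each iteration recomputes every guessed position in parallel from the
    previous iterate; convergence to the greedy sequence is guaranteed in
    at most n steps, since position i becomes correct by iteration i + 1.
    """
    guess = [0] * n  # arbitrary initial guess
    iters = 0
    while True:
        iters += 1
        # All n positions updated simultaneously from the old guess.
        new = [toy_next_token(prefix + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point reached: matches greedy decoding
            return guess, iters
        guess = new
```

For this strictly sequential toy model the fixed point takes the worst-case n + 1 iterations, which mirrors why real Jacobi decoding only wins when the model resolves several positions per parallel step.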
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Cen…
```
-
```text
Running loglikelihood requests: 0%| | 0/18330 [00:00
```