-
/vllm_2$ python examples/phi3v_example.py
WARNING 06-21 14:53:06 ray_utils.py:46] Failed to import Ray with ModuleNotFoundError("No module named 'ray'"). For multi-node inference, please install Ray …
-
### Describe the bug
I recently tried using openllm to connect to llama, and it gave me some bentoml config errors. I'm not sure if it's because I don't have a GPU, but I didn't see any evidence o…
-
Model name: Lenovo Legion Slim 5 16APH8
CPU model: AMD Ryzen 7 7840HS
GPU model: NVIDIA RTX 4060 Mobile
Keyboard backlight: RGB
OS: Arch Linux
Output of `sudo dmidecode -t system`. Please remove…
-
It seems that assisted generation can further reduce sampling latency. Is there scope for adding support for that in vllm?
Assisted generation [docs](https://huggingface.co/blog/assisted-generation…
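For reference, a minimal sketch of what assisted generation looks like through Hugging Face transformers (the feature described in that blog post), not something vLLM exposes today; the OPT model pair below is only an illustrative choice:
```python
# Minimal sketch of assisted generation via transformers' generate():
# a small "assistant" model drafts tokens that the main model verifies,
# which is what reduces sampling latency. Model choices are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
assistant = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

inputs = tokenizer("Alice and Bob", return_tensors="pt")
outputs = model.generate(
    **inputs,
    assistant_model=assistant,  # enables assisted generation
    max_new_tokens=32,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```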
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC ve…
-
### What happened?
Cannot boot the image. It's not related to NVMe or SD card boot, but to a missing `.dtb` file.
Process followed: [official](http://www.orangepi.org/orangepiwiki/index.php?title=Orange_Pi_3…
-
### Your current environment
```text
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.4 LTS (x86_64)
GCC ve…
-
An idea that has been kicking around for years, but never written down:
The current definition of `int` (and correspondingly `uint`) is that it is either 32 or 64 bits. This causes a variety of pro…
-
Multi-image inference for "OpenGVLab/InternVL2-8B" is not working.
I got this inference code from https://github.com/vllm-project/vllm/blob/main/examples/offline_inference_vision_language_multi_…
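For context, a minimal sketch of the multi-image call shape in vLLM, condensed from that example; the prompt template, image paths, and `max_model_len` value below are placeholders for illustration, not the exact code from the issue:
```python
# Minimal sketch of multi-image offline inference with vLLM (recent versions
# that support limit_mm_per_prompt). Prompt format and image paths are
# placeholders; the real example builds the prompt via the chat template.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="OpenGVLab/InternVL2-8B",
    trust_remote_code=True,
    limit_mm_per_prompt={"image": 2},  # allow two images per prompt
    max_model_len=8192,  # assumption: adjust for your GPU memory
)

images = [Image.open("img1.jpg"), Image.open("img2.jpg")]
prompt = (
    "<|im_start|>user\nImage-1: <image>\nImage-2: <image>\n"
    "Describe both images.<|im_end|>\n<|im_start|>assistant\n"
)

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": images}},
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```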
-
I tried using the `deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct` model and ran into this error:
`ValueError: Model type deepseek_v2 not supported.`
Any plans to support `deepseek_v2` soon?
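For context, a minimal reproduction sketch, assuming vLLM's offline `LLM` entry point; on versions without DeepSeek-V2 support, the model-registry lookup raises the error quoted above:
```python
# Minimal reproduction sketch: loading the model with a vLLM build that lacks
# DeepSeek-V2 support raises "ValueError: Model type deepseek_v2 not supported."
from vllm import LLM

llm = LLM(
    model="deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct",
    trust_remote_code=True,  # the model ships custom code on the Hub
)
```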