-
Starting qwen2-instruct on Windows raises a KeyError.
Environment: Windows 10, Python 3.11.9.
The qwen2-instruct launch parameters are Transformers + PyTorch + model size 72 + quantization 8-bit.
The detailed error message is as follows:
2024-06-28 15:39:55,950 xinference.api.restful_api …
-
As part of the Llama 3.1 release, Meta is releasing an RFC for ‘Llama Stack’, a comprehensive set of interfaces / API for ML developers building on top of Llama foundation models. We are looking for f…
-
Due to the overwhelming number of published research papers, the list has become somewhat disorganized. As categories expand and mature, there's a clear need for more fine-grained organization. This d…
-
### Feature request
https://github.com/FasterDecoding/SnapKV
### Motivation
SnapKV: Cache compression technique for faster LLM generation with less compute and memory
In a recent paper, authors …
icyxp updated
4 months ago
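The SnapKV idea summarized above can be sketched in a few lines: attention scores from a small "observation window" at the end of the prompt vote for which earlier KV positions matter, and only the top-voted positions plus the window itself are kept. This is a minimal illustrative sketch, not the actual code from the linked repository; the function name and inputs are assumptions.

```python
def snapkv_keep_indices(attn, window, k):
    """Select which prompt KV-cache positions to keep (SnapKV-style sketch).

    attn:   attn[i][j] = attention weight from the i-th observation-window
            token to prompt position j (one head, already averaged).
    window: number of trailing prompt tokens that are always kept.
    k:      number of additional high-vote prefix positions to keep.
    """
    n_prompt = len(attn[0])
    # Vote: total attention each prefix position receives from the window.
    votes = [sum(row[j] for row in attn) for j in range(n_prompt - window)]
    top = sorted(range(len(votes)), key=votes.__getitem__, reverse=True)[:k]
    # Keep the selected prefix positions plus the observation window, in order.
    return sorted(top) + list(range(n_prompt - window, n_prompt))
```

With a 6-token prompt, a 2-token window, and k=2, the cache shrinks from 6 entries to 4 while retaining the positions the window attends to most.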
-
### The Feature
Similar to #3958 / #3533, LiteLLM _might_ get a performance boost by disabling gzip on upstream LLM requests.
See https://github.com/encode/httpx/discussions/2220#discussion-406389…
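The linked httpx discussion concerns the CPU cost of transparent gzip decompression. The underlying technique, shown here with Python's stdlib `urllib` rather than httpx or LiteLLM's own client code, is to advertise only `identity` in `Accept-Encoding` so the upstream server sends an uncompressed body:

```python
import urllib.request

def make_uncompressed_request(url):
    """Build a request that asks the server NOT to compress the response.

    Advertising only 'identity' in Accept-Encoding opts out of gzip,
    trading extra bandwidth for lower client-side CPU on large streamed
    LLM responses.
    """
    return urllib.request.Request(url, headers={"Accept-Encoding": "identity"})

req = make_uncompressed_request("https://api.example.com/v1/chat/completions")
```

Whether this helps in practice depends on payload size and network speed, which is presumably why the snippet hedges with "might".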
-
First of all, thank you for your hard work in developing this evaluation method!
We have now added support for the llm-compression dataset and Bits per Character calculation in OpenCompass. OpenCom…
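For context, Bits per Character normalizes a model's total cross-entropy over a text by the character count, so models with different tokenizers are comparable. A minimal sketch of the metric follows; the function name and inputs are illustrative, not OpenCompass's actual API.

```python
import math

def bits_per_character(token_logprobs, text):
    """Compute Bits per Character (BPC).

    token_logprobs: natural-log probabilities the model assigned to each
    token of `text`. Total cross-entropy is converted from nats to bits
    and divided by the character count, so tokenizer granularity cancels.
    """
    total_bits = -sum(token_logprobs) / math.log(2)
    return total_bits / len(text)
```

For example, two tokens with probabilities 0.5 and 0.25 carry 1 + 2 = 3 bits; over a 4-character string that gives a BPC of 0.75.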
-
### Your current environment
I have two machines (2×4090). I wanted to run a model (e.g. gpt-neox-20b) with vLLM on a Ray cluster, so I followed the documentation to set up the Ray cluster
on head
ray star…
-
### What is the issue?
ollama build fails on undefined llama references
```
# github.com/ollama/ollama
/usr/local/go/pkg/tool/linux_s390x/link: running gcc failed: exit status 1
/usr/bin/ld: /t…
woale updated
1 month ago
-
While trying to compile the Llama 3 model (int8, IR version) using the OpenVINO compile method, we end up with the following error:
**RuntimeError: Exception from src\inference\src\cpp\core.cpp:109:
Excepti…
-
When running `llm_on_ray-finetune --config_file llm_on_ray/finetune/finetune.yaml` to fine-tune, the following error occurs:
```
View detailed results here: /home/work/ray_results/TorchTrainer_20…