-
### Describe your issue
It is showing an empty page on localhost:3000.
### How To Reproduce
Steps to reproduce the behavior (example):
All steps are followed according to https://www.youtube.co…
-
Following https://soulteary.com/2023/07/23/build-llama2-chinese-large-model-that-can-run-on-cpu.html, on an Apple M2 I used the final docker image `soulteary/llama2:runtime` to run `Chinese-Llama-2-7b-ggml-q4.bin`.
```bash
main:…
```
-
Using the GenAI stack from Docker, and having built my Ollama on **Windows**, I tried to run the stack and got this message:
```
genai-stack-pull-model-1 | pulling ollama model llama2 using http:/…
```
-
### System Info
Using an RTX 3090 and the docker image produced by the QuickStart doc.
### Who can help?
_No response_
### Information
- [X] The official example scripts
- [X] My own modified scripts
##…
-
### Class
Large language model
### Feature Request
I ran the project directly from the docker image, but found that both the chatglm and llama2 models run on only one GPU. My machine has 4×A5000, and the other three cards sit idle. Does this project support multi-GPU parallelism, so that I can use all four cards or run a larger model?
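For reference, multi-GPU sharding in frameworks such as Hugging Face transformers (`device_map="auto"`) boils down to assigning model layers to devices. A minimal sketch of that idea, where `make_device_map` is a hypothetical helper and not this project's API:

```python
# Hypothetical helper (not this project's API): evenly assign transformer
# layers to GPUs, the idea behind transformers' device_map="auto" sharding.
def make_device_map(n_layers: int, n_gpus: int) -> dict:
    """Map layer names to GPU indices in contiguous blocks."""
    per_gpu = -(-n_layers // n_gpus)  # ceiling division
    return {f"model.layers.{i}": i // per_gpu for i in range(n_layers)}

# Example: a 32-layer model spread over 4 GPUs, 8 layers per card.
device_map = make_device_map(32, 4)
```

In practice, passing `device_map="auto"` to `AutoModelForCausalLM.from_pretrained` computes such a map automatically, which would spread one model across all four A5000s (whether this repo exposes that option is a separate question for the maintainers).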
-
Models tried: llama2 and mistral.
Querying anything on the chat terminal results in an empty response from the AI, as shown below.
I have verified that all docker instances are up and running.
![image…
-
Hey. Not a computer scientist here, but thought you'd like to know that the latest pushed container image is causing issues with GPU inference for me.
System specs
CPU: AMD Ryzen 3600
GPU: I…
-
Hi,
Is it possible to run the docker image on a Mac M1 with GPU mode?
> The docker command I'm using:
```bash
export CUDA_VISIBLE_DEVICES=0
docker run \
-p 7860:7860 \
--shm-…
```
-
**Describe the bug**
When we start `serve_reward_model.py` and run annotation, the server goes down during processing. It crashes on specific samples; these samples have long contexts.
[error.lo…
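If the crash correlates with context length, a common workaround is to cap the token count before a sample reaches the reward-model server. A hedged sketch, where `truncate_tokens` and the 2048-token limit are illustrative assumptions, not the repo's actual API:

```python
# Hypothetical pre-filter (name and the 2048 limit are assumptions):
# keep only the most recent tokens of an overly long context so the
# sample fits the reward model's window.
def truncate_tokens(token_ids: list, max_len: int = 2048) -> list:
    """Drop the oldest tokens when a sample exceeds max_len."""
    return token_ids[-max_len:] if len(token_ids) > max_len else token_ids
```

Applying such a filter before calling the server would confirm whether context length is really the trigger, without changing the server itself.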
-