-
On Windows, replacing private.dockerfile with:
```
FROM hyperonym/basaran:0.15.3
# Copy model files
COPY vicuna128 D:\basaran\models\vicuna128
# Provide default environment variables
ENV MO…
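One likely culprit in the snippet above: `COPY`'s destination must be a path inside the (Linux) container, so a Windows drive path like `D:\basaran\models\vicuna128` will not land where intended. A minimal corrected sketch, assuming a `/models` directory and Basaran's `MODEL` environment variable (check the basaran README for the exact variable name):

```dockerfile
FROM hyperonym/basaran:0.15.3
# COPY's destination is a path inside the Linux container, not on the
# Windows host, so use a POSIX-style path on the right-hand side.
COPY vicuna128 /models/vicuna128
# Point Basaran at the copied model (MODEL is an assumption here; verify
# the variable name against the basaran documentation).
ENV MODEL=/models/vicuna128
```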
-
**I tried setting user_lora to true in finetune.py. Running trainer.train() throws "Expected a cuda device, but got: cpu". Full log:**
Traceback (most recent call last):
File "/data/AI/GrammarGPT/GrammarGPT-main/finetune.py", line 3…
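This error usually means the LoRA/8-bit path needs a GPU but PyTorch only sees the CPU. A hedged sketch of a guard you could run before `trainer.train()` — the helper name `ensure_cuda` is ours, not part of GrammarGPT, and `model` stands for whatever finetune.py builds:

```python
def ensure_cuda(model):
    """Move the model to CUDA before trainer.train(), failing early with a
    clear message instead of the opaque 'Expected a cuda device' error."""
    import torch  # imported here so the sketch itself loads on CPU-only machines

    if not torch.cuda.is_available():
        raise RuntimeError(
            "LoRA fine-tuning in this setup requires a CUDA device; "
            "torch.cuda.is_available() returned False -- check that torch "
            "was built with CUDA support and that CUDA_VISIBLE_DEVICES is set"
        )
    return model.to(torch.device("cuda"))
```

If this raises, the problem is the environment (CPU-only torch build or hidden GPUs), not the training code.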
-
### System Info
```shell
optimum-habana 1.14.0.dev0
HL-SMI Version: hl-1.18.0-fw-53.1.1.1
Driver Version: 1.18.0-ee698fb
```
### Information
- [X] The off…
-
I am using multiple GPUs to quantize the model and run inference with deepspeed==0.9.0, but it fails.
Device: RTX-3090 x 8 Server
Docker: [nvidia-pytorch-container](https://catalog.ngc.nvidia.com/orgs/nvidi…
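For reference, a hedged sketch of the tensor-parallel inference setup this describes, assuming deepspeed==0.9.0's `init_inference` API; the function name and `world_size` default are ours, and `mp_size` must divide the model's attention-head count:

```python
def shard_for_inference(model, world_size=8):
    """Wrap a quantized model for tensor-parallel inference across 8 GPUs
    (one shard per RTX-3090). Sketch only; not a verified working config."""
    import torch
    import deepspeed  # imported lazily so the sketch loads without a GPU

    engine = deepspeed.init_inference(
        model,
        mp_size=world_size,              # tensor-parallel degree
        dtype=torch.int8,                # int8 kernels for quantized weights
        replace_with_kernel_inject=True, # swap in DeepSpeed's fused kernels
    )
    return engine.module
```

The script would then be launched with `deepspeed --num_gpus 8 ...` so each rank gets one GPU.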
-
I followed the README:
## Build model with both INT8 weight-only and INT8 KV cache enabled
```shell
python convert_checkpoint.py --model_dir ./bloom/560m/ \
    --dtype float16 \
    …
```
-
**LocalAI version:**
#895
**Environment, CPU architecture, OS, and Version:**
sh-5.2$ uname -a
MSYS_NT-10.0-19045 DESKTOP-S7HQITA 3.4.7-ea781829.x86_64 2023-07-05 12:05 UTC x86_64 Msys
…
-
https://github.com/THUDM/ChatGLM-6B
int4 quant
https://arxiv.org/abs/2301.12017
Million-scale language model
https://github.com/LianjiaTech/BELLE
https://zhuanlan.zhihu.com/p/616151762
2023/3/26 update -- ChatGLM…
-
Currently, you can use the model either directly, via [this notebook](https://colab.research.google.com/drive/1DqaqGkI6ab92WeXDGMStWt_l8BPHUaIj#scrollTo=OJ7AFOumIYlY), or through http://chat.petals.ml…
-
### System Info
GPU A100
TRT-LLM 0.8.0.dev2024013000
### Who can help?
@Tracin
### Information
- [X] The official example scripts
- [ ] My own modified scripts
### Tasks
- [X] An officially su…