-
**Describe the bug**
I was looking for the BLOOMZ models on JumpStart and noticed that the task name for them is, oddly, `textgeneration1`. Is that on purpose?
**To reproduce**
Code snippet:
```
…
```
-
Hello,
I have converted the BLOOMZ model successfully, but inference doesn't work.
```
./main -m ./models/ggml-model-bloomz-f16.bin -t 8 -n 128
main: seed = 1679167152
bloom_model_load: load…
```
-
Hello,
I have successfully converted the BLOOMZ 176B model to fp16.
However, quantization doesn't work and throws an error:
```
./quantize ./models/ggml-model-bloomz-f16.bin ./models/ggml-m…
```
-
Hello, I'm trying to use BLOOMZ for reward model training, and I get this error:
```
Traceback (most recent call last):
File "/users5/xydu/ChatGPT/DeepSpeed-Chat/training/step2_reward_model_finetuning/tr…
```
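For context, the usual way to put a reward-model head on a BLOOM-style backbone is `transformers`' `BloomForSequenceClassification` with a single regression label. The tiny config below is an assumption chosen so the sketch runs without downloading a checkpoint; a real run would load a pretrained BLOOMZ checkpoint instead.

```python
import torch
from transformers import BloomConfig, BloomForSequenceClassification

# Tiny, randomly initialized config so the sketch runs offline; in practice
# you would call BloomForSequenceClassification.from_pretrained(<bloomz
# checkpoint>, num_labels=1) -- the checkpoint name is deliberately omitted.
config = BloomConfig(vocab_size=256, hidden_size=64, n_layer=2, n_head=4,
                     num_labels=1)
model = BloomForSequenceClassification(config)

input_ids = torch.randint(0, config.vocab_size, (1, 10))
with torch.no_grad():
    out = model(input_ids)
print(tuple(out.logits.shape))  # (1, 1): one scalar reward per sequence
```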
-
Hi,
Are there any configurations for models other than LLaMA? When I first tried to run the finetune script for the `bloomz-7b1` model, I got this error:
`ValueError: Target modules ['q_proj', 'v_pro…
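The likely cause of this error (assuming a LoRA finetune script written for LLaMA) is that LLaMA's attention projections are named `q_proj`/`v_proj`, while BLOOM fuses them into a single `query_key_value` linear layer, so those target modules don't exist in a BLOOMZ model. A quick way to verify the layer names with a tiny randomly initialized model:

```python
from transformers import BloomConfig, BloomModel

# Tiny config, just to inspect the module names of the Bloom architecture.
model = BloomModel(BloomConfig(vocab_size=64, hidden_size=32, n_layer=1, n_head=2))

linear_names = {name.split(".")[-1] for name, module in model.named_modules()
                if module.__class__.__name__ == "Linear"}
print(sorted(linear_names))
# A LoRA config for BLOOM would therefore use target_modules=["query_key_value"]
# (an assumption based on these layer names, not taken from the original script).
```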
-
I'm facing the above error in both stage 1 and stage 2 when using BLOOMZ 3B and 560M.
I tried adding `model.to(device)` and `model.to('cuda')` to main.py, but neither worked.
The error only appears w…
-
CUDA_VISIBLE_DEVICES=0 python /home/ubuntu/TextToSQL/DB-GPT-Hub/src/dbgpt-hub-sql/dbgpt_hub_sql/train/sft_train.py\
--model_name_or_path /home/ubuntu/.cache/modelscope/hub/qwen/Qwen2___5-Coder-7B…
-
BLOOMZ and mT0 are related models, and mT0-13B performs better than BLOOMZ-176B in some cases.
After GPTQ 4-bit quantization, mT0-13B could be a killer model for ordinary user devices.
Hope the…
-
### 🐛 Describe the bug
INFO colossalai - colossalai - INFO: Tokenizing inputs... This may take some time...
Episode [1/100]: 0%| …
-
### Branch/Tag/Commit
main
### Docker Image Version
nvcr.io/nvidia/pytorch:22.09-py3
### GPU name
V100-32G
### CUDA Driver
11.0
### Reproduced Steps
Step 1: pull images w…