-
### System Info
```
Transformers 4.41.2
peft 0.11.1
```
Single `T4` GPU.
I am implementing QLoRA to fine-tune `mistral-7b` on a `T4` GPU. I loaded the model with the quantized confi…
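Since the quantization config is truncated above, here is a minimal sketch of the 4-bit settings typically used for QLoRA on a T4 (the values are assumptions, not the poster's actual config; note a T4 does not support `bfloat16`, so compute is done in `float16`):

```python
# Hypothetical QLoRA quantization settings, kept as a plain dict so the
# sketch is self-contained; a T4 lacks bfloat16, so compute in float16.
bnb_kwargs = {
    "load_in_4bit": True,                 # 4-bit base weights
    "bnb_4bit_quant_type": "nf4",         # NF4 quantization
    "bnb_4bit_use_double_quant": True,    # quantize the quantization constants
    "bnb_4bit_compute_dtype": "float16",  # not bfloat16 on a T4
}

# With transformers/peft installed, this would be used roughly as:
#   import torch
#   from transformers import AutoModelForCausalLM, BitsAndBytesConfig
#   cfg = BitsAndBytesConfig(**{**bnb_kwargs,
#                               "bnb_4bit_compute_dtype": torch.float16})
#   model = AutoModelForCausalLM.from_pretrained(
#       "mistralai/Mistral-7B-v0.1",
#       quantization_config=cfg, device_map="auto")

print(bnb_kwargs["bnb_4bit_quant_type"])
```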
-
Feel free to post useful resources and suggestions, or show off your projects here. Older comments of this nature can be found in #21 .
-
### Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [ ] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue y…
-
## 🐛 Bug
I ran `mlc_llm chat HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC` and it failed with `ValueError: Cannot find global var "multinomial_from_uniform1" in the Module`
## To Reproduce
S…
-
When fine-tuning llama2 with DeepSpeed and QLoRA on one node with multiple GPUs, I used ZeRO-3 to partition the model parameters, but it always loads the full parameters on each GPU first and only then partitions params…
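One common cause (an assumption here, since the full script is not shown) is that the model is constructed before DeepSpeed's ZeRO-3 context is active. In `transformers`, creating an `HfDeepSpeedConfig` *before* `from_pretrained` activates `deepspeed.zero.Init`, so weights are partitioned as they are loaded rather than replicated first. A sketch, with the library calls shown in comments:

```python
# Hypothetical ZeRO-3 config as a plain dict; stage 3 partitions
# parameters, gradients, and optimizer state across GPUs.
ds_config = {
    "zero_optimization": {"stage": 3},
    "train_micro_batch_size_per_gpu": 1,
}

# With transformers/deepspeed installed, the key ordering is roughly:
#   from transformers.integrations import HfDeepSpeedConfig
#   from transformers import AutoModelForCausalLM
#   dschf = HfDeepSpeedConfig(ds_config)  # must exist BEFORE from_pretrained
#   model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")
# If HfDeepSpeedConfig is created after the model, each rank first
# materializes the full weights, which matches the behavior described above.

print(ds_config["zero_optimization"]["stage"])
```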
-
Hi team,
I am trying to deploy my model on an AMD NPU device using the VitisAIExecutionProvider. I thought that all supported operators would be computed on the NPU, but I often encounter this notice:
`I202405…
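For context on why some operators end up off the NPU: ONNX Runtime assigns each graph node to the first execution provider in the session's provider list that supports it, and falls back to the CPU provider for the rest. A minimal sketch (provider names are real; the model path is a placeholder, and the session call is shown as a comment so the sketch stays self-contained):

```python
# ONNX Runtime tries providers in order for each node; nodes the
# VitisAIExecutionProvider cannot handle fall back to the CPU provider,
# which produces notices like the one quoted above.
providers = ["VitisAIExecutionProvider", "CPUExecutionProvider"]

# With onnxruntime installed, roughly:
#   import onnxruntime as ort
#   sess = ort.InferenceSession("model.onnx", providers=providers)
#   print(sess.get_providers())  # effective provider order for this session

print(providers)
```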
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
### Describe the bug
Both w4a16 quantization and w8a8 quantization of InternVL-Chat-V…
-
### Checklist
- [X] 1. I have searched related issues but cannot get the expected help.
- [X] 2. The bug has not been fixed in the latest version.
- [x] 3. Please note that if the bug-related iss…
-
```
root@dsw-541920-5fd5c64bc4-m25b4:/mnt/workspace/modelscope# xtuner train llama2_7b_chat_qlora_custom_sft_e1_copy.py --deepspeed deepspeed_zero1
[2024-07-01 21:43:15,368] [INFO] [real_accelerator.p…
```
-
## Overview
Recently, the mlc-llm team has been working on migrating to a new model compilation workflow, which we refer to as SLM. SLM is the new approach that brings modularized, Python-first compila…