-
I'm facing an issue while tuning LLAMA-2-7b-chat and would appreciate some suggestions (a prompt-formatting sketch follows below).
1. I use a specific system prompt that defines some keys, and then provide an instruction and ask the model to gene…
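For reference, here is a minimal sketch of how the prompt is assembled, assuming the standard LLaMA-2 chat template with `[INST]`/`<<SYS>>` markers; the system prompt, keys, and instruction below are placeholders, not my exact training data:

```python
# Minimal prompt-layout sketch (placeholder system prompt and instruction);
# this follows the standard LLaMA-2 chat template, not my exact fine-tuning format.
system_prompt = "You are a helpful assistant. Keys: <key1>, <key2>."  # placeholder
instruction = "Generate a response using the keys defined above."      # placeholder

prompt = (
    "<s>[INST] <<SYS>>\n"
    f"{system_prompt}\n"
    "<</SYS>>\n\n"
    f"{instruction} [/INST]"
)
print(prompt)
```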
-
### System Info
- `transformers` version: 4.40.0.dev0
- Platform: Linux-5.15.0-101-generic-x86_64-with-glibc2.17
- Python version: 3.8.2
- Huggingface_hub version: 0.20.2
- Safetensors version: 0…
-
We would like to evaluate model performance for various LLM fine-tuning approaches and compare them against standard benchmarks (a small benchmark-run sketch follows the item below). An experiment we would like to try is:
- **Compare the full car…
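As a starting point, here is a minimal sketch of how such a comparison could be run, assuming EleutherAI's `lm-evaluation-harness` (`lm_eval` >= 0.4); the checkpoint paths and task list are placeholders, not a fixed choice:

```python
# Minimal benchmark-run sketch using lm-evaluation-harness;
# checkpoint paths and the task list are placeholders for the models being compared.
import lm_eval

checkpoints = {
    "baseline": "meta-llama/Llama-2-7b-hf",          # placeholder
    "fine_tuned": "/path/to/fine_tuned_checkpoint",   # placeholder
}

for name, path in checkpoints.items():
    results = lm_eval.simple_evaluate(
        model="hf",
        model_args=f"pretrained={path},dtype=float16",
        tasks=["mmlu", "hellaswag"],   # placeholder benchmark set
        num_fewshot=0,
        batch_size=8,
    )
    print(name, results["results"])
```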
-
Currently, InstructLab does not publish any metrics per taxonomy leaf node. We would like to explore different ways we can evaluate the InstructLab model being fine-tuned via the taxonomy approach and…
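One possible direction, sketched below under the assumption that each evaluation example can be tagged with its taxonomy leaf path; the `leaf_path` and `score` fields are hypothetical and are not emitted by InstructLab today:

```python
# Hypothetical sketch: aggregate per-example scores by taxonomy leaf node.
# Assumes an evaluation step has already produced (leaf_path, score) pairs;
# neither field exists in InstructLab's current output.
from collections import defaultdict

eval_records = [
    {"leaf_path": "knowledge/science/physics", "score": 0.8},    # placeholder data
    {"leaf_path": "knowledge/science/physics", "score": 0.6},
    {"leaf_path": "compositional_skills/writing", "score": 0.9},
]

per_leaf = defaultdict(list)
for rec in eval_records:
    per_leaf[rec["leaf_path"]].append(rec["score"])

for leaf, scores in sorted(per_leaf.items()):
    print(f"{leaf}: mean={sum(scores) / len(scores):.2f} (n={len(scores)})")
```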
-
As the title asks, is QLoRA supported?
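For context, this is the kind of setup I have in mind; a minimal QLoRA sketch using `transformers` + `peft` + `bitsandbytes`, where the base model name, target modules, and LoRA hyperparameters are placeholders:

```python
# Minimal QLoRA sketch (NF4 4-bit base model + LoRA adapters);
# the model name and hyperparameters below are placeholders, not a recommendation.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
)

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",   # placeholder base model
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # placeholder target modules
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```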
-
I've been trying to make the combination `deepspeed + qlora + falcon` work, but for reasons I can't pin down I'm stuck in an error maze (a minimal config sketch is at the end of this post).
## Setup
- Docker image: `winglian/axolotl-runpod:main-py3.9-cu…
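For reference, here is the rough shape of what I'm trying to express on the Hugging Face Trainer side; a minimal sketch assuming a DeepSpeed ZeRO stage 2 config passed as a dict (values are placeholders and this mirrors, but is not, my actual axolotl config):

```python
# Sketch: passing a DeepSpeed ZeRO stage 2 config to the HF Trainer as a dict;
# all values below are placeholders, not my actual axolotl settings.
from transformers import TrainingArguments

ds_config = {
    "zero_optimization": {"stage": 2},
    "bf16": {"enabled": True},
    "gradient_accumulation_steps": "auto",
    "train_micro_batch_size_per_gpu": "auto",
}

training_args = TrainingArguments(
    output_dir="out",                  # placeholder
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    bf16=True,
    deepspeed=ds_config,               # TrainingArguments accepts a dict or a path to a JSON file
)
```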
-
Hello,
I wanted to quantize the model via AWQ after merging a QLoRA (bitsandbytes NF4) Mixtral MoE adapter.
The error is:
```
self._search_best_scale(self.modules[i], **layer)
File "/home/access/anaconda3/envs/s…
-
It would be brilliant if we could get fine-tuning methods for robust adaptation implemented, given how much better it is than the LoRA and QLoRA methods.
-
Because `bitsandbytes` implements model quantization by overriding the `.cuda()` function, the model is quantized (the tensor dimensions change) at the moment it is moved to the GPU. During fine-tuning, the pretrained weights being loaded are fp16, so you need to set `args.device='cpu'`, load the weights first, and only then call `.cuda()`. Since this is how `bitsandbytes` is implemented, we can't control it and can only adapt to it.
…
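A minimal sketch of the behaviour described above, assuming `bitsandbytes`' int8 linear layer (the layer size is just an example, and a CUDA GPU is required to run it):

```python
# Illustration only: bitsandbytes' Linear8bitLt holds fp16 weights while on CPU
# and quantizes them to int8 inside its overridden .cuda(), which changes the
# underlying tensor layout. Requires a CUDA GPU.
import torch
import bitsandbytes as bnb

layer = bnb.nn.Linear8bitLt(1024, 1024, has_fp16_weights=False)
layer.weight.data = torch.randn(1024, 1024, dtype=torch.float16)  # "load" fp16 weights on CPU

print(layer.weight.dtype)   # torch.float16 before the move
layer = layer.cuda()        # quantization happens here, inside the overridden .cuda()
print(layer.weight.dtype)   # torch.int8 after the move
```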