-
Following [the tutorial](https://github.com/lllyasviel/ControlNet/blob/main/docs/train.md#step-3---what-sd-model-do-you-want-to-control) I can successfully download SD, add ControlNet, and train it.
…
-
**Describe the bug**
When the order of the computation parameters (FP16/BF16) in the buffer differs from the forward execution order of the model: As a result, when the `--overlap-param-gather…
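The failure mode can be sketched with a toy model (pure Python, illustrative only — not Megatron-LM code; the module and bucket names are made up):

```python
# Illustrative sketch: why a mismatch between the buffer's parameter order
# and the model's forward order breaks overlapped parameter gathering.

forward_order = ["embed", "layer0", "layer1", "head"]   # order modules execute
buffer_order = ["layer1", "layer0", "head", "embed"]    # order params sit in buffer

gathered = set()

def dispatch_next_gather(it):
    """All-gather the next bucket in *buffer* order (if any remain)."""
    try:
        gathered.add(next(it))
    except StopIteration:
        pass

pending = iter(buffer_order)
dispatch_next_gather(pending)  # prefetch the first bucket before forward starts

stale_reads = []
for module in forward_order:
    if module not in gathered:       # forward reached a module whose params
        stale_reads.append(module)   # have not been gathered yet -> wrong output
    dispatch_next_gather(pending)    # overlap: launch the next bucket

print(stale_reads)  # ['embed'] — the first module to run sits last in the buffer
```

Because the overlapped gather issues buckets in buffer order, any module that runs earlier in forward than its position in the buffer reads un-gathered parameters.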
-
### Expected Behavior
Normally, when I use CUDA via ZLUDA, the prompt should execute. I am using an AMD Radeon Vega 8 Graphics GPU with an AMD Ryzen 5 3500U CPU. It should happen normally... if i…
-
```python
try:
    import transformers
except ImportError:
    pass
from ctranslate2.specs import (
    transformer_spec,
)
from ctranslate2.converters.transformers import TransformersConver…
```
-
### Please check that this issue hasn't been reported before.
- [X] I searched previous [Bug Reports](https://github.com/OpenAccess-AI-Collective/axolotl/labels/bug) and didn't find any similar reports…
-
### 🐛 Describe the bug
I'm trying to finetune Llama2-7B (to reproduce the experiments in a paper) using PEFT LoRA (0.124% of trainable params). However, this results in an out-of-memory (OOM) error o…
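The 0.124% figure is consistent with a rank-8 LoRA on the attention projections; a quick back-of-the-envelope check (all numbers below are assumptions for illustration, not taken from the actual run):

```python
# Sanity check of the trainable-parameter fraction for a hypothetical LoRA
# setup: ~6.74B base params (approximate Llama2-7B size), rank r=8 adapters
# on the four attention projections (q, k, v, o) of 32 layers, hidden 4096.

base_params = 6_738_000_000          # assumed Llama2-7B parameter count
hidden, layers, r = 4096, 32, 8
proj_per_layer = 4                   # q, k, v, o projections

# each LoRA adapter adds two low-rank matrices: (hidden x r) + (r x hidden)
lora_params = layers * proj_per_layer * 2 * hidden * r

fraction = 100 * lora_params / base_params
print(f"{lora_params:,} trainable params = {fraction:.3f}% of the base model")
# -> 8,388,608 trainable params = 0.124% of the base model
```

Note that a small trainable fraction only shrinks the optimizer state; the frozen base weights and the activations still occupy GPU memory, which is why OOM can occur even with 0.124% trainable parameters.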
-
### System Info
- `transformers` version: 4.44.2
- Platform: Linux-6.8.0-40-generic-x86_64-with-glibc2.35
- Python version: 3.10.12
- Huggingface_hub version: 0.24.6
- Safetensors version: 0.…
-
**Notes**
- Location: `pipegoose.nn.parallel_mapping.ParallelMapping`
- `module` is an instance in `model.named_modules()`
- model is `AutoModelForCausalLM.from_pretrained()`, `torch.nn.Transformer…
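One way such a mapping is commonly implemented is by suffix-matching the dotted names yielded by `model.named_modules()`; a minimal sketch under that assumption (the mapping table and matching rule here are hypothetical, not the actual `pipegoose` implementation):

```python
# Hypothetical name-based parallel-style lookup for modules from
# model.named_modules(). Both the table and the suffix rule are assumptions.

MAPPING = {
    "attention.query": "column",
    "attention.dense": "row",
    "mlp.dense_h_to_4h": "column",
    "mlp.dense_4h_to_h": "row",
}

def lookup(module_name: str):
    """Return the parallel style for a dotted module name, or None."""
    for suffix, style in MAPPING.items():
        if module_name.endswith(suffix):
            return style
    return None

print(lookup("transformer.h.0.attention.query"))  # column
print(lookup("transformer.h.0.ln_1"))             # None
```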
-
Does the framework support multi-GPU training?
I want to use the framework to train a 70B model, but I could not find any parameter settings or methods for multi-GPU training.
-
By saving the model and reloading it, I managed to get the model working, both quantized and at full precision (it still uses at most 10 GB of GPU RAM).
However, the model generates random characters. He…