-
### 🐛 Describe the bug
Related:
* https://github.com/pytorch/pytorch/issues/124289
* https://github.com/pytorch/pytorch/issues/109607
```python
"""Demonstrate torch.compile error on transform…
-
### System Info
transformers commit: 52ea4aa589324bae43dfb1b6db70335da7b68654 (main at time of writing)
the rest isn't relevant.
### Who can help?
trainer: @muellerzr @SunMarc
### Informa…
-
### Feature request
Using [Flex-attention](https://pytorch.org/blog/flexattention/) (and [Paged attention](https://github.com/pytorch/pytorch/pull/121845/files)) to speed up transformers models and p…
-
### 🐛 Describe the bug
Instead of only patching the transformers mllama module (`transformers.models.mllama.modeling_mllama`), `apply_liger_kernel_to_mllama` modifies `torch.nn.LayerNorm` globally.
…
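To illustrate how a patch that looks module-local can leak globally, here is a minimal sketch in plain Python. It is not Liger's or transformers' actual code: `framework`, `model_module`, `FastLayerNorm`, and both patch functions are hypothetical stand-ins, and the sketch only assumes that the model module holds a reference to the shared library namespace rather than its own copy.

```python
import types


class LayerNorm:
    """Stand-in for the framework's stock layer (hypothetical)."""


class FastLayerNorm(LayerNorm):
    """Stand-in for an optimized replacement layer (hypothetical)."""


def make_modules():
    # `framework` plays the role of a shared library namespace;
    # `model_module` plays the role of one model file that did
    # `import framework as nn`.
    framework = types.ModuleType("framework")
    framework.LayerNorm = LayerNorm
    model_module = types.ModuleType("model_module")
    model_module.nn = framework
    return framework, model_module


def patch_through_alias(model_module):
    # Looks local, but `model_module.nn` IS the shared framework module,
    # so this rebinding is visible to every other user of the framework.
    model_module.nn.LayerNorm = FastLayerNorm


def patch_model_only(model_module):
    # Give the model module its own shallow copy of the namespace, then
    # override the class there. The shared framework module is untouched.
    local_nn = types.ModuleType("local_nn")
    local_nn.__dict__.update(
        {k: v for k, v in vars(model_module.nn).items() if not k.startswith("__")}
    )
    local_nn.LayerNorm = FastLayerNorm
    model_module.nn = local_nn


framework, model_module = make_modules()
patch_through_alias(model_module)
print(framework.LayerNorm is FastLayerNorm)  # True: the patch leaked

framework, model_module = make_modules()
patch_model_only(model_module)
print(framework.LayerNorm is LayerNorm, model_module.nn.LayerNorm is FastLayerNorm)  # True True
```

A quick check for this class of bug is to compare the identity of the shared class (here `framework.LayerNorm`) before and after applying the patch.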
-
### Your current environment
```text
$ python collect_env.py
Collecting environment information...
PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to bu…
-
## LoRA fine-tuning of qwen2.5-7b gradually runs out of GPU memory
- Versions
torch 2.4.0
transformers 4.46.1
ms-swift 2.5.1.post1
- Error
torch.OutOfMemoryError: CU…
-
Hi,
Today, when I was running LoRA training for the `Flux.1` model (sd-scripts on the SD3 branch), the "`train_blocks must be single for split mode`" error suddenly occurred. This error had not appea…
-
Hi,
On the DEV branch, deploying the app locally takes a lot of time, in particular the backend.
The Requirement.txt file from the backend contains packages (sentence-transformers, effdet) that have heavy d…
-
**server:** inf2.8xlarge
**vllm version**: 0.6.3.post2.dev77+g2394962d.neuron215
_Description_
Hello! I am trying to run the code below (the code was taken [here](https://docs.vllm.ai/en/v0.4.1/…
-
Hello,
I successfully downloaded the model to this directory: `/root/.llama/checkpoints/Llama3.2-1B-Instruct`
When I call `AutoModelForCausalLM.from_pretrained` with the path above, I get the f…