-
### Description
Hello, I am using mlforecast to train a global forecasting model and getting promising performance. However, I have some questions about the training details of global models. Spec…
-
### 🐛 Describe the bug
Related:
* https://github.com/pytorch/pytorch/issues/124289
* https://github.com/pytorch/pytorch/issues/109607
```python
"""Demonstrate torch.compile error on transform…
```
-
### System Info
peft = 0.13.2
python = 3.12.7
transformers = 4.45.2
### Who can help?
@sayakpaul
I am using `inject_adapter_model(...)` to fine-tune a model from OpenCLIP using LoRA layers…
-
## Description
I'm benchmarking naive FlashAttention in `Jax` vs. the Pallas version of [`FA3`](https://github.com/jax-ml/jax/blob/7b9914d711593dca8725d46aa1dadb2194284519/jax/experimental/pallas…
-
### Expected Behavior
-
### Actual Behavior
![image](https://github.com/user-attachments/assets/1f9608dc-4631-41c3-bd2a-bfe506d39104)
SD15 and Flux work fine; the problem occurs only with SDXL.
Co…
-
### What happened?
I am using SD15. When the batch size on "Empty Latent Image" is set to 2, I get a CUDA error with `torch.nn.functional.scaled_dot_product_attention` from attention_sharing.py and …
-
Dear all,
It would be great to see an end-to-end practical example of LoTR. By "practical" I mean that one takes, for example, an existing LLM weights file, compresses it into a smaller weights fi…
-
As the title says.
rknn-toolkit2 version 2.0.0b17 (higher versions report the error `invalid tensor malloc size, tensor name: , target: CPU, size: 0` during conversion)
librknnrt.so version 2.2.0
Exporting to ONNX:
```python
import torch
from transformers import T…
```
-
**Describe the bug**
I am using the `train_gpt3_175b_distributed.sh` script to launch training on a single node with 4 A100 80GB GPUs. Training goes well if I use tensor parallel or pipeline parallel,…
-
### Feature request
I want to add the ability to use GGUF BERT models in transformers.
Currently the library does not support this architecture. When I try to load it, I get an error TypeError: Ar…