-
### System Info
torch: 2.4.1
transformers: 4.46.0.dev0
trl: 0.11.2
peft: 0.13.1
GPU: V100
CUDA: …
-
Hi authors,
I am fine-tuning the cogvideo-2b model with LoRA. I have added a new loss term, weighted with a small coefficient, on top of the original diffusion loss. Initially, training seems to work fine, but af…
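For reference, a minimal sketch of the pattern described above (the weight value, the function name, and the MSE placeholder for the auxiliary term are assumptions, not the actual training code): the auxiliary loss is scaled by a small coefficient and added to the diffusion loss.

```python
import torch.nn.functional as F

aux_weight = 0.01  # assumed small weight for the extra term

def combined_loss(noise_pred, noise_target, aux_pred, aux_target):
    # Standard diffusion objective on the model's noise prediction.
    diffusion_loss = F.mse_loss(noise_pred.float(), noise_target.float())
    # Hypothetical auxiliary objective; the real one is not shown in the issue.
    aux_loss = F.mse_loss(aux_pred.float(), aux_target.float())
    return diffusion_loss + aux_weight * aux_loss
```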
-
Hi,
After increasing max_steps to 100 in the qLora_finetuning_cpu.py code, my system crashes.
My system configuration:
CPU: Xeon Gold
Memory: 128 GB
Disk capacity: 3.8 TB
OS: 22.04.3 LTS
…
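For context, here is a hedged sketch of what such a max_steps setting typically looks like in a transformers-based QLoRA script; this is not the actual contents of qLora_finetuning_cpu.py, and every value other than max_steps=100 is a placeholder.

```python
from transformers import TrainingArguments

# Hypothetical arguments; only max_steps reflects the change described above.
args = TrainingArguments(
    output_dir="outputs",
    max_steps=100,                  # raised from a smaller value
    per_device_train_batch_size=1,
    gradient_accumulation_steps=4,
    logging_steps=10,
    save_steps=50,
)
```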
-
Hi,
I am using 200K utterances to train LDA. While training, the CPU RAM fills up and the process gets killed. My machine has 8 GB of RAM and 2 GB of swap. How can I train LDA with a large amount of data?
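In case the training uses gensim, one common way to keep memory flat is to stream documents from disk instead of loading all 200K utterances at once; the sketch below is an assumption (gensim's LdaModel, a hypothetical utterances.txt with one utterance per line), not the asker's actual setup.

```python
from gensim import corpora, models

class StreamedCorpus:
    """Yield one bag-of-words document at a time so the corpus never sits fully in RAM."""
    def __init__(self, path, dictionary):
        self.path = path
        self.dictionary = dictionary

    def __iter__(self):
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield self.dictionary.doc2bow(line.split())

# Build the dictionary in a streaming pass as well.
dictionary = corpora.Dictionary(line.split() for line in open("utterances.txt", encoding="utf-8"))
corpus = StreamedCorpus("utterances.txt", dictionary)

# LdaModel consumes the corpus in chunks; chunksize bounds peak memory per update.
lda = models.LdaModel(corpus, id2word=dictionary, num_topics=50, chunksize=2000, passes=1)
```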
-
Thanks to the author for sharing the code. I have a few questions that came up while running it, and I hope the author can answer them.
I'm training without a GPU, using my computer's CPU to tr…
-
**Description**
I created a small neural network, implemented with both accelerate and hmatrix for the matrix calculations so the two could be compared, and trained it for 100 epochs (iterations), but found that it took several s…
-
**Command:** `tune run lora_finetune_single_device --config llama3_1/8B_lora_single_device`
**Output**:
```
INFO:torchtune.utils._logging:Running LoRAFinetuneRecipeSingleDevice with resolved config:…
-
### 🐛 Describe the bug
Hi, it looks like compiling a model in `inference_mode` can break subsequent compilations of the same model in training mode.
Here is an example:
```python
import torch
…
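# (The original reproducer is truncated above; the lines below are an assumed
#  minimal sketch of the described pattern, not the issue author's code.)
model = torch.nn.Linear(8, 8)
compiled = torch.compile(model)

# First call is compiled under inference_mode.
with torch.inference_mode():
    compiled(torch.randn(4, 8))

# A later call in training mode triggers a new compilation of the same model,
# which is what the issue reports as breaking.
out = compiled(torch.randn(4, 8))
out.sum().backward()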
-
The nnabla release of open-unmix is not feature-complete with respect to our pytorch reference. The following issues would need to be solved:
* [X] Dataset parameters
* [ ] Training parameters
* […
-
After running the BytePS benchmark, I found that asynchronous training was slower than synchronous training:
https://github.com/bytedance/byteps/blob/master/docs/step-by-step-tutorial.md
The asynchronous t…