-
## ❓ Questions and Help
I have trained my transformer model once on a single GPU and once on a multi-core TPU. In both cases a batch size of 256 is used (times 8 for the TPU). My training results…
-
I hit errors when fine-tuning using LoRA.
ValueError: Target module LlamaDecoderLayer(
(self_attn): LlamaAttention(
(q_proj): Linear(in_features=2048, out_features=2048, bias=False)
(k…
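This ValueError usually means `target_modules` matched a whole composite module (`LlamaDecoderLayer`) instead of an individual linear layer; PEFT can only wrap leaf modules such as `q_proj`. A minimal stdlib sketch of that matching logic, with a hypothetical module tree standing in for the real model (names and the `check_targets` helper are mine, not PEFT's API):

```python
# Hypothetical stand-ins for a LLaMA module tree: name -> module type.
module_tree = {
    "model.layers.0": "LlamaDecoderLayer",        # composite module
    "model.layers.0.self_attn.q_proj": "Linear",  # leaf linear layer
    "model.layers.0.self_attn.k_proj": "Linear",
}

SUPPORTED = {"Linear"}  # only leaf linear layers can be LoRA-wrapped


def check_targets(targets):
    """Return matched module names; raise if a match is not a supported type."""
    matched = []
    for name, kind in module_tree.items():
        if any(name == t or name.endswith("." + t) or name.endswith(t) for t in targets):
            if kind not in SUPPORTED:
                raise ValueError(f"Target module {kind} is not supported")
            matched.append(name)
    return matched


check_targets(["q_proj", "k_proj"])  # matches only the two Linear leaves
# check_targets(["layers.0"]) would raise, like the traceback above,
# because it matches the composite LlamaDecoderLayer.
```

So the fix is typically to list leaf projection names (e.g. `q_proj`, `k_proj`) in `target_modules` rather than a layer path.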
-
```
model = AutoGPTQForCausalLM.from_quantized(
model_name,
#use_triton=True,
#warmup_triton=False,
trainable=True,
inject_fused_attention=False,
…
```
-
- PyTorch-Forecasting version: 0.10.3
- PyTorch version: 1.12.0+cu116
- Python version: 3.9.13
- Operating System: Ubuntu 20.04.4 LTS
I wanted to resume TFT training from the last checkpoi…
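Whatever the framework, resuming from the last checkpoint comes down to reloading saved state (weights, epoch counter) and continuing from there. A framework-agnostic sketch of that pattern using only the standard library — the state layout and file name are illustrative, not pytorch-forecasting's actual checkpoint format:

```python
import json
import os
import tempfile


def save_checkpoint(path, epoch, weights):
    """Persist training state so a later run can continue from it."""
    with open(path, "w") as f:
        json.dump({"epoch": epoch, "weights": weights}, f)


def resume_or_start(path):
    """Load the last checkpoint if present, else start fresh."""
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
        # Resume after the last *completed* epoch.
        return state["epoch"] + 1, state["weights"]
    return 0, [0.0, 0.0]  # fresh initialization


ckpt = os.path.join(tempfile.gettempdir(), "tft_demo.ckpt.json")
save_checkpoint(ckpt, epoch=4, weights=[0.1, 0.2])
start_epoch, weights = resume_or_start(ckpt)
# start_epoch is 5: training continues after the last completed epoch
```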
-
Hi, I am trying to train the `llama-7b-hf` model on a single GPU. I tried reducing some parameters, but I don't know whether the new values are any better.
My PC components:
- AMD Ryzen 5 7600 6-Core Processor…
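When shrinking the per-step batch to fit a single GPU, gradient accumulation keeps the effective batch size unchanged; here is a quick sanity check of the arithmetic (the batch sizes below are illustrative, not taken from the post):

```python
def accumulation_steps(target_batch, micro_batch):
    """Number of micro-batches to accumulate before an optimizer step
    so the effective batch size equals target_batch."""
    if target_batch % micro_batch != 0:
        raise ValueError("target batch must be a multiple of the micro batch")
    return target_batch // micro_batch


# e.g. keep an effective batch of 128 with only 4 samples per forward pass:
steps = accumulation_steps(128, 4)  # 32 accumulation steps
effective = steps * 4               # back to 128, at a fraction of the memory
```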
-
I have to calculate the ETA for finishing training often enough that I think it should be a built-in feature.
How about logging the ETA alongside `elapsed time per iteration`?
This is just current `elapsed_ti…
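The proposal amounts to multiplying the remaining iteration count by the measured per-iteration wall time. A minimal sketch — the function names and H:MM:SS formatting are my own choices, not the trainer's actual logging code:

```python
import datetime


def eta_seconds(elapsed_per_iter, current_iter, total_iters):
    """Seconds until training finishes, extrapolating the current
    per-iteration wall time over the remaining iterations."""
    return elapsed_per_iter * (total_iters - current_iter)


def format_eta(seconds):
    """Render the ETA the way log lines usually do, as H:MM:SS."""
    return str(datetime.timedelta(seconds=int(seconds)))


# e.g. 0.5 s/iter with 1000 of 8200 iterations done:
remaining = eta_seconds(0.5, 1000, 8200)  # 3600.0 seconds
print(f"ETA: {format_eta(remaining)}")    # ETA: 1:00:00
```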
-
Hi! I downloaded the SHP dataset and was trying to run the actor training. I ran into several issues here with vanilla Python, torchrun, and DeepSpeed.
-
When using multi-GPU via the `accelerate` scripts, performance improves; however, multi-node multi-GPU performance degrades below usability.
Benchmarks:
1. Single P4 GPU: 1.8 it/sec…
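When comparing such benchmarks, it helps to normalize throughput into scaling efficiency (total throughput relative to ideal linear scaling of the single-GPU baseline). A small helper — it reuses the 1.8 it/sec baseline from above, but the multi-GPU figures are hypothetical:

```python
def scaling_efficiency(single_gpu_ips, total_ips, n_gpus):
    """Fraction of ideal linear scaling achieved: 1.0 means every GPU
    runs as fast as the single-GPU baseline."""
    return total_ips / (single_gpu_ips * n_gpus)


# Hypothetical: 8 GPUs reaching 10.8 it/sec against the 1.8 it/sec baseline.
eff = scaling_efficiency(1.8, 10.8, 8)  # 0.75, i.e. 75% efficiency
```

A value well below 1.0 on multi-node runs (while single-node stays near 1.0) points at inter-node communication, not compute, as the bottleneck.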
-
Comfy supports directML for AMD cards out of box so there are AMD users.
Granted, DirectML is terrible at dealing with memory management internally, so I doubt this would work on their consumer car…
-
I am new to PyTorch and distributed learning in general, and I'm trying to go through this tutorial: https://pytorch.org/tutorials/beginner/aws_distributed_training_tutorial.html. After setting ev…