ravidborse opened 1 year ago
Actually, it's stuck at Torch autograd backward.
Running training
Num examples = 7000
Num Epochs = 3072
Instantaneous batch size per device = 5
Total train batch size (w. parallel, distributed & accumulation) = 2050
Gradient Accumulation steps = 410
Total optimization steps = 9216
0%|          | 0/9216 [00:00<?, ?it/s]
^CTraceback (most recent call last):
File "seq2seq/run_seq2seq.py", line 271, in
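When a run hangs with no progress-bar movement, one way to find out where each process is actually blocked is to dump the Python stacks of all threads on demand. Below is a minimal sketch using the standard-library `faulthandler` module; the choice of SIGUSR1 and the 30-minute watchdog are my assumptions, not something `run_seq2seq.py` does:

```python
import faulthandler
import signal
import sys

# Dump the traceback of every thread to stderr when the process
# receives SIGUSR1 (e.g. run `kill -USR1 <pid>` from another shell).
# Safe to register early, before training starts.
faulthandler.register(signal.SIGUSR1, file=sys.stderr, all_threads=True)

# Alternatively, dump stacks automatically if the process is still
# alive after 30 minutes -- useful for catching silent deadlocks
# in distributed backward passes. `repeat=True` re-arms the timer.
faulthandler.dump_traceback_later(timeout=1800, repeat=True)
```

With this in place, a hang inside `backward()` shows up as a stack trace pointing at the exact call each worker is blocked in, which is much more informative than the partial traceback a Ctrl-C produces.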
For the last hour it has been stuck at:
100%|█████████████████████| 2/2 [00:00<00:00, 402.99it/s]
08/28/2023 18:58:48 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /transformers_cache/spider/spider/1.0.0/df8615a31625b12f701e3840f2502d74f4b533dc60aa364a1f48cfd198acc326/cache-7e03875afb379451.arrow
08/28/2023 18:58:48 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /transformers_cache/spider/spider/1.0.0/df8615a31625b12f701e3840f2502d74f4b533dc60aa364a1f48cfd198acc326/cache-06decf315ea7a716.arrow
08/28/2023 18:58:49 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /transformers_cache/spider/spider/1.0.0/df8615a31625b12f701e3840f2502d74f4b533dc60aa364a1f48cfd198acc326/cache-6ef067fed50d786a.arrow
08/28/2023 18:58:49 - WARNING - datasets.arrow_dataset - Loading cached processed dataset at /transformers_cache/spider/spider/1.0.0/df8615a31625b12f701e3840f2502d74f4b533dc60aa364a1f48cfd198acc326/cache-e3414ffb7b73b322.arrow
08/28/2023 18:58:51 - WARNING - seq2seq.utils.dataset_loader - The split train of the dataset spider contains 8 duplicates out of 7000 examples
Running training
Num examples = 7000
Num Epochs = 3072
Instantaneous batch size per device = 5
Total train batch size (w. parallel, distributed & accumulation) = 2050
Gradient Accumulation steps = 410
Total optimization steps = 9216
0%|          | 0/9216 [00:00<?, ?it/s]
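For what it's worth, the numbers in that log are internally consistent, and they also explain why the bar can sit at 0/9216 for a long time: with gradient accumulation set to 410, one optimization step only completes after 410 forward/backward passes. A quick sanity check of the arithmetic (assuming a single device, which the "total train batch size" of 2050 implies):

```python
num_examples = 7000
per_device_batch = 5
grad_accum_steps = 410
num_epochs = 3072

# Effective (total) train batch size: per-device batch times
# accumulation steps (times world size, assumed 1 here).
total_batch = per_device_batch * grad_accum_steps
print(total_batch)  # 2050, matching the log

# Micro-batches per epoch, then optimization steps per epoch.
micro_batches_per_epoch = num_examples // per_device_batch     # 1400
steps_per_epoch = micro_batches_per_epoch // grad_accum_steps  # 3

# Total optimization steps over all epochs.
total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 9216, matching the log
```

So each tick of the 0/9216 bar corresponds to 410 micro-batches of 5 examples: a long pause before the first tick is expected, and a stack dump is the only way to tell an expectedly slow step apart from a genuine deadlock.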