-
I want to train the musicgen model (instead of the musicgen melody model) for audio-prompted audio continuation/generation tasks. According to my interpretation of the code provided below, it appears that `…
-
1.
utils.py", line 150, nbatch = data.size(0) // bsz
AttributeError: 'DataLoader' object has no attribute 'size'
-> nbatch = len(data)
2.
image.py line 9,
if args == 'cifar' -> if args …
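For item 1, a minimal sketch of the failure and one possible workaround, assuming `data` is now a loader-like object rather than a tensor. A PyTorch `DataLoader` over a sized dataset supports `len()`, which returns the number of batches, but it has no `.size()` method. The classes `FakeTensor` and `FakeLoader` and the helper `count_batches` are hypothetical stand-ins so the example runs without PyTorch installed; they are not part of the repository.

```python
import math


class FakeTensor:
    """Hypothetical stand-in for a tensor: exposes size(dim)."""

    def __init__(self, n_samples):
        self.n_samples = n_samples

    def size(self, dim):
        # Only dimension 0 (the sample dimension) is modelled here.
        return self.n_samples


class FakeLoader:
    """Hypothetical stand-in for a DataLoader: has len() but no size()."""

    def __init__(self, n_samples, batch_size):
        self.n_samples = n_samples
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, keeping the last partial batch.
        return math.ceil(self.n_samples / self.batch_size)


def count_batches(data, bsz):
    """Return the batch count for tensor-like or loader-like data."""
    if callable(getattr(data, "size", None)):
        # Tensor-like: derive the count from the sample dimension.
        return data.size(0) // bsz
    # Loader-like: len() already counts batches.
    return len(data)


print(count_batches(FakeTensor(10), 4))
print(count_batches(FakeLoader(10, 4), 4))
```

The two paths can differ by one: the floor division drops a partial final batch, while the loader length keeps it unless the loader is configured to drop it.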
-
When I train the model with custom labels, the training code works well. However, adapting the Inference.py code to my custom-trained model does not work.
I changed the Inference.ipynb code to adapt my …
-
Could I ask why, given the relatively small size of the tinyllama model, the Strategy was set to use FSDP (Fully Sharded Data Parallel) instead of DDP (Distributed Data Parallel)…
-
https://github.com/pytorch/pytorch/blob/4cb534f92ef6f5b2ec99109b0329f93a859ae831/aten/src/ATen/native/cpu/IndexKernel.cpp#L430
OR
https://github.com/pytorch/pytorch/blob/d68ad3cb1e28fc464dd40dd…
-
I tested the latency of the QuantLinear forward pass with various input and feature sizes.
However, for token counts from 1 to 1024, I cannot see any speedup compared to the AWQ W4A16 kernel, and the results were …
-
Traceback (most recent call last):
File "finetune_moss.py", line 303, in <module>
train(args)
File "finetune_moss.py", line 175, in train
accelerator.state.deepspeed_plugin.deepspeed_config['t…
-
bsz = inputs[0].size(self.dim)
IndexError: tuple index out of range
The original version is written like this:
model = DataParallel(model, device_ids=[int(i) for i in args.device.split(',')])
Following this version's instructions, I wrote it like this:
model = BalancedDataPara…
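One possible cause of the `IndexError`, offered as an assumption rather than a diagnosis: if the wrapped model is called with keyword arguments only, the positional `inputs` tuple is empty, so `inputs[0]` fails. The sketch below reproduces that situation in plain Python. `FakeTensor`, `naive_batch_size` and `safe_batch_size` are hypothetical names for illustration and are not taken from the repository.

```python
class FakeTensor:
    """Hypothetical stand-in for a tensor: exposes size(dim)."""

    def __init__(self, shape):
        self.shape = tuple(shape)

    def size(self, dim):
        return self.shape[dim]


def naive_batch_size(inputs, dim=0):
    """Mirrors the failing line: assumes at least one positional input."""
    return inputs[0].size(dim)


def safe_batch_size(inputs, kwargs, dim=0):
    """Fall back to the first keyword argument when inputs is empty."""
    if inputs:
        first = inputs[0]
    elif kwargs:
        first = next(iter(kwargs.values()))
    else:
        raise ValueError("no inputs to infer the batch size from")
    return first.size(dim)


batch = FakeTensor((8, 16))

try:
    # Simulates model(input_ids=batch): nothing is passed positionally.
    naive_batch_size(())
except IndexError as err:
    print("naive:", err)

print("safe:", safe_batch_size((), {"input_ids": batch}))
```

If this is the cause, passing the first tensor positionally, for example `model(batch)` rather than `model(input_ids=batch)`, would avoid the error without changing the wrapper.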
-
Hello! I've found a performance issue in baselines/models/xlnet/data_utils.py: `dataset = dataset.batch(bsz_per_core, drop_remainder=True)`[(here)](https://github.com/CLUEbenchmark/CLUE/blob/5b1d19dc8…
-
I introduced emo into my own code and train with bf16. During training the loss becomes negative. What might be the cause?
loss: -2.334918e-02, loss_cur_dp: -2.334918e-02