-
For QLoRA, is the fine-tuning done in fp16 or in int4?
Why do the results I get still need the original fp16 model parameters added on top? That makes the final artifact very large.
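If the question is about why the saved QLoRA output is only a small adapter that still needs the full base model, a minimal merge sketch with peft may help (the paths `base_id` and `adapter_path` are placeholders, not taken from the post):
```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base_id = "path/to/base-fp16-model"      # placeholder: the original fp16 checkpoint
adapter_path = "path/to/qlora-adapter"   # placeholder: the small adapter saved by QLoRA

# QLoRA trains low-rank adapters on top of a 4-bit quantized copy of the base model;
# the saved output is only the adapter, so merging/inference still needs the
# full-precision base weights.
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, adapter_path)

merged = model.merge_and_unload()        # folds the LoRA deltas into the base weights
merged.save_pretrained("merged-model")   # the merged checkpoint is base-model sized, hence large
```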
-
I'm working on a multi-task classification with DistilBert and 4 labels, based on your repo, and I was wondering if maybe you could help me, since I'm having a hard time trying to reach the Hugging F…
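For reference, a minimal 4-label classification head on DistilBert could be set up as below; the model name and the single-head setup are assumptions (a true multi-task variant would add one head per task on top of the shared encoder):
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumption: one 4-way classification head on distilbert-base-uncased.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=4,
)

inputs = tokenizer("example text", return_tensors="pt")
outputs = model(**inputs)  # outputs.logits has shape (1, 4)
```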
-
The _local.get_async_ function only uses the _submit_ function, even when the actual scheduler is a _concurrent.futures.Executor_ and _chunksize > 1_, where a call to the _map_ method would be more effi…
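As background on the distinction, here is a standalone sketch (not dask's code) comparing per-task _submit_ against _map_ with a chunksize:
```python
from concurrent.futures import ProcessPoolExecutor

def work(x):
    return x * x

items = range(10_000)

with ProcessPoolExecutor() as executor:
    # One submit() call (and one future) per task: fine for a handful of
    # heterogeneous tasks, but adds per-task overhead for many small ones.
    futures = [executor.submit(work, x) for x in items]
    results_submit = [f.result() for f in futures]

    # map() with chunksize batches tasks before sending them to the workers,
    # which is usually cheaper for large numbers of small, uniform tasks.
    results_map = list(executor.map(work, items, chunksize=64))
```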
-
**Describe the bug**
I am trying to train a model using SSL (pretrain and fine-tune) as described here:
https://pytorch-tabular.readthedocs.io/en/latest/tutorials/08-Self-Supervised%20Learning-DAE/…
-
In trainer.py (line 101) there is a missing argument. Here's how I define optimiser and lr_scheduler in train.py:
```
optimizer = torch.optim.Adam(params, lr=0.001, weight_decay=0.0001)
scheduler…
```
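The snippet above is cut off, so for completeness here is a generic optimizer + LR-scheduler wiring; the StepLR choice and the loop are assumptions, not taken from the issue's train.py:
```python
import torch

# Placeholder model/params; the issue's actual `params` comes from its own model.
model = torch.nn.Linear(10, 2)
params = model.parameters()

optimizer = torch.optim.Adam(params, lr=0.001, weight_decay=0.0001)
# Assumption: the truncated line defines some torch.optim.lr_scheduler; StepLR is
# used here only as an example.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

for epoch in range(3):
    # ... forward pass and loss.backward() would go here ...
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()  # a trainer typically needs both objects passed in
```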
-
**Describe the bug**
I am trying to fine-tune DeepSeek-Coder-V2-Lite-Instruct (16B) on a system with 8 MI300X GPUs. Running on any number of GPUs less than 8 works as expected and runs to completion. …
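For context, loading this model for fine-tuning typically starts from something like the sketch below; the dtype choice is an assumption, and the issue's actual training script is not shown:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
# bf16 is an assumption; how this 16B MoE model gets sharded across the
# 8 MI300X GPUs is what the multi-GPU failure would need to narrow down.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
```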
-
I'm trying to train the stable-diffusion-2-1-unclip model with DreamBooth via accelerate, using the following command:
```
accelerate launch train_dreambooth.py \
  --pretrained_model_name_or_path="model/st…
```
-
A question: running PSI on every training run feels unnecessary. Sometimes I want to run PSI once and then focus only on model training afterwards. But all the examples seem to require running PSI first and then doing the vertical training, https://github.com/FederatedAI/FATE/blob/d9253c4dedd4799b3d68de5c63cd261fbb3af033/examples/pipeline/coordinated_lr/test…
-
During training, pulling data from the DataLoader raised an error saying there was no batch_size. I checked several times and I do pass that argument. Then I found that
```
(
    model,
    optimizer,
    train_dataloader,
    eval_dataloader,
    lr_scheduler,
) = accelerator.prepar…
```
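For reference, the usual preparation pattern looks like the sketch below; note that `accelerator.prepare` returns a wrapper around the original DataLoader, so attributes set on the original loader (such as `batch_size`) may not be exposed the same way on the wrapped object (that last point is an assumption about the behaviour being hit, not a confirmed diagnosis):
```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

# Dummy stand-ins; the issue's real model, data, and scheduler are not shown.
model = torch.nn.Linear(8, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1)
dataset = TensorDataset(torch.randn(64, 8), torch.randint(0, 2, (64,)))
train_dataloader = DataLoader(dataset, batch_size=16)  # batch_size is clearly set here
eval_dataloader = DataLoader(dataset, batch_size=16)

accelerator = Accelerator()
(
    model,
    optimizer,
    train_dataloader,
    eval_dataloader,
    lr_scheduler,
) = accelerator.prepare(
    model, optimizer, train_dataloader, eval_dataloader, lr_scheduler
)

# After prepare(), train_dataloader is an accelerate wrapper, not the DataLoader
# instance that was constructed with batch_size=16 above.
```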
-
Currently there are only two terminal states for batch jobs, "finished" and "submit_failed". We would like more states that capture whether the batch job was deleted or terminated by the user/schedul…
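One way to picture the requested expansion is the purely hypothetical sketch below; the state names beyond "finished" and "submit_failed" are assumptions, not the project's API:
```python
from enum import Enum

class BatchJobTerminalState(Enum):
    # Existing terminal states mentioned in the issue.
    FINISHED = "finished"
    SUBMIT_FAILED = "submit_failed"
    # Hypothetical additions capturing user/scheduler intervention.
    DELETED = "deleted"        # job removed by the user before completion
    TERMINATED = "terminated"  # job killed by the user or the scheduler
```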