gradient-accumulation Search Results

1000+ results
for gradient-accumulation

unslothai/unsloth #175

AutoModelForSequenceClassification

Hi, I'm training a model (essentially copied from https://huggingface.co/blog/unsloth-trl#unsloth--trl-integration): ```python import torch from trl import SFTTrainer from transformers import Tr…

asmith26 updated 1 week ago
7
PaddlePaddle/PaddleMIX #659

Problem encountered when fine-tuning LLaVA1.6 on NPU

Paddle version: ``` python -c "import paddle; print(paddle.version.commit)" CustomDevice: npu, visible devices count: 2 2ef8abae65f11fa3cdae784b4ac58750e0fa3bbb ``` CANN version: `8.0.RC1` OS version: `Ubun…

yimuu updated 3 months ago
2
Jiahao000/MFM #1

i only have 1 gpu , how can i run (bash dist_finetune.sh ...…

thank you very much for your MFM !!! when i run (bash dist_finetune.sh ...) , get error how can i run (bash dist_finetune.sh ...) with only 1 gpu , not multi gpu ? ``` /opt/conda/envs/py3.9_cu…

565ee updated 1 year ago
1
wangyuxinwhy/uniem #89

Error when running on a single machine with multiple GPUs: has parameters that were not used in producing los…

### 🐛 Bug description **Command used** CUDA_VISIBLE_DEVICES=2,3 accelerate launch --num_processes 2 path_to_train_m3e.py path_to_model path_to_dataset \ --output-dir output_dir **Error message** …

whi497 updated 4 months ago
6
modelscope/ms-swift #2290

How do I load, merge, and export in swift the model weights from my own LoRA training?

**Describe the feature** I did not fine-tune with swift; how can I load, merge, and export in swift the model weights from my own LoRA training? **Paste any useful information** See [Qwen2.5-7B-Instruct Lora fine-tuning](https://github.com/datawhalechina/self-llm/blob/master…

qingchen177 updated 1 week ago
1
huggingface/diffusers #9546

Flux Controlnet Train Example, will run out of memory on val…

### Describe the bug On default settings provided in flux train example readme, with 10 validation images training will error out with out of memory error during validation. on A100 80GB ``` …

Night1099 updated 3 weeks ago
14
hiyouga/LLaMA-Factory #4608

fsdp + DPO + fullyfintune raises an error

### Reminder - [X] I have read the README and searched the existing issues. ### System Info pass ### Reproduction ``` CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" accelerate launch \ --config_fil…

qy1026 updated 3 months ago
3
yxli2123/LoftQ #27

Error with shape

RuntimeError: Error(s) in loading state_dict for PeftModelForCausalLM: size mismatch for base_model.model.model.embed_tokens.weight: copying a param with shape torch.Size([32001, 4096]) from …

manlenzzz updated 6 months ago
2
kiddyboots216/lottery-ticket-adaptation #2

About mask generating and adaptation

Hello, Ashwinee Panda I was very impressed with your work and wanted to thank you for the excellent contribution. I am currently following the tutorial using the openbookqa task to finally experime…

HeeseongEom updated 2 months ago
2
eosphoros-ai/DB-GPT-Hub #269

Prediction stage: poetry run sh ./dbgpt_hub/scripts/predict_sft.sh, Killed

- Environment: - WLS-2, Ubuntu22.04, 4090 GPU x1 - train_sft.sh ```bash CUDA_VISIBLE_DEVICES=0 python dbgpt_hub/train/sft_train.py \ --model_name_or_path $model_name_or_path \ --quantizati…

GuokaiLiu updated 3 weeks ago
2
