BlinkDL / RWKV-LM

RWKV is an RNN with transformer-level LLM performance. It can be directly trained like a GPT (parallelizable). So it combines the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
Apache License 2.0

Model fine-tuning #163

Closed TJSL0715 closed 1 year ago

TJSL0715 commented 1 year ago

Can the World model be fine-tuned further? When fine-tuning the World model I get `--lora: not found`. The script is as follows:

```shell
python RWKV-v4neo/train.py \
  --load_model RWKV-v4neo/model/RWKV-4-World-CHNtuned-7B-v1-OnlyForTest_86%_trained-20230708-ctx4096.pth \
  --proj_dir output \
  --data_file RWKV-v4neo/data \
  --data_type binidx \
  --vocab_size 50277 \
  --ctx_len 1024 \
  --epoch_steps 1000 \
  --epoch_count 1000 \
  --epoch_begin 0 \
  --epoch_save 5 \
  --micro_bsz 2 \
  --accumulate_grad_batches 4 \
  --n_layer 24 \
  --n_embd 1024 \
  --pre_ffn 0 \
  --head_qk 0 \
  --lr_init 1e-4 \
  --lr_final 1e-4 \
  --warmup_steps 0 \
  --beta1 0.9 \
  --beta2 0.999 \
  --adam_eps 1e-8 \
  --accelerator gpu \
  --devices 1 \
  --precision bf16 \
  --strategy deepspeed_stage_2 \
  --grad_cp 1 \
  --lora \
  --lora_r 8 \
  --lora_alpha 16 \
  --lora_dropout 0.01 \
  --lora_parts att,ffn,time,ln
```
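A note on the error message itself: `--lora: not found` is the shell reporting that it tried to execute `--lora` as a standalone command, which usually means a line continuation earlier in the script broke (for example, whitespace after a trailing backslash, or a flag written as `0.9\` with no space before the backslash). A quick way to check, assuming the command above is saved as `finetune.sh` (a hypothetical filename), is a sketch like this:

```shell
# Flag any backslash followed by trailing whitespace at end of line.
# Such whitespace silently terminates the command, so every flag on the
# following lines is run as a separate command (e.g. "--lora: not found").
grep -nE '\\[[:space:]]+$' finetune.sh
```

If this prints any line numbers, remove the trailing whitespace there. Separately, if the script under `RWKV-v4neo/` does not define the `--lora*` arguments at all, argparse will reject them; the LoRA flags come from LoRA-enabled forks of the trainer, so it is worth confirming your `train.py` actually accepts them.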