Open James4Ever0 opened 1 year ago
Maybe I didn't get your point, but in train.py the default config creates a new RWKV model rather than loading an existing one.
To train on your own dataset, you can start from here:
python train.py --load_model "" --wandb "" --proj_dir "out" \
--data_file "./enwik8" --data_type "utf-8" --vocab_size 0 \
--ctx_len 512 --epoch_steps 5000 --epoch_count 500 --epoch_begin 0 --epoch_save 5 \
--micro_bsz 12 --n_layer 6 --n_embd 512 --pre_ffn 0 --head_qk 0 \
--lr_init 8e-4 --lr_final 1e-5 --warmup_steps 0 --beta1 0.9 --beta2 0.99 --adam_eps 1e-8 \
--accelerator gpu --devices 1 --precision bf16 --strategy ddp_find_unused_parameters_false --grad_cp 0
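If it helps, here is a minimal sketch that assembles the same invocation from a Python dict, so that individual settings such as n_layer or n_embd are easy to change. The flag names and values are copied from the command above; the helper name build_train_command is hypothetical and not part of the repo.

```python
import shlex

# Values copied from the command above. An empty load_model means no
# checkpoint is passed, so train.py builds the model from these settings.
config = {
    "load_model": "", "wandb": "", "proj_dir": "out",
    "data_file": "./enwik8", "data_type": "utf-8", "vocab_size": 0,
    "ctx_len": 512, "epoch_steps": 5000, "epoch_count": 500,
    "epoch_begin": 0, "epoch_save": 5,
    "micro_bsz": 12, "n_layer": 6, "n_embd": 512, "pre_ffn": 0, "head_qk": 0,
    "lr_init": "8e-4", "lr_final": "1e-5", "warmup_steps": 0,
    "beta1": 0.9, "beta2": 0.99, "adam_eps": "1e-8",
    "accelerator": "gpu", "devices": 1, "precision": "bf16",
    "strategy": "ddp_find_unused_parameters_false", "grad_cp": 0,
}


def build_train_command(cfg, script="train.py"):
    """Hypothetical helper: turn a config dict into an argv list."""
    argv = ["python", script]
    for name, value in cfg.items():
        argv += [f"--{name}", str(value)]
    return argv


cmd = build_train_command(config)
print(shlex.join(cmd))  # shell-quoted form, for inspection
# To launch it: import subprocess; subprocess.run(cmd, check=True)
```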
I saw some code under RWKV-LM/RWKV-v4neo/src/model.py which requires CUDA to create an RWKV model.
I want to change that code, replacing the first embedding layer with a linear layer to fit my needs.
The rwkv.model.RWKV class only allows me to load from existing model weights. I want to know where or how I can create a new RWKV model from a config rather than from existing weights, and how to change the first layer of the model.
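For the second part of the question, a generic PyTorch sketch of swapping an embedding layer for a linear layer may help. It assumes the model keeps its first layer in an attribute called emb of type nn.Embedding; the TinyModel class and the swap_embedding_for_linear helper are hypothetical stand-ins, not the actual RWKV code, and the RWKV blocks are left out.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for a model whose first layer is `emb`."""

    def __init__(self, vocab_size: int, n_embd: int):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, n_embd)  # token ids -> vectors
        self.ln_out = nn.LayerNorm(n_embd)
        self.head = nn.Linear(n_embd, vocab_size, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # With nn.Embedding: x is (batch, seq) of int64 token ids.
        # After the swap:    x is (batch, seq, in_features) of floats.
        x = self.emb(x)
        # ... a real model would run its blocks here ...
        return self.head(self.ln_out(x))


def swap_embedding_for_linear(model: nn.Module, in_features: int) -> nn.Module:
    """Replace `model.emb` with a linear projection to the same width."""
    n_embd = model.emb.embedding_dim
    model.emb = nn.Linear(in_features, n_embd, bias=False)
    return model


# Built from config values only; no checkpoint is loaded.
model = TinyModel(vocab_size=256, n_embd=512)
token_logits = model(torch.randint(0, 256, (2, 16)))

# Swap the first layer so the model accepts 40-dimensional feature vectors.
model = swap_embedding_for_linear(model, in_features=40)
feature_logits = model(torch.randn(2, 16, 40))
```

In the real model.py, the forward pass and any name-based weight initialisation may assume integer token ids of shape (batch, seq), so those parts would probably need adjusting as well; I have not verified this against the current source.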