Closed tiendung closed 1 year ago
I suspect the original RWKV also needs it, but I have no direct evidence. I only tried turning LoRA off and gradient checkpointing on in this repo, and it doesn't work. Since this repo with LoRA off should be identical to the original, that's why I suspect so. I may look into it further later.
Nice find, thank you for implementing LoRA for RWKV :)
https://github.com/Blealtan/RWKV-LM-LoRA/blob/df5689bc88fc2f3334fbbc0117369817b0558b2b/RWKV-v4neo/train.py#L260
Studying RWKV-LoRA, I found that if `args.grad_cp == 1`, then `RWKV_JIT_ON` should be set to `0`. I would like to ask whether this applies only to LoRA, or whether the original RWKV needs it as well? Thanks.
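For context, here is a minimal sketch of the relationship being described, assuming `RWKV_JIT_ON` is read as an environment variable before the model modules are constructed (as in the RWKV-v4neo training script); this is not the repo's actual code, just an illustration of the workaround:

```python
import os
import argparse

# Hypothetical sketch: force RWKV_JIT_ON=0 whenever gradient checkpointing is
# requested, on the assumption that torch.utils.checkpoint conflicts with the
# TorchScript-compiled (JIT) RWKV blocks.
parser = argparse.ArgumentParser()
parser.add_argument("--grad_cp", type=int, default=0)  # 1 = enable gradient checkpointing
args = parser.parse_args()

if args.grad_cp == 1:
    # Must be set before the RWKV model module is imported/built,
    # so the blocks are created as plain nn.Module instead of ScriptModule.
    os.environ["RWKV_JIT_ON"] = "0"
```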