RWKV is an RNN with transformer-level LLM performance. It can be trained directly like a GPT (parallelizable), combining the best of RNNs and transformers: great performance, fast inference, low VRAM usage, fast training, "infinite" ctx_len, and free sentence embeddings.
Fix lora training for time-related parameters (Reopened PR) #9
Issue

The issue with the original code is that it only checks whether a module name contains `".time_"` when the `enable_time_finetune` option is enabled. However, in RWKV-v4neo, the only module whose name contains `".time_"` is `time_shift` in `RWKV_TimeMix`, which is not trainable because it is an `nn.ZeroPad2d`.

As a result, time-related trainable `nn.Parameter`s such as `time_first` in `RWKV_TimeMix` and `time_mix_k` in `RWKV_ChannelMix` will never be set to `requires_grad = True`, since they are parameters, not modules.
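A minimal sketch of why the module-name check misses these parameters (the `TimeMix` class below is a hypothetical, simplified stand-in for `RWKV_TimeMix`, not the actual RWKV code):

```python
import torch
import torch.nn as nn

class TimeMix(nn.Module):
    """Simplified stand-in: time_shift is a module,
    but time_first / time_mix_k are plain nn.Parameters."""
    def __init__(self, dim=4):
        super().__init__()
        self.time_shift = nn.ZeroPad2d((0, 0, 1, -1))  # no trainable weights
        self.time_first = nn.Parameter(torch.zeros(dim))
        self.time_mix_k = nn.Parameter(torch.ones(1, 1, dim))

model = nn.Sequential(TimeMix())

# Original-style check: walk modules and look for ".time_" in the module name.
enabled = [name for name, _ in model.named_modules() if ".time_" in name]
# Only "0.time_shift" matches -- and ZeroPad2d has no parameters,
# so nothing time-related would actually get requires_grad = True.
```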
Fix

To address this, this pull request also checks parameter names from `module.named_parameters()` when the `enable_time_finetune` option is enabled. This ensures that gradients are enabled for the time-related parameters.
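The fixed pattern can be sketched as follows (again using a hypothetical, simplified `TimeMix` rather than the real RWKV module):

```python
import torch
import torch.nn as nn

class TimeMix(nn.Module):
    def __init__(self, dim=4):
        super().__init__()
        self.time_shift = nn.ZeroPad2d((0, 0, 1, -1))  # module, no weights
        self.time_first = nn.Parameter(torch.zeros(dim))
        self.time_mix_k = nn.Parameter(torch.ones(1, 1, dim))

model = nn.Sequential(TimeMix())

# LoRA-style fine-tuning freezes everything first...
for p in model.parameters():
    p.requires_grad = False

# ...then re-enables the selected parameters by *parameter* name,
# which catches time_first / time_mix_k that the module walk missed.
enable_time_finetune = True
if enable_time_finetune:
    for pname, param in model.named_parameters():
        if ".time_" in pname:
            param.requires_grad = True
```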