OpenNLPLab / lightning-attention

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
MIT License

The methods for saving the Lightning-Attention model #10

Closed wsleepybear closed 6 months ago

wsleepybear commented 6 months ago

I am using torch.save to save a model that uses the lightning_attn_func function. It seems that the parameters related to lightning_attn_func are not saved: when I reload the model, the results are inconsistent with those I observed during training. Is my save method incorrect, or is there some other reason these parameters cannot be saved?
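For illustration, a minimal sketch of this kind of setup, assuming the usage shown in this repo's README (`lightning_attn_func` imported from `lightning_attn.ops`, slopes built with `_build_slope_tensor`, and a CUDA device available); the class name, dimensions, and file names below are hypothetical:

```python
import torch
import torch.nn as nn

from lightning_attn.ops import lightning_attn_func        # import paths assumed from the README
from lightning_attn.utils import _build_slope_tensor

class AttnBlock(nn.Module):
    # Hypothetical wrapper: the learnable parameters are the q/k/v/out projections;
    # lightning_attn_func itself is just called inside forward().
    def __init__(self, dim=1024, heads=8):
        super().__init__()
        self.heads = heads
        self.q, self.k, self.v, self.out = (nn.Linear(dim, dim) for _ in range(4))
        # per-head decay slopes, kept as a buffer so they travel with state_dict
        self.register_buffer("slopes", _build_slope_tensor(heads))

    def forward(self, x):
        b, n, d = x.shape
        h = self.heads
        # reshape to (b, h, n, d_head), the layout lightning_attn_func expects
        q, k, v = (
            proj(x).reshape(b, n, h, d // h).transpose(1, 2)
            for proj in (self.q, self.k, self.v)
        )
        o = lightning_attn_func(q, k, v, self.slopes.float())
        return self.out(o.transpose(1, 2).reshape(b, n, d))

model = AttnBlock().cuda().to(torch.bfloat16)

# Saving the whole module object, as in the question:
torch.save(model, "model.pt")

# The usual alternative: save only the weights and load them into a fresh instance.
torch.save(model.state_dict(), "model_state.pt")
restored = AttnBlock().cuda().to(torch.bfloat16)
restored.load_state_dict(torch.load("model_state.pt"))
```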

Doraemonzzz commented 6 months ago

Hello, did you solve it? lightning_attn_func is a pure function and does not have any parameters.
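A minimal sketch to illustrate this point, assuming the import path from the repo's README: because `lightning_attn_func` is a pure function, a module that only calls it has an empty `state_dict()`; all trainable parameters live in the surrounding layers, so it is those layers (and their save/load handling) that determine whether results are reproduced.

```python
import torch.nn as nn

from lightning_attn.ops import lightning_attn_func  # import path assumed from the README

class PureAttn(nn.Module):
    """Wraps only the attention call: no nn.Parameter, no buffers."""
    def forward(self, q, k, v, slopes):
        return lightning_attn_func(q, k, v, slopes)

print(PureAttn().state_dict())  # OrderedDict() -- there is nothing here to save or load
# Any train/reload mismatch therefore has to come from the surrounding layers
# (projections, norms, etc.) or from non-determinism, not from lightning_attn_func itself.
```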