kyegomez / LongNet

Implementation of plug-and-play attention from "LongNet: Scaling Transformers to 1,000,000,000 Tokens"
https://discord.gg/qUtxnK2NMf
Apache License 2.0
663 stars 63 forks

Can LongNet be used for fine-tuning large language models? #17

Closed mahuixian closed 8 months ago

mahuixian commented 10 months ago

Can LongNet be used to fine-tune a large language model that has already been trained?

kyegomez commented 8 months ago

@mahuixian Yes. Add LongNet's masking strategy to the existing model's attention!
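
To make the suggestion above concrete, here is a minimal sketch of what such a mask could look like. It is not code from this repository. It assumes the "masking strategy" means the dilated sparsity pattern from the LongNet paper: the sequence is split into segments, and within each segment only every `dilation`-th position is kept. The function names `dilated_mask` and `combined_mask`, and the choice to take the union of several patterns, are hypothetical simplifications (the paper mixes the outputs of separate attention passes rather than merging masks).

```python
def dilated_mask(seq_len, segment_len, dilation, offset=0, causal=True):
    """Return mask[i][j] == True where query i may attend to key j.

    Hypothetical helper: a dense boolean version of one
    (segment_len, dilation) sparsity pattern.
    """
    mask = [[False] * seq_len for _ in range(seq_len)]
    for i in range(seq_len):
        for j in range(seq_len):
            same_segment = i // segment_len == j // segment_len
            # A position is kept if its index inside the segment
            # falls on the dilation grid.
            i_kept = (i % segment_len) % dilation == offset
            j_kept = (j % segment_len) % dilation == offset
            allowed = j <= i or not causal
            if same_segment and i_kept and j_kept and allowed:
                mask[i][j] = True
    return mask


def combined_mask(seq_len, configs, causal=True):
    """Union of several (segment_len, dilation) patterns (an assumption)."""
    out = [[False] * seq_len for _ in range(seq_len)]
    for segment_len, dilation in configs:
        m = dilated_mask(seq_len, segment_len, dilation, causal=causal)
        for i in range(seq_len):
            for j in range(seq_len):
                out[i][j] = out[i][j] or m[i][j]
    return out


# Short dense segments for local context, plus one long dilated
# segment for sparse long-range context.
mask = combined_mask(8, [(4, 1), (8, 2)])
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

In PyTorch, a mask like this could be converted to a boolean tensor and passed as `attn_mask` to `torch.nn.functional.scaled_dot_product_attention`, where `True` means the position takes part in attention. Check the convention of whatever attention module the existing model uses, since some (for example `nn.MultiheadAttention`) treat `True` as "masked out". Note that a dense mask only reproduces the sparsity pattern; the attention computation itself is still quadratic in sequence length.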