abertsch72 / unlimiformer

Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
MIT License

Can unlimiformer work with common fine-tuning methods? #23

Open mrlzh opened 1 year ago

mrlzh commented 1 year ago

I set max_source_length to 10240 and trained T5 models, but it ran out of CUDA memory. I would like to know if Unlimiformer can run together with fine-tuning methods such as LoRA.

abertsch72 commented 11 months ago

Hi @mrlzh , thanks for your interest!

We haven't tried using Unlimiformer with LoRA, but there isn't a theoretical reason that they wouldn't work together. If you try it, please let us know how it goes!
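For anyone who wants to experiment with this combination, below is a minimal sketch of one way to wire it up. It assumes the PEFT library for LoRA and the `Unlimiformer.convert_model` entry point shown in this repo's usage example; the wrap-then-convert ordering and the conversion call's exact arguments are untested assumptions (check `usage.py` / `run.py` for the current interface), not a confirmed recipe.

```python
# Hypothetical sketch: attach LoRA adapters (via PEFT) to a T5 model,
# then convert it with Unlimiformer. Untested -- the ordering and the
# convert_model arguments are assumptions based on the repo's example.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import LoraConfig, get_peft_model, TaskType

from unlimiformer import Unlimiformer  # from this repo

model_name = "t5-base"  # any T5 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Attach LoRA adapters to the attention projections. Only these small
# low-rank matrices receive gradients, which is what cuts training memory.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5's query/value projection module names
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

# Convert the adapted model so cross-attention retrieves from a k-NN
# index over all encoded input tokens instead of a truncated window.
model = Unlimiformer.convert_model(model)  # assumed entry point; see usage.py
```

Note that LoRA reduces optimizer and gradient memory for the model weights, but the activation memory from encoding a 10240-token input is separate, so the two techniques address different parts of the OOM problem and may both be needed.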