abertsch72 / unlimiformer

Public repo for the NeurIPS 2023 paper "Unlimiformer: Long-Range Transformers with Unlimited Length Input"
MIT License

Set max_size to 128 but use 512 tokens #43

Closed · adivoj closed this issue 11 months ago

adivoj commented 11 months ago

Hi, great work I must say!

I understand that entire books can be fed to the trainer with the maximum token length set to its largest value, but is it possible to set it to a lower number? My inputs are about 400 tokens long, and I'd like to speed up training by shortening them.

Thanks!

urialon commented 11 months ago

Hi @adivoj, thank you for your interest in our work and for your kind words!

I'm not sure I understand your question.

  1. When we train on books, we train on the first 16,000 tokens, but at test time we evaluate on the entire 500,000+ tokens.
  2. If your inputs are 400 tokens long, they can fit in the standard context window of all LMs. Do you need Unlimiformer for them?

Best, Uri
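For reference, capping the length of a ~400-token input is handled by the standard tokenizer rather than by Unlimiformer itself. A minimal sketch, assuming a Hugging Face seq2seq tokenizer (the `facebook/bart-base` checkpoint is illustrative, not the poster's actual model):

```python
# Sketch: truncating inputs to a smaller max length to speed up training.
# Not the Unlimiformer API; this is plain Hugging Face tokenization.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")

text = "..."  # an input of roughly 400 tokens

# truncation=True with max_length caps the encoded input at 128 tokens,
# trading away context for faster training steps.
encoded = tokenizer(text, truncation=True, max_length=128, return_tensors="pt")
print(encoded["input_ids"].shape)  # (1, <=128)
```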

adivoj commented 11 months ago

Hi,

Thanks @urialon, the thing is that I didn't quite understand the possibilities. Now it's clearer. I thought you had found a way to pack the inputs into a smaller size, which would give faster training times. No worries, Unlimiformer will certainly come in handy some other time.

Best regards, adivoj