testing very large attention windows

microsoft / torchscale

Foundation Architecture for (M)LLMs

https://aka.ms/GeneralAI

MIT License

testing very large attention windows #36

Open fredzannarbor opened 1 year ago

fredzannarbor commented 1 year ago

This is kind of a simple-minded question, but how can I see for myself that torchscale can process a huge attention window? Ideally, I'd like to run a single script or function that shows that, yes, it works, for example by summarizing a large corpus of books.