This may be a naive question, but how can I verify for myself that torchscale can process a very long attention window? Ideally, I'd like to run a single script or function that shows that, yes, it works, for example by summarizing a large corpus of books.