Initial effort to add chunkwise retention paradigm

Thanks for this.

If possible, please could you modify your changes to be in line with the rest of the code?

Most of the code for the other two paradigms lives in retention.py, with retnet.py calling into it. I would suggest structuring the chunkwise paradigm the same way, so that its core logic sits in retention.py alongside the other two.

I'd also recommend adding a test to make sure the chunkwise paradigm gives the same output as the other two paradigms (see files prefixed with "test_").
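
To illustrate the kind of check I have in mind, here is a rough standalone sketch in NumPy. It does not use this repo's classes: the function names are hypothetical, and it assumes a single head with a scalar decay `gamma`, no positional rotation, and no group norm. It only shows that the three formulations should agree numerically, including when the sequence length is not a multiple of the chunk size.

```python
import numpy as np

def decay_mask(n, gamma):
    # D[i, j] = gamma**(i - j) for i >= j, else 0 (causal decay mask).
    idx = np.arange(n)
    diff = idx[:, None] - idx[None, :]
    return np.where(diff >= 0, gamma ** np.maximum(diff, 0), 0.0)

def retention_parallel(q, k, v, gamma):
    # Whole sequence at once: (Q K^T * D) V.
    return ((q @ k.T) * decay_mask(len(q), gamma)) @ v

def retention_recurrent(q, k, v, gamma):
    # One token at a time: S_n = gamma * S_{n-1} + K_n^T V_n, out_n = Q_n S_n.
    state = np.zeros((k.shape[1], v.shape[1]))
    out = []
    for q_n, k_n, v_n in zip(q, k, v):
        state = gamma * state + np.outer(k_n, v_n)
        out.append(q_n @ state)
    return np.stack(out)

def retention_chunkwise(q, k, v, gamma, chunk_size):
    # Parallel inside each chunk, recurrent state carried across chunks.
    state = np.zeros((k.shape[1], v.shape[1]))
    out = []
    for start in range(0, len(q), chunk_size):
        qc = q[start:start + chunk_size]
        kc = k[start:start + chunk_size]
        vc = v[start:start + chunk_size]
        b = len(qc)  # the last chunk may be shorter than chunk_size
        inner = ((qc @ kc.T) * decay_mask(b, gamma)) @ vc
        # Row j of the chunk sees the carried state decayed by gamma**(j + 1).
        cross = (qc @ state) * (gamma ** np.arange(1, b + 1))[:, None]
        out.append(inner + cross)
        # Row j contributes to the next state with weight gamma**(b - 1 - j).
        zeta = gamma ** np.arange(b - 1, -1, -1)
        state = (gamma ** b) * state + kc.T @ (vc * zeta[:, None])
    return np.concatenate(out)
```

A test in the repo would do the same comparison against the actual modules, for example with `torch.allclose` and a few different chunk sizes.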

If this is implemented I'll be able to merge.

Jamie-Stirling / RetNet

Initial effort to add chunkwise retention paradigm #3