lucidrains / perceiver-ar-pytorch

Implementation of Perceiver AR, DeepMind's new long-context attention network based on the Perceiver architecture, in PyTorch
MIT License
86 stars, 4 forks

RoPE + Learnable PE vs RoPE only #8

tomasff closed this issue 1 year ago

tomasff commented 1 year ago

Thank you for providing this torch implementation of the Perceiver AR.

I was wondering whether the following should be controlled by a flag. In the original paper, Perceiver AR seems to rely only on rotary positional encodings (RoPE), rather than on a combination of RoPE and a learnable positional embedding.

https://github.com/lucidrains/perceiver-ar-pytorch/blob/685d77d152c55ef7210336566b952de7da631f68/perceiver_ar_pytorch/perceiver_ar_pytorch.py#L276
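For illustration, here is a minimal sketch of what such a flag could look like. It is not the repository's actual code: the module name `TokenAndPositionEmbedding` and the flag `use_learned_pos_emb` are hypothetical, and the rotary embeddings, which would be applied separately inside the attention layers, are left out.

```python
import torch
from torch import nn


class TokenAndPositionEmbedding(nn.Module):
    """Token embedding with an optional learned absolute positional embedding.

    Hypothetical sketch: the class name and `use_learned_pos_emb` flag are
    illustrative assumptions, not part of perceiver-ar-pytorch.
    """

    def __init__(self, num_tokens, dim, max_seq_len, use_learned_pos_emb=True):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, dim)
        # Only allocate the learned positional table when the flag is set.
        self.pos_emb = nn.Embedding(max_seq_len, dim) if use_learned_pos_emb else None

    def forward(self, x):
        # x: (batch, seq_len) tensor of integer token ids
        seq_len = x.shape[1]
        h = self.token_emb(x)
        if self.pos_emb is not None:
            positions = torch.arange(seq_len, device=x.device)
            # (seq_len, dim) broadcasts over the batch dimension.
            h = h + self.pos_emb(positions)
        # With the flag off, position information would come from RoPE alone.
        return h
```

With `use_learned_pos_emb=False`, the module returns the plain token embeddings, so the only positional signal left in the model would be the rotary encoding applied in attention.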

I'd be happy to submit a PR if that's the case.

Thanks!