microsoft / torchscale

Foundation Architecture for (M)LLMs
https://aka.ms/GeneralAI
MIT License

Query about Retentive Network's Recurrent Representation #76

Closed by gopi-erabati 9 months ago

gopi-erabati commented 9 months ago

Is the recurrent representation of the Retentive Network valid only for causal inputs, as in language tasks? If the input has no causal structure, does the recurrent representation still hold?

donglixp commented 9 months ago

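Autoregressive modeling can also be applied to image data, as in iGPT and DALL-E v1, even though images are not inherently causal. Once the image is flattened into a sequence with a fixed order (for example, raster scan), the recurrent representation applies just as it does for text.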
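
To make the point concrete, here is a minimal pure-Python sketch of single-head retention on a tiny "image" flattened in raster order. It is not the torchscale API: the function names (`retention_parallel`, `retention_recurrent`, `flatten_raster`, `max_abs_diff`), the 2x3 grid, and the decay value are all hypothetical, and the sketch leaves out the rotary position term, scaling, group normalization, and multi-head structure of the real model. Under those assumptions, it shows that the recurrent form reproduces the causally masked parallel form for any imposed ordering, whether or not the data is naturally causal.

```python
import random

def retention_parallel(Q, K, V, gamma):
    """Parallel form: O_n = sum over m <= n of (Q_n . K_m) * gamma**(n - m) * V_m."""
    d_v = len(V[0])
    out = []
    for n in range(len(Q)):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal mask: position n only sees positions m <= n
            score = sum(q * k for q, k in zip(Q[n], K[m])) * gamma ** (n - m)
            for j in range(d_v):
                row[j] += score * V[m][j]
        out.append(row)
    return out

def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + K_n^T V_n, then O_n = Q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # running state, size d_k x d_v
    out = []
    for q, k, v in zip(Q, K, V):
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] = gamma * S[i][j] + k[i] * v[j]
        out.append([sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)])
    return out

def flatten_raster(grid):
    """Impose an order on a 2D grid of patch vectors: row by row, left to right."""
    return [vec for row in grid for vec in row]

def max_abs_diff(A, B):
    """Largest elementwise difference between two lists of vectors."""
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

# Hypothetical toy input: a 2x3 grid of patches, each projected to 4 dimensions.
random.seed(0)
H, W, d = 2, 3, 4

def rand_grid():
    return [[[random.uniform(-1, 1) for _ in range(d)] for _ in range(W)]
            for _ in range(H)]

Q = flatten_raster(rand_grid())
K = flatten_raster(rand_grid())
V = flatten_raster(rand_grid())
gamma = 0.9  # assumed decay value, chosen only for illustration

out_par = retention_parallel(Q, K, V, gamma)
out_rec = retention_recurrent(Q, K, V, gamma)
print("max difference:", max_abs_diff(out_par, out_rec))
```

The two outputs agree up to floating-point rounding. The causality here comes from the chosen ordering and the mask, not from the data itself; a different flattening order would give a different, but equally valid, recurrence.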