fkodom / yet-another-retnet

A simple but robust PyTorch implementation of RetNet from "Retentive Network: A Successor to Transformer for Large Language Models" (https://arxiv.org/pdf/2307.08621.pdf)
MIT License
101 stars, 15 forks

About activation function #6

Closed: Dongyeongkim closed this issue 1 year ago

Dongyeongkim commented 1 year ago

https://github.com/fkodom/yet-another-retnet/blob/ee3979c7535b9f79a3020cb098d6b97f143bcd22/yet_another_retnet/retention.py#L16

I think this line should be F.silu rather than F.relu.
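
For context, here is a minimal sketch of how the two activations differ. It is plain Python written for illustration, not the repository's code. The helper names `relu` and `silu` are hypothetical scalar stand-ins for what `F.relu` and `F.silu` compute elementwise. SiLU with unit slope is the function commonly called swish, which is what the RetNet paper describes for its gate.

```python
import math


def relu(x: float) -> float:
    """Scalar stand-in for F.relu: max(0, x)."""
    return max(0.0, x)


def silu(x: float) -> float:
    """Scalar stand-in for F.silu (swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


# ReLU zeroes every negative input, while SiLU lets a small
# negative value through and is smooth around zero.
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  relu={relu(x):.4f}  silu={silu(x):.4f}")
```

The two functions agree closely for large positive inputs and differ mostly for negative ones, so swapping one for the other changes the gating behaviour without raising any error.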

Thanks for reading.


fkodom commented 1 year ago

@Dongyeongkim Definitely -- thanks for catching this. Feel free to open a PR, otherwise I will patch it this morning. 👍

Dongyeongkim commented 1 year ago

I have opened the PR.

Thanks for the fast reply.