Jamie-Stirling / RetNet

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
MIT License
1.14k stars, 99 forks

demo example / number of parameter control vs original code #14

Open thegodone opened 11 months ago

thegodone commented 11 months ago

Excellent. Can you write a demo example for training? Also, compared to the Microsoft code, have you checked that this implementation gives the same number of parameters for the same settings?

Jamie-Stirling commented 11 months ago

This is a good idea, thank you.

I'll look into this when I get time, hopefully tomorrow.

thegodone commented 11 months ago

Great, maybe you can reuse this Harry Potter example: https://github.com/DonRL10/RetNet/blob/main/char.ipynb

Jamie-Stirling commented 11 months ago

There's now a file example.py which outputs the number of parameters when hyperparameters are equal to those for the 1.3B model in the original paper.
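
For a rough cross-check against the paper, independent of example.py, the weight-matrix count can be estimated by hand. The sketch below is a back-of-envelope calculation, not the repository's code: the function name `estimate_retnet_params` is hypothetical, and it assumes the parameter allocation described in the paper (W_Q, W_K of shape d x d; W_V, W_G of shape d x 2d; W_O of shape 2d x d; a plain two-matrix FFN) with the 1.3B settings taken to be hidden dimension 2048, 24 layers, and FFN dimension 4096. It ignores biases, normalization weights, and embeddings unless a vocabulary size is passed in.

```python
def estimate_retnet_params(hidden_dim, num_layers, ffn_dim, vocab_size=0):
    """Rough count of weight-matrix parameters (assumed layout, see lead-in).

    Ignores biases, normalization weights, and any fixed decay constants.
    """
    d = hidden_dim
    # Retention block (assumed shapes from the paper's parameter allocation):
    # W_Q, W_K: d x d; W_V, W_G: d x 2d; W_O: 2d x d  ->  8 * d^2 in total
    retention = 2 * d * d + 2 * d * (2 * d) + (2 * d) * d
    # Feed-forward block (assumed plain two-matrix FFN): d x ffn and ffn x d
    ffn = 2 * d * ffn_dim
    # Token embedding table, only counted if a vocabulary size is given
    embedding = vocab_size * d
    return num_layers * (retention + ffn) + embedding


# Assumed 1.3B settings: hidden_dim=2048, num_layers=24, ffn_dim=4096
total = estimate_retnet_params(hidden_dim=2048, num_layers=24, ffn_dim=4096)
print(f"{total:,}")  # 1,207,959,552 before embeddings
```

Under these assumptions each layer contributes 12 * d^2 weights, so the layers alone come to about 1.21B; the remainder of the nominal 1.3B would come from embeddings and the smaller terms this estimate leaves out. A large gap between this figure and what example.py prints would point to a difference in layer layout rather than in hyperparameters.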

thegodone commented 11 months ago

Great, thanks a lot.

Could you also provide a similar Harry Potter demo? See my link: https://github.com/DonRL10/RetNet/blob/main/char.ipynb