Jamie-Stirling / RetNet

An implementation of "Retentive Network: A Successor to Transformer for Large Language Models"
MIT License
1.14k stars, 99 forks

Fixed typo #20

Open EgoVeroConsisto opened 10 months ago

EgoVeroConsisto commented 10 months ago

Title: Fix Comment Regarding Potential NaN in SimpleRetention

Description:

Hello,

While going through the SimpleRetention class, I noticed that a comment in the _get_D method appears to have its two indices the wrong way round. The comment reads:

# this results in some NaN when n is much larger than m

However, after analyzing the matrix operations, it appears that NaN values can arise when m (row index) is much larger than n (column index), specifically in the bottom-left triangle of the matrix.
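For illustration, here is a minimal sketch of the mechanism I believe is at work. It is not the repository's code: it assumes each entry is computed as gamma ** (n - m) and then multiplied by a 0/1 mask for n >= m, and it uses NumPy in float32 instead of PyTorch to stay lightweight. The helper name masked_decay is hypothetical. It works on individual (n, m) pairs, so it makes no assumption about which index runs along rows or columns.

```python
import numpy as np

def masked_decay(gamma, n, m):
    """Naive float32 evaluation of gamma ** (n - m) where n >= m, else 0.

    Hypothetical helper for illustration; assumes the mask is applied by
    multiplication after the power has already been computed.
    """
    n = np.asarray(n, dtype=np.float32)
    m = np.asarray(m, dtype=np.float32)
    with np.errstate(over="ignore", under="ignore", invalid="ignore"):
        power = np.float32(gamma) ** (n - m)  # overflows to inf when m - n is large
        mask = (n >= m).astype(np.float32)    # 0.0 where the entry should be dropped
        return power * mask                   # inf * 0.0 is NaN under IEEE 754

# n much larger than m: the power underflows toward 0 and stays finite.
print(masked_decay(0.5, 200, 0))
# m much larger than n: the power overflows to inf, and inf * 0 gives NaN.
print(masked_decay(0.5, 0, 200))
# m only slightly larger than n: the power is finite, so the mask yields 0.
print(masked_decay(0.5, 0, 10))
```

Under these assumptions, only the case where m greatly exceeds n produces NaN, because that is the only case where the unmasked power overflows before the mask zeroes it out.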

This PR corrects the comment to:

# this results in some NaN when m is much larger than n

The change is small, but it keeps the comment accurate for anyone reading the code later.