certik / fastGPT

Fast GPT-2 inference written in Fortran
MIT License
180 stars, 16 forks

Make explicit copy of arrays #60

Closed certik closed 1 year ago

certik commented 1 year ago

Make explicit copies of arrays in cases where the compiler makes a copy anyway.
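The diff is not shown in this thread, so the following is only a generic sketch of the idea, not the code changed in this PR. It assumes the typical case where a strided array section is passed to an explicit-shape dummy argument, which usually makes the compiler create a hidden contiguous temporary. The names `explicit_copy_demo`, `total`, `a`, and `row` are hypothetical.

```fortran
program explicit_copy_demo
implicit none
integer, parameter :: n = 4
real :: a(n, n), row(n)
integer :: i, j

! Fill the matrix with distinct values.
do j = 1, n
    do i = 1, n
        a(i, j) = real(i + 10*j)
    end do
end do

! Implicit copy: a(2, :) is a strided (non-contiguous) section. Passing it
! to an explicit-shape dummy typically forces a hidden temporary.
print *, total(n, a(2, :))

! Explicit copy: the same copy, but now visible in the source. The
! contiguous array `row` can then be passed without a temporary.
row = a(2, :)
print *, total(n, row)

contains

    real function total(m, x) result(s)
    integer, intent(in) :: m
    real, intent(in) :: x(m)   ! explicit-shape dummy, expects contiguous storage
    s = sum(x)
    end function total

end program explicit_copy_demo
```

Both calls print the same sum. With gfortran, the `-Warray-temporaries` option can be used to report where the compiler creates such temporaries.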

certik commented 1 year ago

For the 124M model, I am getting 0.301s with this PR and also 0.301s in main.

For the 1558M model: main 3.640s, this PR 3.639s (even 3.637s), old master 3.644s.

certik commented 1 year ago

I don't see any slowdown.