ahrefs / ocannl

OCANNL: OCaml Compiles Algorithms for Neural Networks Learning
BSD 2-Clause "Simplified" License

Study and incorporate Andrej Karpathy's `llm.c` lessons #253

Open lukstafi opened 5 months ago

lukstafi commented 5 months ago

"A few new CUDA hacker friends joined the effort and now llm.c is only 2X slower than PyTorch"

https://github.com/karpathy/llm.c

lukstafi commented 5 months ago

https://twitter.com/karpathy/status/1779354343013269929

lukstafi commented 5 months ago

llm.c has achieved parity with PyTorch in FP32: https://twitter.com/karpathy/status/1781387674978533427