google / gemma.cpp

lightweight, standalone C++ inference engine for Google's Gemma models.
Apache License 2.0

Add first version of backpropagation support. #203

Closed by szabadka 1 month ago

szabadka commented 1 month ago

This is still in progress and experimental. Currently it is implemented only for the normal Gemma MQA attention layers, and the backward pass is not parallelized yet.

Since the backward pass needs the activations from every layer, the forward pass was also reimplemented on top of a new activation data structure that keeps them all.
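To illustrate why the forward pass has to keep every layer's activations, here is a minimal sketch on a toy scalar network. It is not the code in this PR: `Activations`, `Forward`, `Backward`, and the `tanh` layers are hypothetical names and choices made only for this example.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical container (not the PR's actual type): one stored value per
// layer boundary, so the backward pass can read them back later.
struct Activations {
  std::vector<double> h;  // h[0] is the input, h[l + 1] is the output of layer l
};

// Forward pass of a toy scalar network: h[l + 1] = tanh(w[l] * h[l]).
// Every intermediate value is kept instead of being overwritten.
double Forward(const std::vector<double>& w, double x, Activations& act) {
  act.h.assign(w.size() + 1, 0.0);
  act.h[0] = x;
  for (std::size_t l = 0; l < w.size(); ++l) {
    act.h[l + 1] = std::tanh(w[l] * act.h[l]);
  }
  return act.h.back();
}

// Backward pass for the loss L = final output. Walks the layers in reverse,
// reads the stored activations, and returns dL/dw for each layer.
std::vector<double> Backward(const std::vector<double>& w,
                             const Activations& act) {
  assert(act.h.size() == w.size() + 1);  // Forward must have run first.
  std::vector<double> grad_w(w.size(), 0.0);
  double grad_h = 1.0;  // dL/dh for the last activation
  for (std::size_t l = w.size(); l-- > 0;) {
    const double out = act.h[l + 1];
    // tanh'(z) = 1 - tanh(z)^2, evaluated from the stored output.
    const double grad_pre = grad_h * (1.0 - out * out);
    grad_w[l] = grad_pre * act.h[l];  // needs the stored input of layer l
    grad_h = grad_pre * w[l];         // propagate to the previous layer
  }
  return grad_w;
}
```

Each step of `Backward` reads both the input and the output of its layer, which is why a forward pass that reuses a single buffer across layers would not be enough here.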

szabadka commented 1 month ago

Wow! That's a lot of new code :) Consider moving the backward/forward files to a subdirectory?

Done.