Montinger / Transformer-Workbench

Playground for Transformers
42 stars 16 forks

Implementation of `LoRA-from-scratch` in Keras #1

Open inconnu11 opened 9 months ago

inconnu11 commented 9 months ago

Hi, thanks for the detailed LoRA-from-scratch tutorial, which helped me understand the underlying principles of the LoRA technique. Is there a Keras implementation of from-scratch LoRA for the RoBERTa model? I am porting replace_multihead_attention_recursion from LoraWrapperRoberta.py to Keras, but I don't know the correct and complete way to do this in Keras. I replaced the PyTorch calls with Keras APIs, but the code raised errors that I could not resolve.

Montinger commented 1 week ago

It's pretty much optimized to work with PyTorch directly. I'm not sure how Keras wraps around that or how it uses the code snippets. Did you use the PyTorch backend?
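
For anyone attempting the port, a minimal framework-agnostic sketch of the LoRA update may be a useful reference point. It is not the code from LoraWrapperRoberta.py: the class name `LoRALinear` and the parameters `rank`, `alpha`, and `seed` are assumptions made for illustration, and NumPy arrays stand in for Keras or PyTorch tensors. It only shows the arithmetic that a ported layer would need to reproduce, namely a frozen weight plus a scaled low-rank product.

```python
import numpy as np


class LoRALinear:
    """Frozen linear layer plus a low-rank update (hypothetical helper)."""

    def __init__(self, weight, bias=None, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_features, in_features = weight.shape
        self.weight = weight          # frozen pretrained weight, shape (out, in)
        self.bias = bias              # optional frozen bias, shape (out,)
        self.scaling = alpha / rank   # common LoRA scaling convention
        # A gets a small random init; B starts at zero, so the wrapped layer
        # initially produces exactly the same output as the base layer.
        self.lora_A = rng.normal(0.0, 0.02, size=(rank, in_features))
        self.lora_B = np.zeros((out_features, rank))

    def __call__(self, x):
        base = x @ self.weight.T
        if self.bias is not None:
            base = base + self.bias
        # Low-rank path: project down to `rank`, then back up to out_features.
        update = (x @ self.lora_A.T) @ self.lora_B.T
        return base + self.scaling * update

    def merged_weight(self):
        # Folding the update into the base weight gives an equivalent dense layer.
        return self.weight + self.scaling * (self.lora_B @ self.lora_A)


demo_rng = np.random.default_rng(1)
W = demo_rng.normal(size=(6, 5))
x = demo_rng.normal(size=(3, 5))
layer = LoRALinear(W, rank=2, alpha=4.0)
out_at_init = layer(x)

# Pretend training has moved B away from zero.
layer.lora_B = demo_rng.normal(size=(6, 2))
out_after = layer(x)
```

A quick sanity check when porting between frameworks is to confirm that the wrapped layer matches the base layer right after initialization, and that the merged weight reproduces the adapted output.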