When computing with LoRA adapters, there are situations where we prefer to skip the computation entirely. One is when `r == 0` (which was already addressed); another is when the scaling parameter (or, equivalently, the `lora_alpha` parameter) is zero. The latter was only addressed partially.

Here we centralize the logic for identifying which adapter names are worthwhile to compute into the `LoraLayer` class, so that all classes extending it have a consistent way of identifying which adapters to run computation for.
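A minimal sketch of what such a centralized check could look like. The class shape and method names here (`update_layer`, `_get_compute_adapters`) are illustrative assumptions, not the actual PEFT API; the point is that the `r == 0` and `scaling == 0` checks live in one place on the base class.

```python
class LoraLayer:
    """Hypothetical base class; names and structure are illustrative only."""

    def __init__(self):
        self.r = {}        # adapter name -> rank
        self.scaling = {}  # adapter name -> lora_alpha / r

    def update_layer(self, adapter_name, r, lora_alpha):
        self.r[adapter_name] = r
        # Guard against division by zero when r == 0.
        self.scaling[adapter_name] = lora_alpha / r if r > 0 else 0.0

    def _get_compute_adapters(self, active_adapters):
        """Return only the adapter names worth computing: skip adapters
        with r == 0 (no LoRA weights) or scaling == 0 (a no-op delta)."""
        return [
            name
            for name in active_adapters
            if self.r.get(name, 0) > 0 and self.scaling.get(name, 0.0) != 0.0
        ]


layer = LoraLayer()
layer.update_layer("default", 8, 16)  # scaling = 2.0 -> computed
layer.update_layer("rank0", 0, 16)    # r == 0       -> skipped
layer.update_layer("alpha0", 8, 0)    # scaling == 0 -> skipped
print(layer._get_compute_adapters(["default", "rank0", "alpha0"]))
# ['default']
```

Subclasses that extend the base layer would then call this one helper in their forward pass instead of re-implementing the skip conditions themselves.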