I am using this issue to think through the options for adding support for LoRA.

The first fundamental question is: can we do this without changing the model code itself? I.e., let's assume we are not allowed to touch `convnext.py`. How would we proceed?

Ultimately, our goal is to replace (all? some?) `tf.keras.layers.Conv2D` layers with our new `LoraConv2D` layers. We could try to do this via monkey patching.

Note: all code samples are based on the `convnext-tag` branch.
```python
import tensorflow as tf

from tfimm.architectures import convnext


class LoraConv2D(tf.keras.layers.Conv2D):
    ...


def main():
    cls, cfg = convnext.convnext_atto()

    # Monkey-patching the conv layer
    old_conv_layer = convnext.tf.keras.layers.Conv2D
    convnext.tf.keras.layers.Conv2D = LoraConv2D

    model = cls(cfg=cfg)
    model(model.dummy_inputs)

    # Reversing changes. This would become a context manager of course.
    convnext.tf.keras.layers.Conv2D = old_conv_layer

    # stem[0] is the first convolutional layer in the stem
    print(type(model.stem[0]))


if __name__ == "__main__":
    main()
```
This works, but it has some drawbacks:

- We have no layer-wise control: we either swap all `Conv2D` layers to `LoraConv2D` or none, and all swapped layers use the same parameters, i.e., we cannot use different values of `r` for queries, keys and values (in a transformer; not applicable here).
- We can't be sure that we reach all `Conv2D` layers. By default this only affects code in `convnext.py` itself; in fact, most convolutional layers in ConvNeXt are implemented as part of the MLP layers in `tfimm/layers/transformers.py`. This is surmountable, since we can patch `Conv2D` in all files belonging to `tfimm`, and implemented as a central context manager (see the sketch after this list) the complexity is manageable. It would be a problem if we relied on external code for layers/blocks, but that is not the case at the moment.
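For reference, a minimal sketch of such a context manager, following the same patch-and-restore idea as the code above (the name `patched_conv2d` is just a placeholder; extending it to patch several `tfimm` modules at once would be straightforward):

```python
import contextlib

from tfimm.architectures import convnext


@contextlib.contextmanager
def patched_conv2d(conv_cls):
    # Swap the Conv2D symbol that convnext.py resolves when creating layers
    # and restore it afterwards, even if model construction raises.
    old_conv_layer = convnext.tf.keras.layers.Conv2D
    convnext.tf.keras.layers.Conv2D = conv_cls
    try:
        yield
    finally:
        convnext.tf.keras.layers.Conv2D = old_conv_layer
```

Usage would then look like `with patched_conv2d(LoraConv2D): model = cls(cfg=cfg)`.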
We would like layer-wise control, i.e., to swap some layers but not others. We could achieve that by specifying the layers to be swapped by name.
When monkey patching, we could set `convnext.tf.keras.layers.Conv2D = conv_layer_factory`, with `conv_layer_factory` being a smart function that returns either a `Conv2D` or a `LoraConv2D` layer, depending on what is needed. Unfortunately, I don't think this function has the necessary context to assemble the full layer name: the full nested name is generated via `tf.name_scope` when the layer is built in `build()`, not when it is defined in `__init__()`. Sample factory code:
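A minimal sketch of such a factory, using `LoraConv2D` from the snippet above and matching on the `name` argument passed to the constructor (the set of target names is a placeholder); it also illustrates why this is not enough:

```python
import tensorflow as tf

# Placeholder: local names of the layers we would like to swap.
LORA_LAYER_NAMES = {"conv_dw"}


def conv_layer_factory(*args, **kwargs):
    # At this point we only see the arguments passed to the Conv2D constructor.
    # `name` is the local layer name, not the full nested name with the
    # model/stage/block prefix, which only comes into existence via
    # tf.name_scope once the layer is built.
    name = kwargs.get("name", "")
    if name in LORA_LAYER_NAMES:
        return LoraConv2D(*args, **kwargs)
    return tf.keras.layers.Conv2D(*args, **kwargs)
```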
Another way to achieve layer-wise control is model surgery. Here is some example code for inserting or swapping layers, but that code creates a new functional model; ideally, we would modify our model in place.
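For illustration, a hedged sketch of that route using `tf.keras.models.clone_model` with a `clone_function`. The layer names, helper names and weight-copying details are assumptions, it presumes a functional (not subclassed) model, and it shares the drawback above: it returns a new model rather than modifying the existing one in place.

```python
import tensorflow as tf

# Placeholder: names of the Conv2D layers we want to replace.
LAYERS_TO_SWAP = {"stem_conv"}


def _clone_layer(layer):
    if isinstance(layer, tf.keras.layers.Conv2D) and layer.name in LAYERS_TO_SWAP:
        # LoraConv2D is a Conv2D subclass, so it can be built from the same config.
        return LoraConv2D.from_config(layer.get_config())
    return layer.__class__.from_config(layer.get_config())


def swap_conv_layers(model):
    new_model = tf.keras.models.clone_model(model, clone_function=_clone_layer)
    # Copy pretrained weights layer by layer; any LoRA-specific weights keep
    # their fresh initialisation.
    for layer in model.layers:
        new_layer = new_model.get_layer(layer.name)
        if isinstance(new_layer, LoraConv2D):
            new_layer.kernel.assign(layer.kernel)
            if layer.use_bias:
                new_layer.bias.assign(layer.bias)
        else:
            new_layer.set_weights(layer.get_weights())
    return new_model
```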