larq / zoo

Reference implementations of popular Binarized Neural Networks
https://docs.larq.dev/zoo
Apache License 2.0

Explicit layer naming #61

Open timdebruin opened 5 years ago

timdebruin commented 5 years ago

Feature motivation

When using the ready-made models as parts of bigger networks, it can be necessary to get the outputs of specific layers. One example is encoder-decoder style networks with skip connections. The Keras model.get_layer API allows getting layers by name or by index. While getting layers by index works, often specific layers are needed regardless of the exact network architecture; the last layer of a block before a resolution change, for instance. Currently, switching architectures (e.g. Densenet28 -> Densenet45) changes both the layer indices and the names of the layers.
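To make the problem concrete, here is a minimal sketch with a toy pair of architectures (hypothetical, not the actual zoo models): because Keras auto-numbers layer names per session, the "same" layer ends up with a different index and a different name in each variant.

```python
import tensorflow as tf

# Two related toy architectures of different depth (stand-ins for e.g.
# Densenet28 vs Densenet45; not the real zoo models).
def build(depth):
    x = inputs = tf.keras.Input((32, 32, 3))
    for _ in range(depth):
        x = tf.keras.layers.Conv2D(8, 3, padding="same")(x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    return tf.keras.Model(inputs, x)

shallow = build(2)
deep = build(4)

# The auto-generated conv names in the two models do not overlap at all,
# because Keras keeps a session-wide counter per layer type.
shallow_convs = [l.name for l in shallow.layers
                 if isinstance(l, tf.keras.layers.Conv2D)]
deep_convs = [l.name for l in deep.layers
              if isinstance(l, tf.keras.layers.Conv2D)]
print(shallow_convs)
print(deep_convs)
```

So neither an index nor an auto-generated name survives a switch between the two architectures.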

Feature description

By passing explicit names when constructing the Keras layers, we could make sure that these specific layers always have the same predictable names, and can therefore easily be retrieved by name even when changing architectures. Retrieving them by name would be less error-prone than by index. Currently, since no explicit names are passed, the layer names are very generic (e.g. add_idx, where idx depends on the total number of Keras layers of that type constructed so far in the session).
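A minimal sketch of the proposed fix (the layer names below are made up for illustration): giving the interesting layers fixed names makes get_layer stable across depths.

```python
import tensorflow as tf

# Hypothetical family of models; only the names matter for this sketch.
def build(depth):
    x = inputs = tf.keras.Input((32, 32, 3), name="image")
    for i in range(depth):
        x = tf.keras.layers.Conv2D(
            8, 3, padding="same", name=f"block{i}_conv")(x)
    # The final feature layer gets the same explicit name at every depth.
    x = tf.keras.layers.GlobalAveragePooling2D(name="global_pool")(x)
    return tf.keras.Model(inputs, x)

# Retrieval by name now works identically for any architecture variant.
features = build(4).get_layer("global_pool").output
```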

Feature implementation

An example implementation can be seen in the keras applications code.
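Roughly, the keras-applications models do this by threading a block-name prefix through a builder function and deriving every layer name from it. A sketch of that pattern (block and suffix names here are illustrative, not copied from any particular model):

```python
import tensorflow as tf

def conv_block(x, filters, name):
    # Every layer is named after its enclosing block,
    # keras-applications style.
    x = tf.keras.layers.Conv2D(
        filters, 3, padding="same", name=name + "_conv")(x)
    x = tf.keras.layers.BatchNormalization(name=name + "_bn")(x)
    return tf.keras.layers.Activation("relu", name=name + "_relu")(x)

x = inputs = tf.keras.Input((32, 32, 3))
for stage, num_blocks in enumerate([2, 3], start=1):
    for block in range(1, num_blocks + 1):
        # Block names like "stage1_block2" can mirror the
        # notation used in the corresponding paper.
        x = conv_block(x, 8, name=f"stage{stage}_block{block}")
model = tf.keras.Model(inputs, x)
```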

Considerations

Depending on what the zoo models will be used for, this might add more bloat than it prevents. However, the code that it adds also helps in documenting the models to some extent, for instance by explicitly passing block names that are also in corresponding papers.

timdebruin commented 5 years ago

As an example of the kind of confusion this might prevent:

The zoo example code that deals with exactly this point retrieves a layer by name:

outputs=base_model.get_layer("average_pooling2d_8").output

The example code fragment does not actually work in isolation, since the network only has 3 average_pooling2d layers. It does work when following the examples one by one in the same session: this is the third example with the same network, so building the previous two models creates enough average pooling layers for the session-wide counter to reach number 8.
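Until layers have explicit names, one workaround is to list the names that actually exist instead of hard-coding a counter-dependent one. A sketch with a toy stand-in for the network (hypothetical architecture; in a fresh session its pooling layers get low-numbered auto names, never "average_pooling2d_8"):

```python
import tensorflow as tf

# Toy stand-in: three conv + average-pooling pairs.
x = inputs = tf.keras.Input((32, 32, 3))
for _ in range(3):
    x = tf.keras.layers.Conv2D(8, 3, padding="same")(x)
    x = tf.keras.layers.AveragePooling2D()(x)
base_model = tf.keras.Model(inputs, x)

# Enumerate the pooling layers the model really contains, then pick one,
# rather than guessing a session-dependent auto-generated name.
pool_names = [l.name for l in base_model.layers
              if isinstance(l, tf.keras.layers.AveragePooling2D)]
outputs = base_model.get_layer(pool_names[-1]).output
```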

jneeven commented 4 years ago

I thought this would be the standard from now on, but I see QuickNet unfortunately still has layer names like "add_3" and "quant_conv2d_5". Do we or do we not enforce this policy?

timdebruin commented 4 years ago

I am still very much in favor of this, as I keep running into cases (knowledge distillation) that need it. I had forgotten I was assigned to this issue. I am a little busy at the moment, but since I need this for KD I will make a PR at some point for the zoo models that do not have it yet.
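For context, this is roughly what the KD use case looks like: a feature extractor is built from a teacher's intermediate layers, which only stays robust across teacher variants if those layers have stable names. A sketch with a hypothetical teacher (all names made up):

```python
import tensorflow as tf

# Hypothetical teacher with explicitly named layers.
x = inputs = tf.keras.Input((32, 32, 3))
for i in range(3):
    x = tf.keras.layers.Conv2D(
        8, 3, padding="same", name=f"block{i}_conv")(x)
teacher = tf.keras.Model(inputs, x)

# Tap intermediate activations by stable name; the same two names would
# keep working even if the teacher were swapped for a deeper variant.
tap_names = ["block0_conv", "block2_conv"]
feature_extractor = tf.keras.Model(
    teacher.input,
    [teacher.get_layer(n).output for n in tap_names])

activations = feature_extractor(tf.zeros((1, 32, 32, 3)))
```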

koenhelwegen commented 4 years ago

After trying to use this consistently in my own code, I am less sure whether we should adopt explicit layer naming, because I found it made my code somewhat less readable. To summarize:

Pros:

Cons:

Personally, I think readable code strongly outweighs a readable model summary, but the building-block argument is probably decisive here. I would like to adopt a consistent policy for all of zoo. Please share any objections to the above reasoning; otherwise let's proceed with adding this for all models.

leonoverweel commented 4 years ago

Models which currently have explicit layer names (so no auto-generated names), based on the model summaries embedded in the docs: