Open timdebruin opened 5 years ago
As an example of the kind of confusion this might prevent:
The zoo example code that deals with exactly this point uses get layer by name:
outputs=base_model.get_layer("average_pooling2d_8").output
The example code fragment does not actually work in isolation since the the network only has 3 average_pooling2d
layers. It does work when following the examples one by one in the same session as this is the third example with the same network, so by creating the previous 2 there are enough average pooling layers to get to number 8.
I thought this would be the standard from now on, but I see QuickNet unfortunately still has layer names like "add_3" and "quant_conv2d_5". Do we or do we not enforce this policy?
I am still very much in favor of this, as I am still running into cases (knowledge distillation) that need this. I had forgotten I was assigned to this. I am a little busy at the moment but since I need this for KD I will make a PR at some point to the models in zoo that do not have this.
After trying to consistently use this in my own code, I am less sure if we should adopt explicit layer naming because I found it did make my code somewhat less readable. To summarize:
Pros:
Cons:
resnet_block()
etc.) and occasionally forces the use of Lambda
functions.Personally, I think readable code strongly outweighs a readable model summary, but the building-block argument is probably decisive here. I would like to adopt a consistent policy for all of zoo
. Please share any objections to the above reasoning; otherwise let's proceed with adding this for all models.
Models which currently have explicit layers names, so no auto-generated names, based on the model summaries embedded in the docs:
sota
QuickNet
QuickNetLarge
QuickNetXL
literature
BinaryAlexNet
BiRealNet
BinaryResNetE18
BinaryDenseNet28
BinaryDenseNet37
BinaryDenseNet37Dilated
BinaryDenseNet45
DoReFaNet
MeliusNet22
RealToBinaryNet
XNORNet
Feature motivation
When using the ready made models as parts of bigger networks it can be necessary to get the outputs of specific layers. One example are encoder-decoder style networks with skip connections. The keras
model.get_layer
api allows getting layers by name or by index. While getting layers by index works, often specific layers are needed regardless of the exact network architecture; the last layer of a block before a resolution change for instance. Currently, switching architectures (e.g.Densenet28
->Densenet45
) would change both the layer indices as well as the names of the layers.Feature description
By passing explicit names when constructing the keras layers we could make sure that these specific layers always have the same predictable names, and can therefore easily be retrieved by name even when changing architectures. Retrieving them by name would be less error prone than by index. Currently, since no explicit names are passed, the layer names are very general (e.g.
add_idx
whereidx
depends on the total number of keras layers of that type that have been constructed so far).Feature implementation
An example implementation can be seen in the keras applications code.
Considerations
Depending on what the zoo models will be used for, this might add more bloat than it prevents. However, the code that it adds also helps in documenting the models to some extent, for instance by explicitly passing block names that are also in corresponding papers.