NifTK / NiftyNet

[unmaintained] An open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy
http://niftynet.io
Apache License 2.0

Developing a new network and getting weights from previous layers #414

Closed · andreasstefani closed 5 years ago

andreasstefani commented 5 years ago

Hello,

I am currently developing a new network using NiftyNet and need some help.

I am trying to implement an Autofocus Layer [1] as proposed in the paper. At a certain point I need the weights (w) of a previous convolutional layer (conv1) for weight sharing: K (K=4) parallel convolutions are computed, each using those weights (w) from conv1.

Is there a way to read the weights from a convolutional layer and then create four new convolutional layers that use those weights?

EDIT: Can I assume that, given two different layers, they will share the same weights? That would solve my problem with the parallel convolutions, which need the same weights.

Thank you in advance. Andreas

[1] https://arxiv.org/pdf/1805.08403.pdf
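
For reference, the "read the weights" part of the question can be done in plain TensorFlow 1.x (the version NiftyNet builds on) through variable scopes. A minimal sketch, assuming the kernel of conv1 lives in a variable scope named conv1 with variable name w (the names and the kernel shape below are placeholders, not NiftyNet specifics):

```python
import tensorflow as tf

# Kernel created by some earlier conv layer: 3x3x3 kernel, 16 -> 32 channels.
# Scope/variable names and shape are placeholders for this sketch.
with tf.variable_scope("conv1"):
    w = tf.get_variable("w", shape=[3, 3, 3, 16, 32])

# Later, fetch the same variable by re-entering the scope with reuse=True ...
with tf.variable_scope("conv1", reuse=True):
    shared_w = tf.get_variable("w")

# ... or by filtering the global variable collection by name.
shared_w_again = [v for v in tf.global_variables()
                  if v.name.startswith("conv1/w")][0]
```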

Zach-ER commented 5 years ago

I think I see what you're trying to do:

You can instantiate a convolutional layer conv_layer and once you've done that, there's nothing stopping you from using it multiple times on different tensors. So I think the answer to your problem is to do that: please let me know if I've got it wrong somehow.
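
A minimal sketch of this reuse pattern, assuming the constructor arguments (n_output_chns, kernel_size, name) and the (tensor, is_training) call signature of niftynet.layer.convolution.ConvolutionalLayer; NiftyNet trainable layers are wrapped in tf.make_template, so a second call to the same instance should reuse the variables created by the first call:

```python
import tensorflow as tf
from niftynet.layer.convolution import ConvolutionalLayer  # assumed import path

# Two different 3-D feature maps (batch, x, y, z, channels).
tensor_a = tf.placeholder(tf.float32, [1, 32, 32, 32, 16])
tensor_b = tf.placeholder(tf.float32, [1, 32, 32, 32, 16])

# One layer instance ...
conv_layer = ConvolutionalLayer(n_output_chns=32, kernel_size=3, name='conv1')

# ... applied to two tensors: the second call reuses the variables created by
# the first call, rather than creating new ones.
out_a = conv_layer(tensor_a, is_training=True)
out_b = conv_layer(tensor_b, is_training=True)
```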

andreasstefani commented 5 years ago

Hi @Zach-ER, that will not work, because each (parallel) conv layer has a different dilation rate. Hence, I need to instantiate 4 conv layers. But these 4 conv layers need to have the same weights; otherwise, the outputs of the 4 conv layers would be wrong and would not relate to each other.

wyli commented 5 years ago

> Hi @Zach-ER, that will not work, because each (parallel) conv layer has a different dilation rate. Hence, I need to instantiate 4 conv layers. But these 4 conv layers need to have the same weights; otherwise, the outputs of the 4 conv layers would be wrong and would not relate to each other.

I don't think that's true: convolutions with different dilation rates can be implemented using the same set of conv kernels by rearranging the convs' input, e.g. https://github.com/NifTK/NiftyNet/blob/dev/niftynet/network/highres3dnet.py#L155
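
A minimal plain-TensorFlow 1.x sketch of this point (the kernel shape and input size are placeholders): one set of kernels, four dilation rates. tf.nn.convolution handles the dilation by rearranging the input internally (space_to_batch / batch_to_space around an ordinary convolution), so all four branches are computed from the single shared kernel w:

```python
import tensorflow as tf

# One 3-D feature map (batch, x, y, z, channels) and one shared 3x3x3 kernel.
x = tf.placeholder(tf.float32, [1, 64, 64, 64, 16])
w = tf.get_variable('shared_w', shape=[3, 3, 3, 16, 32])

# K = 4 parallel convolutions with dilation rates 2, 6, 10, 14,
# all using the same kernel variable `w`.
branches = [tf.nn.convolution(x, w, padding='SAME', dilation_rate=[r, r, r])
            for r in (2, 6, 10, 14)]
```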

andreasstefani commented 5 years ago

> Hi @Zach-ER, that will not work, because each (parallel) conv layer has a different dilation rate. Hence, I need to instantiate 4 conv layers. But these 4 conv layers need to have the same weights; otherwise, the outputs of the 4 conv layers would be wrong and would not relate to each other.

> I don't think that's true: convolutions with different dilation rates can be implemented using the same set of conv kernels by rearranging the convs' input, e.g. https://github.com/NifTK/NiftyNet/blob/dev/niftynet/network/highres3dnet.py#L155

Of course, you are totally right. I was interpreting the first answer in a completely different way - my fault. Your suggestions should work and fix the problem.

If I understand you correctly, instead of 4 dilated convolutions I simply create 4 input tensors rearranged for the different dilation rates (2, 6, 10, 14). Passing these 4 tensors through the single conv layer (kernel=3) one after another gives me the parallelism I need.
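
A sketch of that plan using the pattern from the linked highres3dnet code. It assumes NiftyNet's DilatedTensor context manager (niftynet.layer.dilatedcontext), which rearranges the tensor on entry and undoes the rearrangement on exit, and the ConvLayer constructor arguments shown; because the same conv instance is applied in every branch, the four branches share one set of 3x3x3 kernels:

```python
import tensorflow as tf
from niftynet.layer.convolution import ConvLayer          # assumed import path
from niftynet.layer.dilatedcontext import DilatedTensor   # assumed import path

# Spatial size 210 is chosen so it divides evenly by every dilation rate below.
x = tf.placeholder(tf.float32, [1, 210, 210, 210, 16])

# One shared convolution (kernel=3); reusing this instance shares its weights.
conv = ConvLayer(n_output_chns=32, kernel_size=3, name='autofocus_conv')

branches = []
for rate in (2, 6, 10, 14):                       # K = 4 dilation rates
    with DilatedTensor(x, dilation_factor=rate) as dilated:
        dilated.tensor = conv(dilated.tensor)     # same weights in every branch
    branches.append(dilated.tensor)
```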

wyli commented 5 years ago

Exactly, @andreasstefani. I'm closing this issue now. It would be great to have your PR of this autofocus module in NiftyNet :)