NEGU93 / cvnn

Library to help implement a complex-valued neural network (cvnn) using tensorflow as back-end
https://complex-valued-neural-networks.readthedocs.io/
MIT License

Return types for Complex Initializers #18

Closed annabelleYan closed 2 years ago

annabelleYan commented 2 years ago

Hi, I'm reading the code in cvnn.initializers and have a question about it. Do all complex initializers return real values (dtype=float32) rather than complex values (dtype=complex64)? I tried the following code and it gives me real-valued vectors or matrices. If that's the case, is there a way to initialize the kernel weight matrix or the bias with complex values?

import cvnn

# Returns a float32 tensor, not complex64
initializer = cvnn.initializers.ComplexGlorotNormal()
values = initializer(shape=(2, 2))
print(values.numpy())

# Also prints a real-valued (float32) matrix
print(cvnn.initializers.Zeros()(shape=(2, 2)))

Thanks for your help. Cheers.

annabelleYan commented 2 years ago

I think my question has been answered now that I have read the code related to the convolutional layers. Sorry to bother you.

NEGU93 commented 2 years ago

Actually, your question makes a lot of sense. It would be more logical for cvnn.initializers.ComplexGlorotNormal() to return a complex data type instead of a real one, but there is a reason I made it return a float. The reason is simple: TensorFlow optimizers raise an exception if trainable parameters (kernels, biases, weights) are complex.
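For example, something along these lines (an illustrative sketch of that claim; the exact failure point and error message depend on the TensorFlow version):

import tensorflow as tf

# A complex-valued trainable variable...
w = tf.Variable(tf.complex(tf.random.normal([2, 2]), tf.random.normal([2, 2])))
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.abs(w) ** 2)  # real-valued loss of a complex variable

grads = tape.gradient(loss, [w])
# ...is expected to make the update fail on versions where the optimizer
# only accepts float dtypes for its trainable variables.
opt.apply_gradients(zip(grads, [w]))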

My solution was, therefore, to have two variables, for example kernel_r and kernel_i, to hold the real and imaginary parts and simulate the complex operations. Sometimes I can even create an intermediate variable kernel = tf.complex(kernel_r, kernel_i) and it works (some points there for TensorFlow :triumph:).
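A minimal sketch of that pattern (the layer name and details here are hypothetical, not the actual cvnn code):

import tensorflow as tf

class ComplexDenseSketch(tf.keras.layers.Layer):
    # Hypothetical layer: two real trainable variables that together
    # represent one complex kernel, so the optimizer only ever sees floats.

    def __init__(self, units):
        super().__init__()
        self.units = units

    def build(self, input_shape):
        shape = (int(input_shape[-1]), self.units)
        # Plain GlorotNormal only keeps the sketch self-contained; as
        # explained below, a complex-aware initializer such as
        # ComplexGlorotNormal should be used instead to keep the variance right.
        init = tf.keras.initializers.GlorotNormal()
        self.kernel_r = self.add_weight(name="kernel_r", shape=shape, initializer=init)
        self.kernel_i = self.add_weight(name="kernel_i", shape=shape, initializer=init)

    def call(self, inputs):
        # Intermediate complex kernel built on the fly from the two real parts.
        kernel = tf.complex(self.kernel_r, self.kernel_i)
        return tf.matmul(tf.cast(inputs, tf.complex64), kernel)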

You can see how I init weights either here or here for example.

The question now is: why don't I use the TensorFlow initializer if I am going to initialize with a real dtype anyway? Why use my own initializer? The answer is that Xavier (Glorot) initialization has specific variance properties, and when the real-valued TensorFlow initializer is used independently for the real and the imaginary part, the resulting complex weights lose those properties. In short, even though ComplexGlorotNormal() gives a float output, it is not equivalent to tf.initializers.GlorotNormal(), and the former should be used for complex layers if you want to maintain the good properties of these initializers.
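To make the variance point concrete, here is an illustrative check (it assumes the full complex weight should keep the usual Glorot-normal variance 2 / (fan_in + fan_out); this is not the library's actual code):

import tensorflow as tf

fan_in, fan_out = 256, 128
target_var = 2.0 / (fan_in + fan_out)  # Glorot-normal variance for the whole weight

# Naive: the real-valued initializer applied to each part separately.
init = tf.keras.initializers.GlorotNormal()
re = init(shape=(fan_in, fan_out))
im = init(shape=(fan_in, fan_out))
# For a complex weight, Var(w) = Var(Re) + Var(Im), so this doubles the variance.
naive_var = tf.math.reduce_variance(re) + tf.math.reduce_variance(im)

# Adjusted: halve the per-component variance so the complex weight
# recovers the intended Glorot variance.
stddev = (target_var / 2.0) ** 0.5
re2 = tf.random.normal((fan_in, fan_out), stddev=stddev)
im2 = tf.random.normal((fan_in, fan_out), stddev=stddev)
adjusted_var = tf.math.reduce_variance(re2) + tf.math.reduce_variance(im2)

print(float(naive_var))     # roughly 2 * target_var
print(float(adjusted_var))  # roughly target_var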

annabelleYan commented 2 years ago

Thanks for your reply. You’ve helped make it clearer for me.