NEGU93 / cvnn

Library to help implement a complex-valued neural network (cvnn) using tensorflow as back-end
https://complex-valued-neural-networks.readthedocs.io/
MIT License

I'm not getting complex valued output #21

Closed DiegoLigtenberg closed 2 years ago

DiegoLigtenberg commented 2 years ago

For my thesis I want to do music source separation.

To do this, the input to my model is the complex output of the STFT. The output should also be a predicted STFT for a separate source.
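
For reference, tf.signal.stft already gives complex-valued frames, which is what I feed to the network; a minimal sketch (the frame parameters below are only illustrative, not the ones from my actual pipeline):

import tensorflow as tf

# Illustrative only: a dummy one-second signal and arbitrary frame parameters.
signal = tf.random.normal([1, 16000])                  # (batch, samples)
spec = tf.signal.stft(signal, frame_length=512, frame_step=128, fft_length=512)
print(spec.dtype)                                      # complex64: this is the network input
audio = tf.signal.inverse_stft(spec, frame_length=512, frame_step=128, fft_length=512)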

I think this complex implementation should work, but when testing I'm not able to get complex output. Could you please help me? Maybe we could even have a call, because I'm really struggling with this project.

my email is diegoligtenberg@gmail.com

NEGU93 commented 2 years ago

Hello,

Sorry for the delayed response. Can you give me an MWE, please?

It may be the activation function you are using at the end. Another option is that you are not using complex_input as the input. But I need more information rather than just supposing things.
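
For illustration, a minimal sketch of that second point, i.e. declaring the model input as complex (using the ComplexInput layer and a placeholder shape; treat it as a sketch, not your exact architecture):

import numpy as np
import tensorflow as tf
from cvnn import layers

# If the input is not declared complex, complex data may be silently treated as real.
model = tf.keras.models.Sequential()
model.add(layers.ComplexInput(input_shape=(28, 28, 1)))   # declare the complex input explicitly
model.add(layers.ComplexFlatten())
model.add(layers.ComplexDense(10, dtype=np.complex64))    # the layers themselves also need a complex dtype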

Regards,

DiegoLigtenberg commented 2 years ago

Hello, Thanks for your quick reply!

Please see attached MWE.py.

I made an MNIST example, where I cast the data into complex tensors by just copying the real component into the imaginary component. Although this does not make sense semantically, the output should still be complex (two components) instead of the single real component it currently is.

Regards, Diego Ligtenberg


NEGU93 commented 2 years ago

First of all, I am deeply sorry for my late replies; I have had a lot of work recently. I will try to answer more frequently.

On the other hand, I can't see any attached file. I am not even sure you can attach one to a GitHub issue.

DiegoLigtenberg commented 2 years ago

Please don't stress about it, I'm in no rush at all!

import tensorflow as tf
from tensorflow.keras.datasets import mnist
import numpy as np
import cvnn
from cvnn import layers
from tensorflow.keras.layers import Flatten, Conv2D, Dense, ReLU, Softmax,Activation,MaxPooling2D
from tensorflow.keras.models import Sequential
from tensorflow.keras.losses import MeanSquaredError, SparseCategoricalCrossentropy
from cvnn.losses import ComplexAverageCrossEntropy

def gpu_fix():
    '''fixes gpu issues on my pc'''
    config = tf.compat.v1.ConfigProto(gpu_options =  tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.8))
    config.gpu_options.allow_growth = True
    session = tf.compat.v1.Session(config=config)
    tf.compat.v1.keras.backend.set_session(session)

def load_mnist():
    '''load mnist  data and get it into complex shape'''
    (x_train, y_train),(x_test,y_test) = mnist.load_data()
    x_train = x_train.astype("float32")/255
    y_train = y_train.astype("float32")
    x_train = x_train[...,np.newaxis]

    x_test = x_test.astype("float32")/255
    x_test = x_test[...,np.newaxis]    

    # force the train and test dataset into complex values (even though it does not make sense for this dataset)
    x_train = tf.complex(x_train,x_train*3)
    y_train = tf.complex(y_train,y_train*2)
    return x_train,y_train,x_test,y_test

if __name__=="__main__":
    # comment this if you don't need it.
    gpu_fix() 
    # load data
    x_train,y_train,_,_ = load_mnist()    

    # building the model
    model = Sequential()
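    # NOTE: dtype=np.float32 in the cvnn layers below is what turns out to cause the
    # real-valued output (see the maintainer's reply further down: it should be np.complex64).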
    model.add(layers.ComplexConv2D(32,(3, 3), input_shape=(28, 28, 1), dtype=np.float32))
    model.add(Activation("relu"))
    model.add(layers.ComplexFlatten(input_shape=(28, 28, 1), dtype=np.float32))
    model.add(Activation("relu"))
    model.add(layers.ComplexDense(128, dtype=np.float32))
    model.add(Activation("relu"))
    model.add(layers.ComplexDense(10, activation='softmax', dtype=np.float32))
    model.compile(loss='sparse_categorical_crossentropy', optimizer=tf.keras.optimizers.Adam(0.0001),metrics=['accuracy'],)
    model.fit(x_train[:1000],y_train[:1000], epochs=10, shuffle=False)

    # predicts only the first complex value
    print("actual number:\t\t",y_train[0:1])
    result = model.predict(x_train[0:1])
    print("result is:\t\t", (result))

NEGU93 commented 2 years ago

So the error is that you are passing dtype=np.float32 to each layer, basically telling the layer to work with real values. You must use dtype=np.complex64 instead.

Also, beware that when you do that you must use activation functions that support complex input. You can see the available activation functions in this documentation.

This is also true for the loss. One option is to use, as the output activation, one of the listed functions that takes complex input but returns real output, so a standard real-valued loss still works. The other option is to use a loss function defined for complex values, such as the ones listed here.
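
For illustration, a minimal sketch of the MWE above with those changes applied (layer, activation and loss names are the ones used elsewhere in this thread; this is a sketch of the idea, not a verified drop-in replacement, and the label format expected by the complex loss should be checked against the docs):

import numpy as np
import tensorflow as tf
from cvnn import layers
from cvnn.losses import ComplexAverageCrossEntropy

model = tf.keras.models.Sequential()
model.add(layers.ComplexInput(input_shape=(28, 28, 1)))                   # complex input
model.add(layers.ComplexConv2D(32, (3, 3), activation='cart_relu',
                               dtype=np.complex64))                       # complex64, not float32
model.add(layers.ComplexFlatten())
model.add(layers.ComplexDense(128, activation='cart_relu', dtype=np.complex64))
# Option A: keep the output complex and pair it with a complex-valued loss.
model.add(layers.ComplexDense(10, dtype=np.complex64))
model.compile(loss=ComplexAverageCrossEntropy(),
              optimizer=tf.keras.optimizers.Adam(0.0001),
              metrics=['accuracy'])
# Option B (alternative): give the last layer an activation with complex input but real
# output, e.g. activation='convert_to_real_with_abs', and keep a standard real-valued loss.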

noushinha commented 2 years ago

Hello @NEGU93, first of all, thanks a lot for sharing this useful repository.

I have the very same problem with my 1D convolutional neural network. Both inputs and outputs are complex-valued. I went through the examples in the repository and tried to fix the problem by using a complex dtype. I also changed the activation functions to complex ones. However, I still get a real-valued result at prediction time. Can you help me with this? I also cannot set dtype on the Flatten layer: I receive an unknown-argument error, which means the parameter is not defined for that layer type.

dtype = tf.complex64
init = 'ComplexGlorotUniform'
model = tf.keras.models.Sequential()
model.add(complex_layers.ComplexInput(input_shape=(45, 1)))  # Always use ComplexInput at the start
model.add(complex_layers.ComplexConv1D(8, 3, activation='cart_relu', padding='same', 
                                       strides=1, dtype=dtype, kernel_initializer=init ))
model.add(complex_layers.ComplexConv1D(16, 3, activation='cart_relu', padding='same', 
                                       strides=1, dtype=dtype, kernel_initializer=init ))
model.add(complex_layers.ComplexConv1D(32, 3, activation='cart_relu', padding='same', 
                                       strides=1, dtype=dtype, kernel_initializer=init ))
model.add(complex_layers.ComplexAvgPooling1D(2, 2, padding='valid', dtype=dtype ))
model.add(complex_layers.ComplexFlatten())
model.add(complex_layers.ComplexDense(50, activation='cart_relu', dtype=dtype ))
model.add(complex_layers.ComplexDense(10, activation='convert_to_real_with_abs', dtype=dtype)) 
model.summary()
opt = optimizers.Adam(learning_rate=0.001)
model.compile(loss=ComplexMeanSquareError(), optimizer=opt, metrics=['mae', 'accuracy'])
x = tf.cast(tf.random.normal((45, 1)), tf.complex64)
y = model(x)
assert y.dtype == tf.complex64
# model.fit(trainX, trainY, validation_split=.1, epochs=100, batch_size=20, verbose=1)
# [loss, mae] = model.evaluate(testX, testy, batch_size=1, verbose=0)
# predictions = model.predict(testX, batch_size=1, verbose=1)

This generates the first item of _predictions_ as [3.3387803e-03 3.2509351e-03 2.6783077e-03 4.6574499e-04 1.5928188e-05 7.0847884e-05 4.9574948e-05 2.2670627e-04 1.1114680e-04 4.2280069e-04], i.e. real-valued. Also, accuracy is 0.7649999856948853. I tried using ComplexAccuracy as my metric but I receive an error. Any idea? When I sanity-check the model as in the documentation (the tf.random.normal lines above), I receive the following error:

Matrix size-incompatible: In[0]: [45,0], In[1]: [704,50] [Op:MatMul] Call arguments received: • inputs=tf.Tensor(shape=(45, 0), dtype=complex64)
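
Side note on the matrix-size error: it looks like the sanity-check tensor is missing the batch dimension. With input_shape=(45, 1) the model expects input of shape (batch, 45, 1), so something like the lines below should make the shapes line up (this is inferred from the traceback, not verified against the original data); the dtype assert will still fail as long as the output activation returns real values, which is what the next comment resolves.

x = tf.cast(tf.random.normal((4, 45, 1)), tf.complex64)   # 4 hypothetical samples of shape (45, 1)
y = model(x)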

noushinha commented 2 years ago

Ok, I managed to solve the problem by changing the activation function in the classification (regression) layer from convert_to_real_with_abs to cart_tanh. I hope this helps someone else. Now the values of the prediction matrix for the first item are complex.

[ 3.9182748e-03-0.00048357j  3.0350857e-03-0.00220608j
  2.6440190e-03-0.00264915j  1.4658429e-04-0.00457654j
 -1.9804116e-03-0.00454792j -5.0613447e-03-0.00162821j
 -4.2987680e-03+0.00129276j -1.9447719e-03+0.00315477j
 -2.0118156e-03+0.00370069j  7.9019577e-05+0.0034916j ]
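
For anyone hitting the same thing: convert_to_real_with_abs maps each complex output to its magnitude, so the prediction is real by design, while the cart_* activations in cvnn apply the real function to the real and imaginary parts separately, so cart_tanh keeps the output complex. The changed line, using the same names as the code above:

model.add(complex_layers.ComplexDense(10, activation='cart_tanh', dtype=dtype))   # complex in, complex out
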
NEGU93 commented 2 years ago

I am glad you solved it!