tensorflow / addons

Useful extra functionality for TensorFlow 2.x maintained by SIG-addons
Apache License 2.0

changing the state matrix spectral radius doesn't affect the ESN behaviour #2811

Closed pfaz69 closed 1 year ago

pfaz69 commented 1 year ago

System information

Describe the bug When changing the spectral radius of the ESN state matrix, a subsequent dense layer always learns the same weights, while those weights should change to reflect the changed ESN state matrix.

Code to reproduce the issue

Import the relevant modules

import numpy as np
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import Dense
from tensorflow_addons.layers import ESN
from tensorflow_addons.rnn import ESNCell
from keras.layers import RNN
from sklearn.preprocessing import MinMaxScaler
from tensorflow import random as rnd

Fix the seed

rnd.set_seed(0)

The dataset can be downloaded from https://mantas.info/wp/wp-content/uploads/simple_esn/MackeyGlass_t17.txt

data = np.loadtxt('MackeyGlass_t17.txt')

Normalize

scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(data.reshape(-1, 1))

Split Dataset in Train and Test

train, test = scaled[0:-100], scaled[-100:]

Split into input and output

train_X, train_y = train[:, :-1], train[:, -1:]
test_X, test_y = test[:, :-1], test[:, -1:]

Reshaping

train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))

Batch and epochs

batch_size = 20
epochs = 15

changing spectral_radius doesn't affect the weights of the dense layer

spectral_radius = 0.9 #Try 0.1 or any other value
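
For context, the spectral radius is normally enforced by rescaling the random recurrent matrix so that its largest absolute eigenvalue equals the requested value. A minimal NumPy sketch of that standard ESN recipe (this is illustrative, not the tensorflow_addons implementation; the helper name is hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def scale_to_spectral_radius(W, spectral_radius):
    # Rescale a square matrix so its largest |eigenvalue| equals spectral_radius.
    current = np.max(np.abs(np.linalg.eigvals(W)))
    return W * (spectral_radius / current)

W = rng.standard_normal((12, 12))
W_scaled = scale_to_spectral_radius(W, 0.9)
print(np.max(np.abs(np.linalg.eigvals(W_scaled))))  # ~0.9
```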

---------------------------------------------------------------------------

Design and run the model

model = Sequential()

# model.add(ESN(units=12, spectral_radius=spectral_radius, leaky=0.75, connectivity=0.9))  # this line works exactly like the next one
model.add(RNN(ESNCell(12, spectral_radius=spectral_radius, leaky=0.75, connectivity=0.9)))
model.add(Dense(train_y.shape[1]))
model.compile(loss='huber', optimizer='adam')

model.fit(train_X, train_y, epochs=epochs, batch_size=batch_size, validation_data=(test_X, test_y), verbose=0, shuffle=False)

Print the weights of the dense layer

print(model.layers[1].get_config(), model.layers[1].get_weights())

Other info / logs

Checking the call() method of class ESNCell (file: tensorflow_addons/rnn/esn_cell.py), it appears that the state argument is always passed as a tuple full of zeros. This way, even though the state matrix is correctly generated (self.recurrent_kernel), its contribution is always erased.
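
The effect can be seen in a schematic leaky-ESN update (this is a simplified sketch of the recurrence, not the exact tfa code): if the state entering the cell is all zeros, the recurrent kernel drops out of the computation entirely, so any rescaling of it (including a different spectral radius) cannot change the output.

```python
import numpy as np

rng = np.random.default_rng(0)
units, leaky = 4, 0.75
W_in = rng.standard_normal((1, units))   # input kernel
W = rng.standard_normal((units, units))  # recurrent (state) kernel
u = rng.standard_normal((1, 1))          # one input sample

state = np.zeros((1, units))  # the state passed in: all zeros

# schematic leaky-ESN update: x' = (1 - leaky) * x + leaky * tanh(u @ W_in + x @ W)
out = (1 - leaky) * state + leaky * np.tanh(u @ W_in + state @ W)

# With state == 0, W contributes nothing, so rescaling W changes nothing:
out_rescaled = (1 - leaky) * state + leaky * np.tanh(u @ W_in + state @ (10 * W))
print(np.allclose(out, out_rescaled))  # True
```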

The dense layer weights I get (either with spectral_radius = 0.9 or 0.1) are:

[array([[-0.4414041 ],
        [-0.24732435],
        [ 0.48309386],
        [-0.11242288],
        [-0.35881412],
        [-0.37281674],
        [ 0.13455302],
        [ 0.44686365],
        [ 0.13275969],
        [-0.01452708],
        [-0.52097714],
        [-0.12454855]], dtype=float32), array([0.55140686], dtype=float32)]

pfaz69 commented 1 year ago

This is due to this piece of code, which prescribes only one time step for the input:

#Split into input and output
train_X, train_y = train[:, :-1], train[:, -1:]
test_X, test_y = test[:, :-1], test[:, -1:]

#Reshaping
train_X = train_X.reshape((train_X.shape[0], 1, train_X.shape[1]))
test_X = test_X.reshape((test_X.shape[0], 1, test_X.shape[1]))

so it should be replaced by:

def get_XY(dat, time_steps):
    # Indices of target array
    Y_ind = np.arange(time_steps, len(dat), time_steps)
    Y = dat[Y_ind]
    # Prepare X
    rows_x = len(Y)
    X = dat[range(time_steps*rows_x)]
    X = np.reshape(X, (rows_x, time_steps, 1))    
    return X, Y

time_steps = 12
train_X, train_y = get_XY(train, time_steps)
test_X, test_y = get_XY(test, time_steps)  # note: the test split, not train
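
A quick sanity check of the reshaping above, using a stand-in series (the synthetic data here is only for illustration): with 100 samples and time_steps = 12, get_XY yields 8 windows of 12 steps each.

```python
import numpy as np

def get_XY(dat, time_steps):
    # Indices of target array
    Y_ind = np.arange(time_steps, len(dat), time_steps)
    Y = dat[Y_ind]
    # Prepare X
    rows_x = len(Y)
    X = dat[range(time_steps * rows_x)]
    X = np.reshape(X, (rows_x, time_steps, 1))
    return X, Y

dat = np.arange(100).reshape(-1, 1).astype(float)  # stand-in for the scaled series
X, Y = get_XY(dat, 12)
print(X.shape, Y.shape)  # (8, 12, 1) (8, 1)
```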