SelimaC / large-scale-sparse-neural-networks


cifar10: ValueError: could not interpret dimensions #4

Open dsmic opened 2 years ago

dsmic commented 2 years ago

With fashionmnist everything works fine, but with cifar10:

    python parallel_training.py --dataset cifar10
    Using TensorFlow backend.
    0000:00:01.177 P 0:-:- [INFO] Model creation time: 0.15073871612548828
    0000:00:01.178 S 0:-:- [INFO] beginning epoch 1
    Traceback (most recent call last):
      File "parallel_training.py", line 235, in <module>
        histories = manager.process.train()
      File "/home/detlef/tmp/large-scale-sparse-neural-networks/wasap_sgd/mpi/single_process.py", line 36, in train
        self.update = self.model.train_on_batch(x=batch[0], y=batch[1])
      File "/home/detlef/tmp/large-scale-sparse-neural-networks/wasap_sgd/train/model.py", line 330, in train_on_batch
        z, a, masks = self._feed_forward(x, True)
      File "/home/detlef/tmp/large-scale-sparse-neural-networks/wasap_sgd/train/model.py", line 221, in _feed_forward
        z[i + 1] = a[i] @ self.w[i] + self.b[i]
      File "/home/detlef/anaconda3/envs/tf/lib/python3.7/site-packages/scipy/sparse/base.py", line 566, in __rmatmul__
        return self.__rmul__(other)
      File "/home/detlef/anaconda3/envs/tf/lib/python3.7/site-packages/scipy/sparse/base.py", line 550, in __rmul__
        return (self.transpose() * tr).transpose()
      File "/home/detlef/anaconda3/envs/tf/lib/python3.7/site-packages/scipy/sparse/base.py", line 526, in __mul__
        raise ValueError('could not interpret dimensions')
    ValueError: could not interpret dimensions

Any help would be great :)
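For context on what goes wrong: keras's cifar10 loader returns batches of shape (n, 32, 32, 3), while scipy's sparse matrices only know how to multiply 1-D or 2-D dense operands, so the un-flattened batch falls through to the catch-all ValueError in __mul__. A minimal sketch that reproduces this with a scipy spmatrix like the one in the traceback (the shapes below are illustrative, not the repo's actual layer sizes):

    import numpy as np
    import scipy.sparse as sp

    a = np.random.rand(8, 32, 32, 3).astype('float32')          # un-flattened cifar10 batch
    w = sp.random(32 * 32 * 3, 100, density=0.1, format='csr')  # sparse layer weights

    try:
        z = a @ w  # 4-D dense @ 2-D sparse: scipy cannot interpret the dimensions
    except ValueError as e:
        print(e)   # could not interpret dimensions

    z = a.reshape(a.shape[0], -1) @ w  # works once the batch is flattened to 2-D
    print(z.shape)                     # (8, 100)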

dsmic commented 2 years ago

Just in case somebody else runs into this problem: the datasets have to be reshaped so that each sample is a flat vector.

import numpy as np
from keras.datasets import mnist
from keras.utils import np_utils


def load_mnist_data(n_training_samples, n_testing_samples):

    # read MNIST data
    (x, y), (x_test, y_test) = mnist.load_data()

    y = np_utils.to_categorical(y, 10)
    y_test = np_utils.to_categorical(y_test, 10)
    x = x.astype('float32')
    x_test = x_test.astype('float32')

    index_train = np.arange(x.shape[0])
    np.random.shuffle(index_train)

    index_test = np.arange(x_test.shape[0])
    np.random.shuffle(index_test)

    x_train = x[index_train[0:n_training_samples], :]
    y_train = y[index_train[0:n_training_samples], :]

    x_test = x_test[index_test[0:n_testing_samples], :]
    y_test = y_test[index_test[0:n_testing_samples], :]

    # Normalize data
    x_train = x_train / 255.
    x_test = x_test / 255.
    y_test = y_test.astype('float32')
    y_train = y_train.astype('float32')

    # the fix: flatten each 28x28 image to a 784-dim vector so the
    # 2-D sparse weight matrices can multiply the batch
    x_train = x_train.reshape(-1, 28 * 28)
    x_test = x_test.reshape(-1, 28 * 28)

    return x_train, y_train, x_test, y_test
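For cifar10 the analogous reshape flattens each image to 32 * 32 * 3 = 3072 features. A sketch of the corresponding loader, assuming the repo's cifar10 loader follows the same pattern as the mnist one above (the function name and signature here mirror it; I haven't checked them against the actual file):

    import numpy as np
    from keras.datasets import cifar10
    from keras.utils import np_utils


    def load_cifar10_data(n_training_samples, n_testing_samples):
        # read CIFAR10 data: x has shape (50000, 32, 32, 3)
        (x, y), (x_test, y_test) = cifar10.load_data()

        y = np_utils.to_categorical(y, 10)
        y_test = np_utils.to_categorical(y_test, 10)

        # subsample without replacement
        index_train = np.random.permutation(x.shape[0])[:n_training_samples]
        index_test = np.random.permutation(x_test.shape[0])[:n_testing_samples]

        x_train = x[index_train].astype('float32') / 255.
        y_train = y[index_train].astype('float32')
        x_test = x_test[index_test].astype('float32') / 255.
        y_test = y_test[index_test].astype('float32')

        # the fix: flatten each image to a 3072-dim vector
        x_train = x_train.reshape(-1, 32 * 32 * 3)
        x_test = x_test.reshape(-1, 32 * 32 * 3)

        return x_train, y_train, x_test, y_test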

Alternatively, the model itself can be fixed to flatten the activations, which seems to make cifar10 run as well:

        for i in range(1, self.n_layers):
            # flatten activations to (batch_size, features) before the
            # sparse matmul; a no-op when the input is already 2-D
            a[i] = a[i].reshape(a[i].shape[0], -1)
            z[i + 1] = a[i] @ self.w[i] + self.b[i]
            a[i + 1] = self.activations[i + 1].activation(z[i + 1])
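Of the two fixes, reshaping in the loader does the flattening once up front, while reshaping in _feed_forward leaves the loaders untouched and handles any input shape at the cost of a per-batch reshape (usually just a view on a contiguous array, so effectively free). Either is consistent with the rest of the pipeline, since the sparse weight matrices are strictly 2-D.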