keras-team / keras-preprocessing

Utilities for working with image data, text data, and sequence data.

The "transpose trick" for quick zca approximation solving issue #55 #292

Open Nestak2 opened 4 years ago

Nestak2 commented 4 years ago

I implemented the more or less well-known "transpose trick" as a fix for issue #55, my code is here. Is it alright? Do I have to include changes not only to keras-preprocessing/keras_preprocessing/image/image_data_generator.py, but also to keras/keras/preprocessing/image.py?

Summary

Included a variable zca_rotated that lets the user choose whether the SVD in ZCA whitening is computed on the rows = features (False) or on the columns = examples (True); picking the smaller of the two saves time. When False is selected, zca_whitening is calculated in the usual way. In some cases this solves the problem of the SVD computation never completing when the covariance matrix is too large, by reducing the size of the matrix that is decomposed. The idea is explained with examples in issue #55 on GitHub. The method is not an approximation: it gives exactly the same result as the current method, with the difference of saving time when the number of images is smaller than the number of features. I only needed to change a few lines of code, but for them to work I am not sure whether I also have to change keras/keras/preprocessing/image.py, e.g. by defining the new variable zca_rotated there.
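To illustrate the size argument, here is a minimal sketch (shapes taken from the CIFAR-10 example below; the array contents are placeholders, only the shapes matter):

```python
import numpy as np

# 1000 CIFAR-10 images, each flattened to 32*32*3 = 3072 features.
m, n = 1000, 32 * 32 * 3
flat_x = np.zeros((m, n))  # placeholder data

cov_shape = np.dot(flat_x.T, flat_x).shape   # feature covariance: n x n
gram_shape = np.dot(flat_x, flat_x.T).shape  # example Gram matrix: m x m
```

The standard route decomposes a 3072 x 3072 matrix, the transpose trick only a 1000 x 1000 one, which is where the speedup comes from.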

Related Issues

55, #8706 in keras-team/keras

PR Overview

from keras.datasets import cifar10
import numpy as np
from scipy import linalg

(X_train, y_train), (X_test, y_test) = cifar10.load_data()
X = X_train[:1000]
flat_x = np.reshape(X, (X.shape[0], X.shape[1] * X.shape[2] * X.shape[3]))

# line below lets you choose if you want to calculate the standard zca, like in keras (=False)
# or if you want to treat the input data X as rotated to simplify the calculation if X.shape[0]=m < X.shape[1]=n
zca_rotated = True

# normalize x:
flat_x = flat_x / 255.
flat_x = flat_x - flat_x.mean(axis=0)

# CHANGES HAPPEN BELOW.
# if m>n execute the svd as usual
if not zca_rotated:
    sigma = np.dot(flat_x.T, flat_x) / flat_x.shape[0]
    u, s, _ = linalg.svd(sigma)
# and if m<n do the transpose trick
else:
    sigma = np.dot(flat_x, flat_x.T) / flat_x.shape[0]
    u, s, _ = linalg.svd(sigma)
    u = np.dot(flat_x.T, u) / np.sqrt(s*flat_x.shape[0])

s_inv = 1. / np.sqrt(s[np.newaxis] + 0.1) # the 0.1 is the epsilon value for zca
principal_components = (u * s_inv).dot(u.T)
whitex = np.dot(flat_x, principal_components)
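For what it's worth, the claimed exact equivalence of the two routes can be checked numerically on a small random matrix (shapes and the eps value are arbitrary; centering is skipped here so the small Gram matrix stays full rank, which keeps the demo numerically clean):

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
m, n = 8, 12                            # fewer examples than features
flat_x = rng.standard_normal((m, n))
eps = 0.1

# Route 1: SVD of the n x n feature covariance (the current path).
sigma = np.dot(flat_x.T, flat_x) / m
u1, s1, _ = linalg.svd(sigma)
w1 = np.dot(flat_x, (u1 * (1. / np.sqrt(s1[np.newaxis] + eps))).dot(u1.T))

# Route 2: SVD of the m x m Gram matrix, mapped back (transpose trick).
gram = np.dot(flat_x, flat_x.T) / m
u2, s2, _ = linalg.svd(gram)
u2 = np.dot(flat_x.T, u2) / np.sqrt(s2 * m)
w2 = np.dot(flat_x, (u2 * (1. / np.sqrt(s2[np.newaxis] + eps))).dot(u2.T))
```

The extra near-zero eigen-directions that route 1 carries lie in the null space of flat_x, so they contribute nothing to the whitened data, and w1 and w2 agree to numerical precision.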
Dref360 commented 4 years ago

Hello,

With ZCA whitening not being as popular as it was in the early days of deep learning, I don't think we should include this in Keras-prepro.

I'm open to discussion :)

Nestak2 commented 4 years ago

Thanks for the answer, @Dref360 ! Is there any downside to including this ZCA change in Keras-prepro? Maybe ZCA isn't that popular with Keras users because the current calculation takes very long for large images (just speculating here)? Cheers

Nestak2 commented 4 years ago

Hi @Dref360 , I found a way to make my method produce not just an approximation but exactly the same result as the current ZCA transformation, while still running faster. I can also make it run without an additional user-specified variable: with an if-clause the code can determine which SVD approach would be quicker and calculate accordingly. It would look like this, just 5 new lines:

# CHANGES HAPPEN BELOW.
# if m>n execute the svd as usual
if flat_x.shape[0] >= flat_x.shape[1]:
    sigma = np.dot(flat_x.T, flat_x) / flat_x.shape[0]
    u, s, _ = linalg.svd(sigma)
# and if m<n do the transpose trick
if flat_x.shape[0] < flat_x.shape[1]:
    sigma = np.dot(flat_x, flat_x.T) / flat_x.shape[0]
    u, s, _ = linalg.svd(sigma)
    u = np.dot(flat_x.T, u) / np.sqrt(s*flat_x.shape[0])

Let me know if you think it's valuable and I will make the changes to the keras pre-pro file and submit them.

tranngocphu commented 1 year ago

Hi, @Nestak2 Can you please explain why we need to divide by np.sqrt(s*flat_x.shape[0]) in the last line of the above code? Thank you.

Nestak2 commented 1 year ago

@tranngocphu Hi, if I remember correctly, np.sqrt(s*flat_x.shape[0]) is a normalization factor for the vectors u and is connected to their eigenvalues. So you might be able to just leave it out. You can find versions of the factor without the term flat_x.shape[0], i.e. only np.sqrt(s), if sigma is calculated beforehand as sigma = np.dot(flat_x, flat_x.T). I wrote a little Medium article a few years ago about the transpose trick; it doesn't explain this factor in more detail, but it shows the reader the mathematical justification for the transpose trick and gives a bit more background. If you want to read it, it is here
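The role of the factor can be checked numerically: since ||flat_x.T @ v_k||^2 = m * s_k for each eigenvector v_k of the Gram matrix, dividing by np.sqrt(s * m) is exactly what rescales the mapped vectors to unit length, and the results are eigenvectors of the feature covariance. A small sketch with arbitrary shapes:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 6, 10
x = rng.standard_normal((m, n))

gram = np.dot(x, x.T) / m            # m x m Gram matrix
s, v = np.linalg.eigh(gram)          # eigenvalues (all > 0 here) and eigenvectors
u = np.dot(x.T, v) / np.sqrt(s * m)  # map to feature space and normalize

sigma = np.dot(x.T, x) / m           # n x n covariance (never formed in the trick)
col_norms = np.linalg.norm(u, axis=0)  # should all be 1 thanks to the factor
```

Without the division, the columns of u would have norm np.sqrt(s * m) instead of 1, and the whitening matrix built from them would be wrongly scaled.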