oinegue / scatnetgpu

Scattering Network for Python and CUDA
MIT License

Can you provide an example on CIFAR or any image dataset using scatnetgpu? #2

Open deepakanandece opened 6 years ago

deepakanandece commented 6 years ago

Hello, Can you provide an example on CIFAR or any image dataset using scatnetgpu?

deepakanandece commented 6 years ago

Also, the number of output channels is not given. Are the new parameters M and L different from the pyscatwave repo?

oinegue commented 6 years ago

Hi, thanks for your interest in this library. To get the scattering network representation of the CIFAR dataset, first load the dataset, for example from Keras (any other source is fine as long as you have your images as numpy arrays). Now say you have the training set in x_train with shape (num_samples, 3, 32, 32). This library expects channels as the last axis, so move them with x_train = np.moveaxis(x_train, 1, -1). Finally, you can get the scatnet representation with the example code provided in the README, so the full code could be:

import numpy as np
from keras.datasets import cifar10

# Load CIFAR-10 and move the channels to the last axis, as scatnetgpu expects
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = np.moveaxis(x_train, 1, -1)
x_test = np.moveaxis(x_test, 1, -1)

from scatnetgpu import ScatNet

# Build the scattering network and transform every image
sn = ScatNet(M=2, J=4, L=6)
sn_train = []
for img in x_train:
    sn_train.append(sn.transform(img))
sn_test = []
for img in x_test:
    sn_test.append(sn.transform(img))

Please note that scattering networks provide only a representation of your data. A classifier (e.g. an SVM) is still required if you want to train a model.
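For example, a minimal sketch of that last step, assuming each image's scattering output has already been flattened into a fixed-length vector (e.g. with stack_scat_output, discussed below) and collected into features_train / features_test; the scikit-learn classifier here is just one possible choice, not part of scatnetgpu:

import numpy as np
from sklearn.svm import LinearSVC

# features_train / features_test: 2-D arrays, one flattened scattering vector per image
# y_train / y_test come from cifar10.load_data() in the snippet above
clf = LinearSVC()
clf.fit(features_train, y_train.ravel())
print("test accuracy:", clf.score(features_test, y_test.ravel()))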

For the second question, as I said in the README, this library is compatible with the original scatnet library. I've never used pyscatwave, so I can't help you with the different output shape.

Anyway, you could look at my thesis work that used this library, maybe it could guide you a little.

deepakanandece commented 6 years ago

Thanks for the prompt reply. The transform step returns a list of lists, with several keywords as indices. Which part is the features extracted from the images? transform gives a list of length one which contains lists of (3, 3), which are further subdivided. As far as I know, none of the machine learning packages take lists as input. Can you give some documentation on which index corresponds to what?

oinegue commented 6 years ago

This particular output structure was chosen to keep compatibility with the original MATLAB library.

The first list has length 3 because the data has 3 channels; each channel is transformed independently, so you get one list per channel. At the second level you have 3 lists because scatnet gives you M+1 representations, one per scattering order. Each element at this level is a structured array that contains the transformed signals (key signal) as well as the path of applied filters (keys j and l).
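To make that nesting concrete, a quick inspection along these lines (the comments just restate the description above; the exact types and key names are whatever the library actually returns on your machine):

out = sn.transform(img)   # img: one 3-channel image, as in the CIFAR snippet above
print(len(out))           # 3 -> one entry per input channel
print(len(out[0]))        # M+1 -> one entry per scattering order (3 with M=2)
print(type(out[0][0]))    # structured entry holding 'signal' plus the filter path ('j', 'l')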

If you just want to use the scatnet representation for classification, you may want to use the stack_scat_output function, which returns all signals and all paths. Just .flatten() the returned signals and you will have a feature vector.
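As a sketch of that, continuing the CIFAR snippet above (the import path for stack_scat_output is an assumption here; check the source for where it actually lives):

from scatnetgpu import stack_scat_output   # assumed import location

features_train = []
for img in x_train:
    stacked = stack_scat_output(sn.transform(img))[0]  # first element: the stacked signals
    features_train.append(stacked.flatten())           # one flat feature vector per image
features_train = np.array(features_train)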

deepakanandece commented 6 years ago

Hello, I would like to use the scatnet output as a multi-channel image rather than flattening it, since flattening will destroy the spatial relations. Can you suggest a way to get the output as a scaled and thick (multi-channel) image?

oinegue commented 6 years ago

Just use stack_scat_output(sn.transform(img)). It returns a tuple where the first element is what you are looking for. Please look at the source code to understand the details.
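Continuing the CIFAR snippet above, that could look like the following (same assumption as before about where stack_scat_output is imported from; the variable names are only for illustration):

thick_train = []
for img in x_train:
    stacked = stack_scat_output(sn.transform(img))[0]  # multi-channel, spatially scaled image
    thick_train.append(stacked)                        # keep the spatial axes, no flattening
thick_train = np.array(thick_train)
print(thick_train.shape)  # (num_samples, ...) -- exact axes depend on the library's output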