NervanaSystems / neon

Intel® Nervana™ reference deep learning framework committed to best performance on all hardware
http://neon.nervanasys.com/docs/latest
Apache License 2.0

Object identification based on images #458

Open sushanthkapisthalam opened 6 years ago

sushanthkapisthalam commented 6 years ago

I'm working on a dataset for object identification based on images. I have 25 classes. Since I'm very new to neural networks and coding, can someone suggest which pre-processing techniques can be applied before sending the images into the convolution layer? I also want to implement SIFT features followed by TF-IDF retrieval.

This is what I have done so far:

from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, RMSprop, adam
from keras.utils import np_utils

import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import os
import theano
from PIL import Image
from numpy import *

# SKLEARN
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split

# input image dimensions
img_rows, img_cols = 200, 200

# number of channels
img_channels = 1

# Data
path1 = r'C:\Users\mails\Desktop\Frieburg\input_images'
path2 = r'C:\Users\mails\Desktop\Frieburg\images-resized'

listing = os.listdir(path1)
num_samples = size(listing)
print(num_samples)

for file in listing:
    im = Image.open(os.path.join(path1, file))
    img = im.resize((img_rows, img_cols))
    gray = img.convert('L')
    # need to do some more processing here
    gray.save(os.path.join(path2, file), "PNG")

imlist = os.listdir(path2)

# open one image to get size
im1 = array(Image.open(os.path.join(path2, imlist[0])))
m, n = im1.shape[0:2]  # get the size of the images
imnbr = len(imlist)  # get the number of images

# create matrix to store all flattened images
immatrix = array([array(Image.open(os.path.join(path2, im2))).flatten()
                  for im2 in imlist], 'f')

label = np.ones((num_samples,), dtype=int)
label[0:136] = 0
label[136:297] = 1
label[297:669] = 2
label[669:947] = 3
label[947:1128] = 4
label[1128:1435] = 5
label[1435:1733] = 6
label[1733:1830] = 7
label[1830:1940] = 8
label[1940:2049] = 9
label[2049:2234] = 10
label[2234:2475] = 11
label[2475:2777] = 12
label[2777:2939] = 13
label[2939:3107] = 14
label[3107:3250] = 15
label[3250:3422] = 16
label[3422:3572] = 17
label[3572:3749] = 18
label[3749:3956] = 19
label[3956:4074] = 20
label[4074:4357] = 21
label[4357:4528] = 222
label[4528:4685] = 23
label[4685:] = 24
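The slice assignments above could also be generated from a list of class start offsets, so that each class id is derived from its position rather than typed by hand. This is only a minimal sketch, assuming the images are listed in class order with the boundaries used above; `class_starts`, `build_labels`, and the sample count of 4800 are hypothetical and introduced just for illustration.

```python
import numpy as np

# Start index of each class block, copied from the slice boundaries above.
# (Assumption: the image list is ordered by class.)
class_starts = [0, 136, 297, 669, 947, 1128, 1435, 1733, 1830, 1940,
                2049, 2234, 2475, 2777, 2939, 3107, 3250, 3422, 3572,
                3749, 3956, 4074, 4357, 4528, 4685]


def build_labels(num_samples, starts):
    """Return an int array where sample i gets the index of the block it falls in."""
    idx = np.arange(num_samples)
    # side='right' counts how many start offsets are <= i; minus 1 gives the class id.
    return np.searchsorted(np.asarray(starts), idx, side='right') - 1


# 4800 is a made-up sample count used only for this example.
labels = build_labels(4800, class_starts)
print(labels.min(), labels.max())
```

With 25 start offsets the generated ids can only fall in the range 0 to 24, which is what `nb_classes = 25` expects.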

data, Label = shuffle(immatrix, label, random_state=2)
train_data = [data, Label]

img = immatrix[217].reshape(img_rows, img_cols)
plt.imshow(img)
plt.imshow(img, cmap='gray')
print(train_data[0].shape)
print(train_data[1].shape)

# batch_size to train
batch_size = 128
# number of output classes
nb_classes = 25
# number of epochs to train
nb_epoch = 20

# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3

(X, y) = (train_data[0], train_data[1])

# STEP 1: split X and y into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=4)

X_train = X_train.reshape(X_train.shape[0], 1, img_cols, img_rows)
X_test = X_test.reshape(X_test.shape[0], 1, img_cols, img_rows)

X_train = X_train.astype('float32')
X_test = X_test.astype('float32')

X_train /= 255
X_test /= 255

print('X_train shape:', X_train.shape)
print(X_train.shape[0], 'train samples')
print(X_test.shape[0], 'test samples')


# convert class vectors to binary class matrices
Y_train = np_utils.to_categorical(y_train, nb_classes)
Y_test = np_utils.to_categorical(y_test, nb_classes)


i = 100
plt.imshow(X_train[i, 0], interpolation='nearest')
print("label : ", Y_train[i, :])

(When I try to use cross-entropy, I get this error:)

IndexError: index 222 is out of bounds for axis 1 with size 25
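The index 222 in this message matches the value assigned in the `label[4357:4528]` line, which looks like a typo for 22: with `nb_classes = 25` the valid class ids are 0 to 24. A check along the following lines could be run before `np_utils.to_categorical` to locate such values. It is a minimal sketch using only NumPy; `find_bad_labels` is a hypothetical helper and not part of Keras.

```python
import numpy as np


def find_bad_labels(y, nb_classes):
    """Return (positions, values) of labels outside the range [0, nb_classes)."""
    y = np.asarray(y)
    bad = (y < 0) | (y >= nb_classes)
    return np.flatnonzero(bad), np.unique(y[bad])


# Toy label vector with one mistyped class id (222 instead of 22).
y = np.array([0, 1, 22, 222, 24])
positions, values = find_bad_labels(y, nb_classes=25)
print(positions, values)
```

If `positions` comes back empty, every label fits into a one-hot matrix with `nb_classes` columns.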

Can someone suggest what should be done before sending the images to the convolution layer?

Thanks in advance!
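As a starting point for the pre-processing question, one step that is often applied before a convolution layer is to scale pixel values to [0, 1] and then standardize them with the mean and standard deviation of the training set. The sketch below assumes arrays shaped (N, 1, rows, cols) like those in the code above; `standardize` and the random stand-in data are hypothetical, and whether this helps on this particular dataset would need to be checked experimentally.

```python
import numpy as np


def standardize(X_train, X_test, eps=1e-7):
    """Scale to [0, 1], then subtract the training mean and divide by the training std."""
    X_train = X_train.astype('float32') / 255.0
    X_test = X_test.astype('float32') / 255.0
    # Statistics come from the training set only, so the test set does not influence them.
    mean = X_train.mean()
    std = X_train.std()
    return (X_train - mean) / (std + eps), (X_test - mean) / (std + eps)


# Random stand-in data shaped like the arrays in the post: (N, channels, rows, cols).
rng = np.random.RandomState(0)
X_tr = rng.randint(0, 256, size=(8, 1, 20, 20)).astype('uint8')
X_te = rng.randint(0, 256, size=(4, 1, 20, 20)).astype('uint8')
X_tr_n, X_te_n = standardize(X_tr, X_te)
print(X_tr_n.mean(), X_tr_n.std())
```

The test set is transformed with the training statistics on purpose, so both sets pass through exactly the same mapping.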