keras-team / keras-preprocessing

Utilities for working with image data, text data, and sequence data.

Properly Using Custom Generator for Binary Classifier! #296

Closed Lincoln93 closed 4 years ago

Lincoln93 commented 4 years ago

I have a dataset in the following format. It is a binary classification problem (labels 0 and 1). I have written a custom generator for it (below), but I don't understand how to properly encode the labels there.

- images/
    - a.png
    - b.png
    - c.png
- images.csv
img_Id - - label
a.png - - 1
b.png - - 0
c.png - - 1

Generator

class generate(Sequence):
    def __init__(self, data, dim, shuffle, batch_size):
        self.data = data
        .....

    def __len__(self):
        return int(np.floor(len(self.data)/self.batch_size))

    def __getitem__(self, index):
        batch_idx = self.indices[index*self.batch_size:(index+1)*self.batch_size]
        idx = [self.list_idx[k] for k in batch_idx]
        Data = np.empty((self.batch_size, *self.dim, 1))
        Target = np.empty((self.batch_size, 1), dtype = int)

        for i, k in enumerate(idx):
            # load the image file using cv2
            image = cv2.imread(self.data['image_name'][k])
            image = cv2.resize(image, self.dim) 
            # expand the axes
            Data[i,:, :, :] =  image
            Target[i,:] = ??????????????????????????????

        return Data, Target

    def on_epoch_end(self):
        -----

Call

generator = generate(df, ....)
img, label = generator[0]
img <- OK
label <- ?????????????

The model's last layer should use a sigmoid activation. How can I properly encode the label here so that it matches the corresponding image file? Is there a more convenient way to do this?
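
One possible way to fill in the `Target` line, sketched under a few assumptions: the CSV has been read into a DataFrame whose `label` column already holds 0/1 integers, and both the image path and the label are looked up with the same row index `k`, which is what keeps them matched. The helper name `binary_targets` and the sample arrays are hypothetical, and only NumPy is used so the snippet runs on its own.

```python
import numpy as np

def binary_targets(labels, batch_rows):
    """Return a (batch_size, 1) float32 array of 0/1 targets.

    labels     : 1-D array of 0/1 labels, one per CSV row
                 (for example df['label'].to_numpy(); column name assumed)
    batch_rows : row indices selected for this batch
    """
    labels = np.asarray(labels)
    target = np.empty((len(batch_rows), 1), dtype=np.float32)
    for i, k in enumerate(batch_rows):
        # Use the same row index k that selects the image file,
        # so image i and label i come from the same CSV row.
        target[i, 0] = labels[k]
    return target

# Hypothetical data mirroring the CSV above: a.png -> 1, b.png -> 0, c.png -> 1
labels = np.array([1, 0, 1])
batch_rows = [2, 0, 1]  # a shuffled batch of row indices
target = binary_targets(labels, batch_rows)
print(target.ravel())   # [1. 1. 0.]
```

Inside `__getitem__` this would amount to something like `Target[i, 0] = self.data['label'][k]`, again assuming that column name. With a single sigmoid output and a binary cross-entropy loss, 0/1 targets of shape `(batch_size, 1)` need no further encoding.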

Additional Query:

  1. Why does predict_generator sometimes not return all samples? I asked here but didn't get an answer. :(

  2. I am trying to treat this issue as a learning opportunity, so one more related question: if my images.label attribute contained multiple classes, what should I change to make it work? And what about multi-label?
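
A rough sketch for the two additional queries, offered as a likely explanation rather than a confirmed diagnosis. With `np.floor` in `__len__`, the final partial batch is never produced, which is one plausible reason a generator-based predict returns fewer samples than the dataset holds; rounding up avoids that. For multi-class labels the target row can be one-hot encoded, and for multi-label it can be multi-hot encoded. All function names and sample values below are hypothetical.

```python
import math
import numpy as np

def num_batches(n_samples, batch_size):
    # Round up so the final, shorter batch is still served.
    return math.ceil(n_samples / batch_size)

def batch_slice(indices, index, batch_size):
    # Slicing past the end is safe: the last batch is simply shorter.
    return indices[index * batch_size:(index + 1) * batch_size]

def one_hot(class_ids, num_classes):
    # Multi-class: exactly one 1 per row.
    out = np.zeros((len(class_ids), num_classes), dtype=np.float32)
    out[np.arange(len(class_ids)), class_ids] = 1.0
    return out

def multi_hot(label_sets, num_classes):
    # Multi-label: any number of 1s per row.
    out = np.zeros((len(label_sets), num_classes), dtype=np.float32)
    for i, labels in enumerate(label_sets):
        out[i, list(labels)] = 1.0
    return out

indices = np.arange(10)
n = num_batches(len(indices), 4)  # 10 samples, batch size 4
sizes = [len(batch_slice(indices, b, 4)) for b in range(n)]
print(n, sizes)                   # 3 [4, 4, 2]
```

If the last batch may be shorter, the `Data` and `Target` arrays in `__getitem__` would need to be sized with `len(idx)` rather than `self.batch_size`. One-hot targets are usually paired with a softmax output and categorical cross-entropy, and multi-hot targets with one sigmoid unit per class and binary cross-entropy.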

Thanks in advance.