keras-team / keras

Deep Learning for humans
http://keras.io/
Apache License 2.0

image generator for localization tasks #4433

Closed cdicle closed 7 years ago

cdicle commented 7 years ago

Is there image generator functionality for object detection tasks? Imagine we are detecting a single type of object and I have images with labels showing the bounding boxes for the objects in each image. Does Keras have a generic function for this type of setup? If not, would it be a useful enhancement?

By the way, I am aware that I can crop the images for positive and negative classes, save them separately in folders and treat this problem as an image classification problem. But I am deliberately staying away from that option.

I would appreciate any comments. Best.

pengpaiSH commented 7 years ago

@cdicle I am also considering using Keras for object detection. However, I am struggling with how to combine the labeled bounding boxes with the original images during data augmentation. Do you have any suggestions?

cdicle commented 7 years ago

Hi @pengpaiSH. I ended up writing my own generator. My suggestion would be to generate folders of crops and use the Keras directory iterator as in the tutorial example https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html.

I am working on an image list iterator and I will try to extend it to a version with bounding boxes. That will take some time, though, since I am fairly new to Keras as well.

pengpaiSH commented 7 years ago

@cdicle Thank you for your suggestions. By the way, I am aware of flow_from_directory() for data augmentation and generating batches of images. I used it to reach a top 5% ranking in an image classification competition held on Kaggle. See https://github.com/pengpaiSH/Kaggle_NCFM for more details. So, in summary, you are cropping the bounding-box regions and saving them to disk instead of generating them on the fly. As for the detection framework, are you using YOLO?

cdicle commented 7 years ago

Oh, that is great @pengpaiSH. You are much more seasoned than I am. I am training a single-class object detector, not a multi-class one. YOLO is definitely a strong option, but I have followed a cascaded network approach as in http://users.eecs.northwestern.edu/~xsh835/assets/cvpr2015_cascnn.pdf.

I will let you know if I make any progress on writing the generator. I think it might be a good idea to follow Caffe's convention for the WindowData layer https://github.com/BVLC/caffe/blob/master/src/caffe/layers/window_data_layer.cpp. It is not as automatic as one would expect; however, it gives the user the flexibility to apply their own sample generation policy, such as hard negative mining.

Let me know your thoughts
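A minimal sketch of what a user-controlled sample generation policy could look like: candidate windows are labeled as foreground or background by their overlap (IoU) with the ground-truth boxes, and the background list is the pool from which hard negatives could later be mined. The function names, the `(x1, y1, x2, y2)` box layout, and the threshold values are illustrative assumptions, not taken from Caffe or Keras.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def label_windows(windows, gt_boxes, fg_threshold=0.5, bg_threshold=0.3):
    """Split candidate windows into foreground and background lists.

    The thresholds are hypothetical defaults; tune them for your data.
    """
    foreground, background = [], []
    for window in windows:
        best = max((iou(window, gt) for gt in gt_boxes), default=0.0)
        if best >= fg_threshold:
            foreground.append(window)
        elif best < bg_threshold:
            background.append(window)
        # Windows between the two thresholds are ambiguous and are skipped.
    return foreground, background
```

Hard negative mining would then amount to scoring the background windows with the current model and keeping the ones it wrongly rates as objects.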

pengpaiSH commented 7 years ago

@cdicle Hey, cdicle. Any new progress recently? :)

cdicle commented 7 years ago

Hey @pengpaiSH. Well, there is some, but not directly for object detection; it is more along the lines of classification. I wrote a JSON file list iterator similar to ImageDataGenerator.flow_from_directory(). It makes many tasks easier, like cross-validation and ensemble training.

This is not directly related to the detection task, but it might be a good starting point. I am planning to submit a pull request soon.

Sounds interesting?
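For readers wondering what a file-list-driven iterator might involve, here is a rough sketch. It is not the iterator described above: the JSON schema (`path` and `label` keys) and all function names are assumptions made for illustration, and image loading is deliberately left out so the example stays self-contained.

```python
import json
import random


def load_file_list(json_text):
    """Parse a JSON array of {"path": ..., "label": ...} records (hypothetical schema)."""
    records = json.loads(json_text)
    return [(r["path"], r["label"]) for r in records]


def split_folds(samples, n_folds, val_fold):
    """Assign samples to folds round-robin and return (train, validation)."""
    train = [s for i, s in enumerate(samples) if i % n_folds != val_fold]
    val = [s for i, s in enumerate(samples) if i % n_folds == val_fold]
    return train, val


def iterate_batches(samples, batch_size, shuffle=True, seed=None):
    """Yield lists of (path, label) pairs forever, reshuffling every epoch."""
    rng = random.Random(seed)
    samples = list(samples)
    while True:
        if shuffle:
            rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]
```

Because the split is just a filter over one list, cross-validation only needs a different `val_fold` per run, with no copying of images between folders.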

pengpaiSH commented 7 years ago

@cdicle That's not just interesting! It's fantastic, amazing! You know, we need to bind the bounding box (actually 4 numbers: height, width, x, y) to the original image in the process of data augmentation.
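To make the binding concrete, here is a small sketch of one augmentation (a horizontal flip) applied to an image and its box together. It assumes a channels-last `(H, W, C)` NumPy array and an `(x, y, w, h)` box in pixels with the origin at the top-left corner; the function name and box order are illustrative choices, not a Keras API.

```python
import numpy as np


def flip_horizontal(image, box):
    """Mirror an (H, W, C) image left-right and move its (x, y, w, h) box with it."""
    x, y, w, h = box
    image_width = image.shape[1]
    flipped = image[:, ::-1, :]
    # The box keeps its size; only its left edge moves to the mirrored position.
    return flipped, (image_width - x - w, y, w, h)


# Usage: a 4x6 image with a 2x2 object in the top-left region.
image = np.zeros((4, 6, 1))
image[1:3, 0:2, :] = 1.0
flipped, new_box = flip_horizontal(image, (0, 1, 2, 2))
```

Other geometric transforms (shifts, zooms, rotations) need the same treatment: whatever is done to the pixels has to be applied to the box coordinates as well.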

NegatioN commented 7 years ago

If anyone needs a simple bounding-box image DirectoryIterator, this is a barebones implementation. It requires your extra information to be sorted in the same order as the filenames in the DirectoryIterator. https://github.com/NegatioN/KaggleFisheries/blob/master/bboxgenerator.py

It does not deal with image augmentation in any way, as we would have to convert all input data into matrices of the same size somehow to achieve that in a generalizable way. Maybe I'll look into that someday and see if it's possible to upstream.
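One part of the same-size problem is straightforward: when images are resized to a fixed target size, the box coordinates scale by the same factors, or can be stored as fractions of the image size so they no longer depend on resolution. The sketch below assumes `(x, y, w, h)` boxes in pixels and `(width, height)` size tuples; the helper names are hypothetical.

```python
def scale_box(box, original_size, target_size):
    """Rescale an (x, y, w, h) pixel box from original_size to target_size.

    Both sizes are (width, height) tuples.
    """
    x, y, w, h = box
    sx = target_size[0] / original_size[0]
    sy = target_size[1] / original_size[1]
    return (x * sx, y * sy, w * sx, h * sy)


def normalize_box(box, image_size):
    """Express an (x, y, w, h) pixel box as fractions of the image size."""
    x, y, w, h = box
    width, height = image_size
    return (x / width, y / height, w / width, h / height)
```

Normalized boxes also make convenient regression targets, since every value falls in the range 0 to 1 regardless of the original image size.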

pengpaiSH commented 7 years ago

@NegatioN Thank you, NegatioN. Wow, you are taking part in the NCFM competition on Kaggle. By the way, did you find any improvement with object detection (regressing the bounding box) vs. a vanilla CNN (image classification only)?

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.

apacha commented 7 years ago

I've been trying to set up a scenario that trains classification and object localization (= 1 bounding box) at the same time. To provide the data via an iterator in Keras, I've derived from the DirectoryIterator and extended it to return not only the class labels but also a second array that contains the bounding box. See my repository for the details.

I've provided a dictionary that is used to look up the bounding box by file name, and the model that I'm training has two outputs.
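The idea can be sketched without subclassing anything. The wrapper below is not the code from the repository mentioned above: it assumes an upstream iterator that yields `(images, class_labels, filenames)` batches, which is a hypothetical interface, and it pairs each batch with boxes looked up by file name so that a two-output model receives one target array per output.

```python
import numpy as np


def with_boxes(batches, boxes_by_filename):
    """Attach bounding-box targets to batches of images and class labels.

    `batches` yields (images, class_labels, filenames) tuples (assumed interface).
    Yields (images, [class_labels, boxes]) with boxes as a float32 array.
    """
    for images, class_labels, filenames in batches:
        boxes = np.array([boxes_by_filename[name] for name in filenames],
                         dtype="float32")
        yield images, [class_labels, boxes]


# Usage with one hand-built batch of two 8x8 RGB images.
lookup = {"a.jpg": (0, 0, 4, 4), "b.jpg": (1, 2, 3, 4)}
batch = (np.zeros((2, 8, 8, 3)), np.array([[1, 0], [0, 1]]), ["a.jpg", "b.jpg"])
images, (y_class, y_box) = next(with_boxes([batch], lookup))
```

A missing file name raises a `KeyError` here, which is usually preferable to silently training on a wrong or empty box.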

stale[bot] commented 7 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 30 days if no further activity occurs, but feel free to re-open a closed issue if needed.