broadinstitute / keras-rcnn

Keras package for region-based convolutional neural networks (RCNNs)

Loading external Data to train the model #180

Open fenilsuchak opened 6 years ago

fenilsuchak commented 6 years ago

There is no information on how to load external data other than the datasets available in keras_rcnn.datasets. Suppose I have input_images, target_images, and an np.array of bounding boxes for each image. How would I load them into the model? This should probably be added to the README too.

jhung0 commented 6 years ago

Yes, sorry we are behind on that. Take a look below and let me know if that helps.

External Data

The data is made up of a list of dictionaries corresponding to images.
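As a rough sketch of what such a list could look like when built from image paths and bounding-box arrays: the key names below ("image", "pathname", "shape", "objects", "bounding_box", "minimum", "maximum", "category") are my assumption about the layout and are not confirmed in this thread, and make_record is a hypothetical helper, so check them against what keras_rcnn.datasets actually returns before relying on them.

```python
import json


def make_record(pathname, shape, boxes, category="object"):
    """Build one image dictionary (hypothetical helper, assumed key names).

    shape is (rows, cols, channels); boxes is an iterable of
    (min_r, min_c, max_r, max_c) tuples, one per object.
    """
    rows, cols, channels = shape
    objects = []
    for min_r, min_c, max_r, max_c in boxes:
        objects.append({
            "bounding_box": {
                "minimum": {"r": int(min_r), "c": int(min_c)},
                "maximum": {"r": int(max_r), "c": int(max_c)},
            },
            "category": category,
        })
    return {
        "image": {
            "pathname": pathname,
            "shape": {"r": int(rows), "c": int(cols), "channels": int(channels)},
        },
        "objects": objects,
    }


# One record per image; the path and box values are made up for illustration.
records = [
    make_record("images/0001.png", (256, 256, 3), [(10, 20, 50, 80)]),
]

# Serialize the list; writing this string to a file gives a JSON data file.
serialized = json.dumps(records, indent=2)
```

Casting to int matters if the boxes come from a NumPy array, since json cannot serialize NumPy integer types directly.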

Suppose this data is saved in a file called training.json. To load the data:

import json

with open('training.json') as f:
    d = json.load(f)

mave5 commented 6 years ago

I developed this code so you can load DSB 2018 data. https://github.com/mravendi/keras-rcnn/blob/master/keras_rcnn/datasets/dsb2018.py

Simply use dicts = load_data(path2data, data_group),

where path2data is the location of the DSB 2018 data on your computer and data_group can be either "stage1_train" or "stage1_test".

It will return a list of dictionaries in the format that this repo expects.

vz415 commented 6 years ago

Thanks @mravendi! Your code works with the DSB 2018 dataset on my local machine.

Just a warning for others working with this dataset: you'll most likely get an error when running this, because the generator reads the image file itself (e.g. asdlfkj129.png) and the loaded image includes an alpha channel. To work around it, modify the _get_batches_of_transformed_samples method of the DictionaryIterator class to ignore the alpha channel, with something like image = image[:, :, :3] before any of the transformations.
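A minimal, standalone sketch of that slicing step, assuming the image is a NumPy array in (rows, cols, channels) order. The drop_alpha helper is hypothetical and not part of keras-rcnn; inside the iterator the same slice would be applied to the loaded image.

```python
import numpy as np


def drop_alpha(image):
    """Return the image without its alpha channel, if it has one."""
    # Only RGBA arrays (4 channels in the last axis) are modified.
    if image.ndim == 3 and image.shape[2] == 4:
        return image[:, :, :3]
    return image


rgba = np.zeros((4, 5, 4), dtype=np.uint8)  # stand-in for a loaded RGBA PNG
rgb = drop_alpha(rgba)
```

Checking the channel count first keeps RGB and grayscale inputs untouched, so the helper is safe to call on every image.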

There's probably a better way of doing this, such as tying the number of channels used in _get_batches_of_transformed_samples to the color_mode argument, or using the number of channels specified in the dictionary. However, I'm not a CV expert, so I'm not sure which design is best.
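One way the color_mode idea might look, purely as a sketch. The mapping from color_mode strings to channel counts and the conform_channels helper are assumptions for illustration, not how DictionaryIterator is implemented, and the grayscale conversion is a plain unweighted mean rather than a perceptual one.

```python
import numpy as np

# Assumed mapping from a color_mode string to a channel count.
CHANNELS_BY_COLOR_MODE = {"grayscale": 1, "rgb": 3, "rgba": 4}


def conform_channels(image, color_mode="rgb"):
    """Convert image to the channel count implied by color_mode."""
    target = CHANNELS_BY_COLOR_MODE[color_mode]
    if image.ndim == 2:
        # Give 2-D grayscale images an explicit channel axis.
        image = image[:, :, np.newaxis]
    current = image.shape[2]
    if current == target:
        return image
    if current == 4 and target == 3:
        # Drop the alpha channel.
        return image[:, :, :3]
    if current in (3, 4) and target == 1:
        # Unweighted mean of the color channels, ignoring any alpha.
        gray = image[:, :, :3].mean(axis=2, keepdims=True)
        return gray.astype(image.dtype)
    if current == 1 and target == 3:
        # Repeat the single channel three times.
        return np.repeat(image, 3, axis=2)
    raise ValueError(
        "cannot convert %d channels to color_mode %r" % (current, color_mode)
    )


rgba = np.full((4, 5, 4), 10, dtype=np.uint8)
as_rgb = conform_channels(rgba, "rgb")
as_gray = conform_channels(rgba, "grayscale")
```

Using the channel count stored in each dictionary instead would mean passing that number in place of the color_mode lookup.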