Loading custom datasets

When loading labels and images using load_multiple_labels_from_csv and load_images_with_ids_from_directory, errors can occur if the image directory contains png files other than the ones you intend to train with.

For example, following the Leaf Counting Tutorial and using the CVPPP dataset with:

model.load_multiple_labels_from_csv('./CVPPP/A1/A1.csv', id_column=0)
model.load_images_with_ids_from_directory('./CVPPP/A1')

The CVPPP dataset (derived from the IPPN dataset) contains several png files for each plant: the plant image (_rgb.png), the foreground mask (_fg.png), leaf instance masks (_mask.png), etc. The error occurs because all png files in the directory are collected (line 2537, deepplantpheno.py) and passed to split_raw_data (line 721, deepplantpheno.py), which results in unexpected input shapes in the model.
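As a possible workaround until this is addressed, the training images can be copied into a directory of their own, so that only the expected png files are picked up. The following is a minimal sketch using only the Python standard library. The helper name copy_rgb_images and the destination directory are hypothetical and not part of deepplantphenomics, and it assumes the training images are exactly the files ending in _rgb.png.

```python
import glob
import os
import shutil


def copy_rgb_images(src_dir, dst_dir, suffix="_rgb.png"):
    """Copy only the files ending in `suffix` from src_dir into dst_dir.

    Hypothetical helper: the "_rgb.png" default is an assumption based on
    the CVPPP naming scheme described above.
    """
    os.makedirs(dst_dir, exist_ok=True)
    copied = []
    # Sort so the copy order is deterministic across platforms.
    for path in sorted(glob.glob(os.path.join(src_dir, "*" + suffix))):
        # shutil.copy returns the path of the newly created file.
        copied.append(shutil.copy(path, dst_dir))
    return copied
```

With the images copied to, say, ./CVPPP/A1_rgb (a made-up path), load_images_with_ids_from_directory could then be pointed at that directory instead of ./CVPPP/A1, while the CSV is still read from its original location.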

The documentation for load_images_with_ids_from_directory says: "If you have specified a list of files (for example, using the ID column in load_multiple_labels_from_csv()), then you can use this function to load those images from a directory." However, it appears to load every png file in the directory, regardless of the IDs that were specified.

I believe load_images_with_ids_from_directory should be changed to match the description in the documentation, loading only the files listed in the ID column. This would also remove the dependency on a specific image type (png).
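To illustrate the behaviour I have in mind, here is a rough sketch of ID-based selection. It is not the library's implementation. The function names read_ids and select_images_by_id are hypothetical, and it assumes that each entry in the ID column is the full file name of an image (including its extension) inside the directory.

```python
import csv
import os


def read_ids(csv_path, id_column=0):
    """Return the values found in the ID column of a CSV file, in order."""
    with open(csv_path, newline="") as f:
        return [row[id_column].strip() for row in csv.reader(f) if row]


def select_images_by_id(directory, ids):
    """Return paths for the listed IDs only, ignoring every other file.

    Assumption: each ID is a file name that exists in `directory`.
    No particular extension is required, so this is not tied to png.
    """
    present = set(os.listdir(directory))
    missing = [i for i in ids if i not in present]
    if missing:
        raise FileNotFoundError(f"IDs with no matching file: {missing}")
    # Preserve the CSV order so images stay aligned with their labels.
    return [os.path.join(directory, i) for i in ids]
```

Selecting by ID rather than by extension keeps the images in the same order as the labels and leaves masks or other extra files in the directory untouched.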
