p2irc / deepplantphenomics

Deep learning for plant phenotyping.
GNU General Public License v2.0

Loading custom datasets #14

Closed DanielCWard closed 5 years ago

DanielCWard commented 5 years ago

When loading labels and images using load_multiple_labels_from_csv and load_images_with_ids_from_directory, errors can occur if the image directory contains PNG files other than those you expect to train with.

For example, following the Leaf Counting Tutorial and using the CVPPP dataset with:

model.load_multiple_labels_from_csv('./CVPPP/A1/A1.csv', id_column=0)
model.load_images_with_ids_from_directory('./CVPPP/A1')

The CVPPP dataset (derived from the IPPN dataset) contains several PNG files for each plant: the RGB image (_rgb.png), the foreground mask (_fg.png), the leaf instance masks (_mask.png), etc. The error occurs because all PNG files (line 2537, deepplantpheno.py) are passed to split_raw_data (line 721, deepplantpheno.py), which results in unexpected input shapes in the model.

The documentation for load_images_with_ids_from_directory says: "If you have specified a list of files (for example, using the ID column in load_multiple_labels_from_csv()), then you can use this function to load those images from a directory." However, it seems to load all PNG files in the directory, regardless of the IDs.

I believe the functionality of load_images_with_ids_from_directory should be changed to reflect the description in the documentation and remove the dependency on a specific image type (png).
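As a rough illustration of the behaviour described in the documentation, here is a minimal sketch that selects only the files named by the ID column, independent of image type. The helper names read_ids_from_csv and select_images_by_id are hypothetical and are not part of deepplantphenomics. The sketch also assumes each ID is either a full file name or a file name without its extension.

```python
import csv
import os


def read_ids_from_csv(csv_path, id_column=0):
    """Return the values of one CSV column, in file order (hypothetical helper)."""
    with open(csv_path, newline="") as f:
        return [row[id_column].strip() for row in csv.reader(f) if row]


def select_images_by_id(directory, ids):
    """Return paths of the files in `directory` named by `ids`, in ID order.

    Assumption: an ID is either a full file name ("plant001_rgb.png") or a
    name without extension ("plant001_rgb"). No particular image type is
    required, and files that match no ID are ignored.
    """
    by_name = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        # An exact file name always wins over an extension-less match.
        by_name[name] = path
        # If two files share a stem, the first in sorted order is kept.
        by_name.setdefault(os.path.splitext(name)[0], path)

    missing = [i for i in ids if i not in by_name]
    if missing:
        raise FileNotFoundError(f"No image found for IDs: {missing}")
    return [by_name[i] for i in ids]
```

With something like this, the foreground and instance masks in ./CVPPP/A1 would be skipped, because only the files listed in the ID column of A1.csv are returned, and the returned order matches the label order.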

jubbens commented 5 years ago

Addressed in 2f5346c12e0bdd66699f31cbb59d6178fc6e53ce, thanks Daniel and Donovan!