Closed bdzyubak closed 2 years ago
A prep_training_data class has been implemented which supports segmentation tasks. Data is expected to be in ./data/images and ./data/masks. Batch size, prefetching, and rescaling have default values and do not need to be specified. Over time, more set methods will be implemented to toggle additional functions. 7e8c70bd0bfb52bc88ffcf33e66f06bd0a83f88f
The data import can now be specified in as little as:
dataset = ImgMaskDataset(os.path.join(top_path,'data'))
dataset.prep_data_img_labels()
Set methods for prefetching, batch size, and image rescaling are available, and many more will be added later.
The class now also supports classification-type data, with images and labels placed like this:
/data/images
/data/excel_with_labels.csv
Extensions to other data formats and data preprocessing will be implemented in future enhancements. 6fd7f4f8b50ad9a5038a93119a84b60cecb2017b
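For illustration, the classification layout above could be read roughly as follows. This is a minimal standard-library sketch, not the class's actual implementation: the function name load_image_labels and the CSV column names "filename" and "label" are assumptions, since the thread does not describe the contents of the label file.

```python
import csv
from pathlib import Path


def load_image_labels(data_dir, csv_name="excel_with_labels.csv",
                      filename_col="filename", label_col="label"):
    """Return (image_path, label) pairs for a classification dataset.

    Hypothetical helper: the column names are assumed, not taken from the repo.
    """
    data_dir = Path(data_dir)
    image_dir = data_dir / "images"
    pairs = []
    with open(data_dir / csv_name, newline="") as f:
        for row in csv.DictReader(f):
            image_path = image_dir / row[filename_col]
            # Fail early if the label file references an image that is not on disk.
            if not image_path.is_file():
                raise FileNotFoundError(f"Labeled image is missing: {image_path}")
            pairs.append((image_path, row[label_col]))
    return pairs
```

Labels are returned as strings exactly as they appear in the file; converting them to integer class indices is left to the caller.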
There are several standard formats for input data. For example, segmentation tasks often have a directory of images and a directory of masks with the same number of files. Classification tasks may have a directory of images (or other data) and a CSV/text file with labels. For ease of use and maintainability, a class should be implemented that is capable of reading, preprocessing, and augmenting input datasets. This is much easier to maintain than copying training scripts and modifying them for each problem, and it allows more advanced preprocessing methods to be used widely.
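As a rough sketch of the segmentation layout described above, images and masks can be paired by sorted filename after checking that both directories hold the same number of files. The function name pair_images_and_masks is hypothetical, and the sketch assumes that sorting aligns each image with its mask (for example, because both share the same file names).

```python
from pathlib import Path


def pair_images_and_masks(data_dir):
    """Return (image_path, mask_path) pairs from data_dir/images and data_dir/masks.

    Hypothetical helper: assumes sorted order matches each image to its mask.
    """
    data_dir = Path(data_dir)
    images = sorted(p for p in (data_dir / "images").iterdir() if p.is_file())
    masks = sorted(p for p in (data_dir / "masks").iterdir() if p.is_file())
    # The two directories are expected to contain the same number of files.
    if len(images) != len(masks):
        raise ValueError(
            f"Found {len(images)} images but {len(masks)} masks in {data_dir}"
        )
    return list(zip(images, masks))
```

Checking the counts up front surfaces a mismatched dataset before any training time is spent on it.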