What is the current behavior?

The skimage package used to calculate GLCM texture properties in the deployment pipeline was slow to process each tile, so this branch explored alternative Python packages and implementation methods to optimize the texture calculation functions.
What is the new behavior?
- Reworked the training and deployment pipelines to support use of fast_glcm
- Comparative analyses exploring differences in texture properties between the two methods (see notebooks/texture_analysis.ipynb)
- Optimized and cleaned up the slow_glcm.py script
- New and revamped noise removal and post-processing methods
- Created a transfer learning pipeline, transfer_learning.py, that pulls analysis ready data (ARD) from s3 rather than performing preprocessing steps on raw data
- Added functions to download ARD and features from s3
- Comparative analyses of the preprocessing pipeline and ARD (see notebooks/exploratory_data_analysis.ipynb)
- Reorganized repository folders, including renaming files and removing old files
- New methods for dropping plot ids without s2 imagery from training
- Cleaned up validation checks to support the addition of texture features and ARD
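For context on what the texture step computes, whichever package performs it: each tile window is reduced to a gray-level co-occurrence matrix (GLCM), from which scalar texture properties are derived. The sketch below is a minimal pure-Python illustration of that computation for a single horizontal offset, using the standard contrast and homogeneity formulas; the function name and the toy tile are illustrative, not code from this repository.

```python
from collections import Counter

def glcm_texture(image):
    """Symmetric, normalised GLCM for a horizontal (dx=1, dy=0) offset,
    reduced to two common texture properties: contrast and homogeneity."""
    pairs = Counter()
    for row in image:
        for a, b in zip(row, row[1:]):
            pairs[(a, b)] += 1
            pairs[(b, a)] += 1  # symmetric GLCM: count both directions
    total = sum(pairs.values())
    contrast = sum(n * (i - j) ** 2 for (i, j), n in pairs.items()) / total
    homogeneity = sum(n / (1 + (i - j) ** 2) for (i, j), n in pairs.items()) / total
    return contrast, homogeneity

tile = [[0, 0, 1],
        [0, 0, 1],
        [0, 2, 2]]
contrast, homogeneity = glcm_texture(tile)  # → 1.0, 0.7
```

Library implementations such as skimage's graycomatrix/graycoprops repeat this over many offsets, angles, and sliding windows per tile, which is where the per-tile cost this branch targets comes from.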
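The shape of the s3 download helpers can be sketched as follows. This is an assumption-laden illustration: the function name download_ard, the key layout ard/<tile_id>.npy, and the parameters are hypothetical, not the repository's actual API. Any client exposing a download_file(bucket, key, filename) method works, e.g. boto3.client("s3").

```python
import os

def download_ard(s3_client, bucket, tile_ids, dest_dir, prefix="ard"):
    """Download analysis-ready-data arrays for a set of tiles from s3.

    s3_client: any object with a download_file(bucket, key, filename)
    method, such as boto3.client("s3"). Key layout is illustrative.
    """
    paths = []
    for tid in tile_ids:
        key = f"{prefix}/{tid}.npy"
        dest = os.path.join(dest_dir, f"{tid}.npy")
        s3_client.download_file(bucket, key, dest)  # fetch one tile's ARD
        paths.append(dest)
    return paths
```

Pulling ARD this way lets transfer_learning.py skip the raw-data preprocessing steps entirely, as described above.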
Does this introduce a breaking change?