When forming semi-random samples from file iterators, a file unsuitable for learning may be selected. Files can currently be filtered on metadata and file name, but if all or part of the sample_pipeline has to be run to determine whether to skip the sample, we have no system for dealing with this.
[ ] We should provide a way in run_sample_pipeline to end the sample pipeline early on a statistical filter (such as removing an image of all-night conditions with a zero-variance filter).
[ ] Also, a fit in fit.py should be able to return an unfitted model and info about why fitting could not happen.
[ ] In an evolutionary search, if an invalid sample / fit operation occurs, the objective function should return Inf (minimization objective) or -Inf (maximization objective). If this type of functionality is needed, use scikit-learn's make_scorer on a custom function to return Inf/-Inf as needed.
[ ] In other cases, see the dask-searchcv pattern of raising and handling fitting exceptions.
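
A minimal sketch of how the checklist items could fit together, not a proposal for the actual API. Every name here (`SkipSample`, `zero_variance_filter`, `run_steps`, `fit_or_explain`, `objective`, `MeanModel`) is hypothetical, and a flat list of floats stands in for image data. The idea is that a pipeline step raises an exception to end the pipeline early, the fit helper returns the unfitted model plus a reason, and the objective maps an unfitted model to Inf or -Inf.

```python
import math
import statistics


class SkipSample(Exception):
    """Hypothetical signal that a sample should be dropped mid-pipeline."""


def zero_variance_filter(values):
    """Raise SkipSample when all values are identical (e.g. an all-night image)."""
    if statistics.pvariance(values) == 0:
        raise SkipSample("zero variance")
    return values


def run_steps(sample, steps):
    """Apply steps in order; return (sample, None), or (None, reason) on early exit."""
    for step in steps:
        try:
            sample = step(sample)
        except SkipSample as exc:
            return None, str(exc)
    return sample, None


class MeanModel:
    """Toy stand-in for an estimator: predicts the mean of its training values."""

    def __init__(self):
        self.mean_ = None

    def fit(self, values):
        self.mean_ = statistics.fmean(values)
        return self

    def score(self, values):
        # Mean squared error against the fitted mean (lower is better).
        return statistics.fmean((v - self.mean_) ** 2 for v in values)


def fit_or_explain(model, sample, steps):
    """Return (model, info); the model comes back unfitted if the sample was skipped."""
    sample, reason = run_steps(sample, steps)
    if reason is not None:
        return model, {"fitted": False, "reason": reason}
    model.fit(sample)
    return model, {"fitted": True, "reason": None}


def objective(model, info, values, minimize=True):
    """Score a fit, returning Inf / -Inf when no fit took place."""
    if not info["fitted"]:
        return math.inf if minimize else -math.inf
    return model.score(values)


night = [0.0, 0.0, 0.0, 0.0]
model, info = fit_or_explain(MeanModel(), night, [zero_variance_filter])
print(info)                           # reports fitted=False and the skip reason
print(objective(model, info, night))  # inf for a minimization objective
```

An exception is used for the early exit so that existing pipeline steps need no new return convention; a sentinel return value would work as well. With scikit-learn, a custom function along the lines of `objective` could be wrapped with `make_scorer`, as the checklist suggests.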