An output of learning is a rawcovariates.csv file that was originally intended to show the untransformed values of each covariate at each position. It now contains several extra fields, such as target values, prediction values from cross-validation, and user-defined fields.
The file is overloaded and in a bad state at the moment: it is first written when the covariate/target intersection occurs, then reopened and written to again after cross-validation is performed. It might be better to break it up into multiple files, or to carry a Pandas DataFrame throughout the workflow, adding results to it and outputting it as one big results table.
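As a rough illustration of the DataFrame option, the sketch below builds the table in memory and writes it out once at the end. The function names (intersect, add_crossval) and the column names (x, y, cov_a, target, prediction) are hypothetical and do not come from the existing code; only the pandas calls are real.

```python
import io

import pandas as pd


def intersect(positions, covariates, targets):
    """Hypothetical intersection step: start the results table."""
    df = pd.DataFrame(positions, columns=["x", "y"])
    for name, values in covariates.items():
        df[name] = values  # one column per raw (untransformed) covariate
    df["target"] = targets
    return df


def add_crossval(df, predictions):
    """Hypothetical cross-validation step: return a copy with predictions added."""
    return df.assign(prediction=predictions)


# Carry the same table through both stages instead of reopening a CSV.
results = intersect(
    positions=[(0.0, 1.0), (2.0, 3.0)],
    covariates={"cov_a": [0.1, 0.2]},
    targets=[1.0, 2.0],
)
results = add_crossval(results, [1.1, 1.9])

# Write once, at the end of the workflow (a buffer stands in for the file here).
buffer = io.StringIO()
results.to_csv(buffer, index=False)
```

With this shape the file is only ever written in one place, so its layout is defined by a single piece of code rather than by two separate write passes.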
If you embark on this, be aware that a lot of the diagnostics.py functions (plotting) read this file and rely on the column ordering.
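One way to loosen that dependency would be for readers to select columns by name rather than by position. The following is a minimal sketch under assumed column names (target, prediction); load_target_and_prediction is a hypothetical helper, not an existing diagnostics.py function.

```python
import io

import pandas as pd

# Two hypothetical layouts of the same data, differing only in column order.
original = "x,y,cov_a,target,prediction\n0.0,1.0,0.1,1.0,1.1\n2.0,3.0,0.2,2.0,1.9\n"
reordered = "prediction,target,cov_a,y,x\n1.1,1.0,0.1,1.0,0.0\n1.9,2.0,0.2,3.0,2.0\n"


def load_target_and_prediction(source):
    """Read only the named columns, so the file's column order is irrelevant."""
    df = pd.read_csv(source, usecols=["target", "prediction"])
    return df["target"].to_numpy(), df["prediction"].to_numpy()


targets_a, preds_a = load_target_and_prediction(io.StringIO(original))
targets_b, preds_b = load_target_and_prediction(io.StringIO(reordered))
```

Both layouts yield the same arrays, so a plotting function written this way would keep working if the output table were split up or its columns rearranged.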