Users concerned about optimal performance will not get it with a column-oriented tabular format in any event, because MultivariateStats expects observations as columns (and, by definition, observations in tables are rows). Assuming the user starts with a matrix, X, whose columns are observations, they can always avoid a copy under the current proposal by wrapping the adjoint, Tables.table(X'). If rows are observations, there is no efficient route (even when using MultivariateStats directly); the user can instead manually permute the dimensions and then wrap the adjoint of the result for use by the MLJ-wrapped model. (If memory is not an issue, I presume this is generally faster than just wrapping X directly.)
See also this note in the manual.
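The two cases described above can be sketched roughly as follows. This is only an illustration: the matrix sizes and the variable names (X, Y, Xtable, Ytable) are made up for the example, and it shows only the wrapping step, not how the MLJ-wrapped model consumes the table.

```julia
using Tables

# Case 1: columns are observations (3 features, 100 observations).
# X' is a lazy adjoint, so wrapping it as a table copies no data,
# and the rows of the resulting table are the observations.
X = rand(3, 100)
Xtable = Tables.table(X')

# Case 2: rows are observations (100 observations, 3 features).
# permutedims makes one explicit copy with observations as columns;
# the adjoint of that copy is then wrapped exactly as in case 1.
Y = rand(100, 3)
Yperm = permutedims(Y)         # 3×100, materialized copy
Ytable = Tables.table(Yperm')  # lazy 100×3 view of the copy
```

In case 2 the copy is paid once, up front, which is the memory-for-speed trade-off mentioned in the parenthetical remark.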
TODO:
(needs https://github.com/JuliaStats/MultivariateStats.jl/issues/192)