Closed abieler closed 7 years ago
Hi! Really cool that you are doing this. I agree that we should allow for the obsdim to be specified. In fact we already have a "system" for doing so which we use at MLDataPatterns.jl (that package will be the new back-end for MLDataUtils for all data subsetting, k-folds etc).
Would be cool if you could adapt the code to that "system". I describe the general way of doing it here: https://github.com/joshday/OnlineStats.jl/issues/40#issuecomment-290209593 . For most code we allow arrays of any order, but it would already be a big improvement to just have code for vectors and matrices.
edit: ObsDim is defined in LearnBase.jl here: https://github.com/JuliaML/LearnBase.jl/blob/master/src/LearnBase.jl#L318-L395
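For readers following along, here is a rough sketch of what dispatching on an observation-dimension singleton could look like for a rescaling function. It is not the code from this PR or from LearnBase.jl: `ObsDimension`, `ObsFirst`, `ObsLast`, `obsaxis`, and `standardize` are hypothetical names defined locally as stand-ins for the `ObsDim` types linked above, and the syntax assumes Julia 1.x.

```julia
using Statistics  # mean, std

# Hypothetical stand-ins for the singleton types in LearnBase's ObsDim
# module, defined locally so the sketch runs without LearnBase.
abstract type ObsDimension end
struct ObsFirst <: ObsDimension end   # each row is one observation
struct ObsLast  <: ObsDimension end   # each column is one observation

# Map a singleton to the array dimension that indexes observations.
obsaxis(::ObsFirst, X::AbstractMatrix) = 1
obsaxis(::ObsLast,  X::AbstractMatrix) = 2

# Hypothetical helper: rescale every feature to zero mean and unit
# standard deviation, reducing along the observation dimension.
function standardize(X::AbstractMatrix, obsdim::ObsDimension = ObsLast())
    d = obsaxis(obsdim, X)
    μ = mean(X, dims = d)
    σ = std(X, dims = d)
    σ = map(s -> s == 0 ? one(s) : s, σ)  # leave constant features unscaled
    return (X .- μ) ./ σ
end

# A vector holds a single feature, so no observation dimension is needed.
function standardize(x::AbstractVector)
    σ = std(x)
    return (x .- mean(x)) ./ (σ == 0 ? one(σ) : σ)
end
```

With this layout, `standardize(X, ObsFirst())` treats rows as observations and `standardize(X)` falls back to columns. Because the dimension is encoded in the argument's type, the choice is resolved by dispatch rather than by a runtime branch.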
Cool. I'll definitely try to adapt to that scheme!
somewhat like this?
Thanks for the comments, I'll add some tests later on.
This is optional, but if you feel up to it, it would be cool to update the corresponding section in the documentation: https://raw.githubusercontent.com/JuliaML/MLDataUtils.jl/master/docs/data/feature.rst
lgtm. Ready to merge when you are
thanks!
thanks for all the comments! also learned about singleton types in the process. :) how do you feel about support for dataframes and datatables? worth looking at or do you want to keep this for arrays only?
> how do you feel about support for dataframes and datatables

I will add a DataFrames dependency in the next update (see dev branch), so I am open to the idea.
It might be useful to also have the feature rescaling functions work along either dimension. Currently the tests are failing, but I figured I'd check with you first whether you like the general idea.