Open arnaudmgh opened 5 years ago
This is definitely not specific to FreqTables (implementing the method there would be type piracy), so it should either use Tables.jl or a special DataFrames constructor. Tables.jl doesn't support arrays, so that leaves DataFrames.
Though there's some tension with the way AbstractMatrix behaves: DataFrame(::AbstractMatrix) gives a data frame with the same dimensions as the input. Yet DataFrame(::NamedMatrix) would have an additional column giving the row names. That means NamedArray wouldn't completely work like other AbstractArray objects. A solution would be to have a keyword argument to add row names, which would be off by default.
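To make the keyword-argument idea concrete, here is a minimal sketch in plain Julia (Base only). The function name `named_matrix_columns` and the keyword `rownames_col` are hypothetical, not part of DataFrames or NamedArrays, and a plain matrix plus two name vectors stands in for a NamedMatrix. It returns a NamedTuple of column vectors, which is the kind of column table a DataFrame constructor should accept.

```julia
# Hypothetical helper: `values`, `rownames` and `colnames` stand in for a NamedMatrix.
# With the default `rownames_col = nothing`, the result has exactly one column per
# matrix column, mirroring DataFrame(::AbstractMatrix). Passing a Symbol prepends
# a column holding the row names.
function named_matrix_columns(values::AbstractMatrix,
                              rownames::AbstractVector,
                              colnames::AbstractVector;
                              rownames_col::Union{Symbol,Nothing}=nothing)
    size(values, 1) == length(rownames) ||
        throw(DimensionMismatch("one row name is needed per row"))
    size(values, 2) == length(colnames) ||
        throw(DimensionMismatch("one column name is needed per column"))

    syms = Symbol.(colnames)                        # fresh Vector{Symbol}
    cols = Any[values[:, j] for j in axes(values, 2)]

    if rownames_col !== nothing
        pushfirst!(syms, rownames_col)              # opt-in extra column
        pushfirst!(cols, collect(rownames))
    end
    return NamedTuple{Tuple(syms)}(Tuple(cols))
end

counts = [3 1; 0 2]
plain = named_matrix_columns(counts, ["a", "b"], ["x", "y"])
# plain == (x = [3, 0], y = [1, 2])
with_names = named_matrix_columns(counts, ["a", "b"], ["x", "y"]; rownames_col=:row)
# with_names == (row = ["a", "b"], x = [3, 0], y = [1, 2])
```

Keeping the row-name column off by default preserves the "same dimensions as the input" behavior, while the opt-in keyword avoids losing the names.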
Another consideration is that a different conversion rule could be used for higher-dimensional NamedArray objects: one column per dimension and one row per cell. This is how it works in R, for example, if you call as.data.frame on a table object (but not on an R named array). This is useful in particular for frequency tables. Maybe we can find a different solution for that, though (something like stack).
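The "one column per dimension, one row per cell" rule can be sketched in plain Julia (Base only) as follows. The function name `stack_cells` and the `value_col` keyword are hypothetical, and a plain array plus a tuple of name vectors stands in for a NamedArray; this is an illustration of the layout, not an existing API.

```julia
# Hypothetical helper: long ("stacked") layout for an N-dimensional array.
# `names[d]` holds the labels along dimension d, `dimnames[d]` is the column
# name used for that dimension. Assumes ordinary 1-based array axes.
function stack_cells(values::AbstractArray{T,N},
                     names::NTuple{N,AbstractVector},
                     dimnames::NTuple{N,Symbol};
                     value_col::Symbol=:value) where {T,N}
    all(length(names[d]) == size(values, d) for d in 1:N) ||
        throw(DimensionMismatch("one label is needed per index along each dimension"))

    # Visit cells in column-major order, so the first dimension varies fastest.
    idx = vec(collect(CartesianIndices(values)))
    dimcols = ntuple(d -> [names[d][I[d]] for I in idx], N)
    cellvals = vec(collect(values))

    return NamedTuple{(dimnames..., value_col)}((dimcols..., cellvals))
end

counts = [3 1; 0 2]
long = stack_cells(counts, (["a", "b"], ["x", "y"]), (:row, :col))
# long.row   == ["a", "b", "a", "b"]
# long.col   == ["x", "x", "y", "y"]
# long.value == [3, 0, 1, 2]
```

Because the result is a NamedTuple of equal-length vectors, the same function covers 2D and higher-dimensional inputs without special cases.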
Thanks for the explanations and the good points, @nalimilan. I agree the transformation of higher-dimensional arrays performed by R's as.data.frame looks very much like a stack operation.
So, indeed, there is some tension between the intuitive two-dimensional solution and higher-dimensional tables: the function I wrote above would ignore higher dimensions.
One possibility would be to stack by default, even for 2D arrays. A user can always unstack if necessary.
This should be solved with #99 for arbitrary dimensions.
I ran into a problem when writing the result of freqtable to a CSV file: I converted it to a DataFrame and lost all the names. The solution I came up with was to overwrite CSV.write as follows:
I'd be willing to help, by submitting a PR or otherwise, depending on what is suggested.
Please let me know what would help and make sense. Thanks!