Open diegozea opened 8 years ago
Could you be a bit more specific? Does sum(tab, dims) or mean(tab, dims) do what you need?
Yes, I'm doing that ;)
But I was wondering if freqtable(..., marginal=true) could return a table like this one:
|    | x1 | x2 | X |
|----|----|----|---|
| y1 | 1  | 2  | 3 |
| y2 | 3  | 2  | 5 |
| Y  | 4  | 4  | 8 |
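For reference, here is a minimal sketch of how the margins in that table can be computed with Base Julia alone. It assumes a plain matrix holding the counts shown above (no names attached) and a recent Julia version, where dims is passed as a keyword argument.

```julia
# Counts from the table above: rows y1, y2; columns x1, x2.
tab = [1 2; 3 2]

row_totals  = sum(tab, dims = 2)   # 2×1 matrix holding 3 and 5 (the X column)
col_totals  = sum(tab, dims = 1)   # 1×2 matrix holding 4 and 4 (the Y row)
grand_total = sum(tab)             # 8
```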
In R there's an addmargins function. Given that it's even shorter than adding marginal=true, that could be a better solution. Would you make a PR to add it?
I'm sorry. I don't have time right now to work on that PR :/
Then I'll try to have a look later. Shouldn't be hard.
Thanks! There is no hurry.
I have a similar use case, but for an equivalent of prop.table in R.
You can do it now using tab ./ sum(tab, dims), but it is such a common operation that maybe it should be handled by the package. I can imagine two options:
How do you see it?
As I said, I'd rather go with a wrapper function. We could also imagine providing a function, say proptable, which would call freqtable and compute the proportions.
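To make the tab ./ sum(tab, dims) idea concrete, here is a rough sketch on a plain matrix. The helper name proportions and its dims keyword are hypothetical and not part of FreqTables.jl. Here dims names the dimension that is summed over, which is not the same convention as the margins argument of prop.

```julia
# Hypothetical helper (not part of FreqTables.jl): divide each cell by a total.
# With dims = nothing the grand total is used; otherwise the totals are
# computed by summing over the given dimension.
proportions(tab::AbstractArray; dims = nothing) =
    dims === nothing ? tab ./ sum(tab) : tab ./ sum(tab, dims = dims)

tab = [1 2; 3 4]

overall = proportions(tab)             # [0.1 0.2; 0.3 0.4]
by_row  = proportions(tab, dims = 2)   # each row sums to 1
by_col  = proportions(tab, dims = 1)   # each column sums to 1
```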
Hi, has this issue been closed without fixing the issue? How to add marginals?
Thanks, Nicolò
See the referenced PR https://github.com/nalimilan/FreqTables.jl/pull/19. You can use the prop function.
Thanks, I was using release 0.3.1 where it seems the keyword is not exported.
Btw, maybe I'm doing it wrong.
> table(dat$A, dat$B)
    1 2 3 4
  1 2 2 2 2
  2 2 2 2 2
  3 2 2 2 2
> addmargins(table(dat$A, dat$B))
       1  2  3  4 Sum
  1    2  2  2  2   8
  2    2  2  2  2   8
  3    2  2  2  2   8
  Sum  6  6  6  6  24
> addmargins(table(dat$A, dat$B), 1)
      1 2 3 4
  1   2 2 2 2
  2   2 2 2 2
  3   2 2 2 2
  Sum 6 6 6 6
julia> freqtable(dat, :A, :B)
3×4 Named Array{Int64,2}
A ╲ B │ 1  2  3  4
──────┼───────────
1     │ 2  2  2  2
2     │ 2  2  2  2
3     │ 2  2  2  2
julia> prop(freqtable(dat, :A, :B), margins = 1)
3×4 Named Array{Float64,2}
A ╲ B │    1     2     3     4
──────┼───────────────────────
1     │ 0.25  0.25  0.25  0.25
2     │ 0.25  0.25  0.25  0.25
3     │ 0.25  0.25  0.25  0.25
julia> prop(freqtable(dat, :A, :B), margins = (1,2))
3×4 Named Array{Float64,2}
A ╲ B │   1    2    3    4
──────┼───────────────────
1     │ 1.0  1.0  1.0  1.0
2     │ 1.0  1.0  1.0  1.0
3     │ 1.0  1.0  1.0  1.0
Have a look at the help for prop; this is the way to use it:
julia> prop([1 2; 3 4], 1, 2)
2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0
julia> prop([1 2; 3 4])
2×2 Array{Float64,2}:
 0.1  0.2
 0.3  0.4
julia> prop([1 2; 3 4], 1)
2×2 Array{Float64,2}:
 0.333333  0.666667
 0.428571  0.571429
julia> prop([1 2; 3 4], 2)
2×2 Array{Float64,2}:
 0.25  0.333333
 0.75  0.666667
julia> prop([1 2; 3 4], 1, 2)
2×2 Array{Float64,2}:
 1.0  1.0
 1.0  1.0
Thanks, but none of those is similar to what R's addmargins does (what's asked here)
I mean, return this:
x = freqtable(dat, :A, :B)
vcat(hcat(x, sum(x, dims = 2)), hcat(sum(x, dims = 1)..., sum(x)))
4×5 Named Array{Int64,2}
A ╲ hcat │  1   2   3   4   5
─────────┼───────────────────
1        │  2   2   2   2   8
2        │  2   2   2   2   8
3        │  2   2   2   2   8
4        │  6   6   6   6  24
preserving names and so on
Ugly, but this:
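using NamedArrays  # for names and setnames!

function addmargins(tab)
    # Row and column labels, converted to strings so that "Sum" can be appended
    x, y = names(tab)
    x = string.(x)
    y = string.(y)
    push!(x, "Sum")
    push!(y, "Sum")
    # Append the row totals as a column, then the column totals and grand total as a row
    res = vcat(hcat(tab, sum(tab, dims = 2)), hcat(sum(tab, dims = 1)..., sum(tab)))
    setnames!(res, x, 1)
    setnames!(res, y, 2)
    # Restore the original dimension names (A and B)
    res.dimnames = tab.dimnames
    res
end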
4×5 Named Array{Int64,2}
A ╲ B    │   1    2    3    4  Sum
─────────┼────────────────────────
1        │   2    2    2    2    8
2        │   2    2    2    2    8
3        │   2    2    2    2    8
Sum      │   6    6    6    6   24
Ah - understood. I do not think it is supported.
Out of curiosity: in what situation would you need it (apart from the fact that R provides it)? I am asking because I have never needed such functionality (and I use FreqTables.jl on a daily basis), and it is in general unsafe: if you change the contents of such a table, the margins are invalidated, so you lose the consistency of your table.
In a report or a journal paper it's a nice way to present some data. In this specific case: I have an experiment with outliers. I want to show how many outliers are present for each condition, the sample size, the number of valid/invalid trials... I care about the proportion of valid/invalid trials, but the raw numbers are more important (25% out of 4 or out of 10000 makes a big difference here).
These tables summarize it well:
7-8y old
#+call: outlier-frequency-by-age[:exports results](age="7-8y")
#+RESULTS:
| condoutlier \ cond | auditory | haptic | visual | crossmodal | Sum |
|--------------------+----------+--------+--------+------------+-----|
| false | 18 | 37 | 38 | 35 | 128 |
| true | 20 | 1 | 0 | 3 | 24 |
| Sum | 38 | 38 | 38 | 38 | 152 |
10-11y old
#+call: outlier-frequency-by-age[:exports results](age="10-11y")
#+RESULTS:
| condoutlier \ cond | auditory | haptic | visual | crossmodal | Sum |
|--------------------+----------+--------+--------+------------+-----|
| false | 33 | 46 | 46 | 46 | 171 |
| true | 13 | 0 | 0 | 0 | 13 |
| Sum | 46 | 46 | 46 | 46 | 184 |
adults
#+call: outlier-frequency-by-age[:exports results](age="adults")
#+RESULTS:
| condoutlier \ cond | auditory | haptic | visual | crossmodal | Sum |
|--------------------+----------+--------+--------+------------+-----|
| false | 15 | 16 | 16 | 16 | 63 |
| true | 1 | 0 | 0 | 0 | 1 |
| Sum | 16 | 16 | 16 | 16 | 64 |
(The syntax here is Emacs Org mode; the Julia code being called is:
addmargins(freqtable(data, :condoutlier, :cond, subset = data.agegroup .== age))
)
For the three age groups you see the N of subjects, the n of trials, and the n of outliers by condition... Quick and simple (even if it is still simpler in R, because you can call it with freqtable(data, :A, :B, :C) and get many tables; in the example above I have to run the function 3 times).
Also, maybe the conversion to string can be replaced by something like Union{eltype(x),AbstractString}?
I agree with this use-case, but I would rather create a custom display function for this (that could e.g. automatically also use MIME-type to output HTML, LaTeX etc.) so that you have a separate Model from View.
It might make sense, but you don't always want to display it with the marginals. So I don't know which is the best way to organize this. Any idea?
I agree something like addmargins can be useful. It should also allow specifying the margins to which totals must be added.
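As a rough illustration of what selectable margins could look like, here is a sketch on a plain matrix. The function name addmargins_sketch and the keywords total_row and total_col are hypothetical, and row and column names are not handled at all.

```julia
# Hypothetical sketch, not an existing FreqTables.jl function.
# Works on a plain matrix and ignores names.
function addmargins_sketch(tab::AbstractMatrix; total_row::Bool = true, total_col::Bool = true)
    res = tab
    # Append a column holding the total of each row.
    total_col && (res = hcat(res, sum(res, dims = 2)))
    # Append a row holding the total of each column. If the totals column
    # was added above, its sum is the grand total.
    total_row && (res = vcat(res, sum(res, dims = 1)))
    return res
end

counts = fill(2, 3, 4)                                # same counts as the dat example

full     = addmargins_sketch(counts)                  # 4×5, bottom-right entry is 24
row_only = addmargins_sketch(counts, total_col = false)  # 4×4, like addmargins(tab, 1) in R
```

Adding the column first and then summing the widened matrix means the grand total comes for free, at the cost of tying the two steps to that order.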
Something that is annoying in R is adding margins to a table of proportions: addmargins(prop.table(table(...), 1)) gives correct row sums (equal to 1) but meaningless column sums (equal to sums of row proportions) and grand total (equal to 2). So maybe we should try to find a more convenient API? For example, instead of a function we could add a keyword argument to freqtable and prop. Or maybe introduce addmargins, but also a keyword argument to prop since that's where the problem arises (for raw counts addmargins is OK).
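The problem with margins on a proportion table can be reproduced with plain matrices. The sketch below uses a made-up 2×2 table of counts. The last two lines show one possible alternative (add the total row to the counts first, then take row proportions), which is only an assumption about what a more convenient API might compute.

```julia
tab = [1 2; 3 4]                      # made-up counts

p = tab ./ sum(tab, dims = 2)         # row proportions
row_sums = sum(p, dims = 2)           # both equal to 1, as expected
col_sums = sum(p, dims = 1)           # sums of row proportions: not meaningful
total    = sum(p)                     # ≈ 2, the number of rows

# Possible alternative: add the margin to the counts, then normalise.
with_total   = vcat(tab, sum(tab, dims = 1))            # extra row holding 4 and 6
p_with_total = with_total ./ sum(with_total, dims = 2)  # last row is 0.4 and 0.6
```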
Would be great to have the ability to show/calculate/store the marginal values of a table, when that is required.
Best,