Open zygmuntszpak opened 6 years ago
I think that I should simply allow the syntax you're using (actually, I think it would have to be Dates.Date(:DATETIME)
without the dot, as it is an element-wise operation). JuliaDB supports using a selection in a groupby
call, and the @=> macro in JuliaDBMeta makes it easier to build that selection:
julia> using JuliaDBMeta
julia> iris = loadtable(Pkg.dir("JuliaDBMeta", "test", "tables", "iris.csv"));
help?> @=>
@=>(expr...)
Create a selector based on expressions expr. Symbols are used to select columns and infer an
appropriate anonymous function. In this context, _ refers to the whole row. To use actual symbols,
escape them with ^, as in ^(:a). Use cols(c) to refer to field c where c is a variable that
evaluates to a symbol. c must be available in the scope where the macro is called.
Examples
==========
julia> t = table(@NT(a = [1,2,3], b = [4,5,6]));
julia> select(t, @=>(:a, :a + :b))
Table with 3 rows, 2 columns:
a a + b
────────
1 5
2 7
3 9
julia> select(iris, @=>(:Species == "setosa"))
150-element Array{Bool,1}:
true
true
true
true
true
true
true
true
true
true
⋮
false
false
false
false
false
false
false
false
false
julia> @groupby iris @=>(:Species=="setosa") {length = length(_)}
Table with 2 rows, 2 columns:
Species == "setosa" length
───────────────────────────
false 100
true 50
Note that the @=> macro is not specific to JuliaDBMeta functions; you can also use it with plain JuliaDB:
julia> groupby(length, iris, @=>(:Species=="setosa"))
Table with 2 rows, 2 columns:
Species == "setosa" length
───────────────────────────
false 100
true 50
Thank you very much for the clarification, and for this great package. For the cursory reader, the following is a solution to my example.
@apply r begin
@groupby @=>(Dates.Date(:DATETIME)) {length = length(_)}
end
I would like to be able to group on the result of applying a function to one of the columns. For example, suppose that I have a column :DATETIME which stores the year/month/day/h/m/s. In some queries I might want to group on the DATE only, whereas in other queries I might want to group on the TIME. Hence, I would like to write something like this:
Is this type of operation currently supported, and am I just using the wrong syntax? As a workaround I could always add more columns using @transform to explicitly split the DATETIME into DATE and TIME, but I was wondering if there is another solution.
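To make the intent concrete without relying on any table package, here is a minimal sketch that uses only Julia's standard Dates library. The sample timestamps and the helper count_by are hypothetical and are not part of JuliaDB or JuliaDBMeta; they only illustrate grouping on a value derived from a DateTime column.

```julia
using Dates

# Hypothetical sample data standing in for the :DATETIME column.
datetimes = [
    DateTime(2018, 5, 1, 9, 30, 0),
    DateTime(2018, 5, 1, 17, 45, 0),
    DateTime(2018, 5, 2, 9, 30, 0),
]

# Hypothetical helper: count the elements per key, where the key is
# computed by applying `f` to each value (the "group on f(column)" idea).
function count_by(f, values)
    counts = Dict{Any,Int}()
    for v in values
        key = f(v)
        counts[key] = get(counts, key, 0) + 1
    end
    return counts
end

by_date = count_by(Date, datetimes)  # group on the DATE part only
by_time = count_by(Time, datetimes)  # group on the TIME part only
```

The same key functions (Date and Time, both exported by Dates) are what one would pass to a selector when grouping a table, so no extra columns are needed just to hold the split values.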