Closed bkamins closed 3 years ago
@jangorecki - also the current version of Julia is 1.5.3 (the benchmarks have 1.5.0) - it is nothing major but this also probably can be changed (I would do it myself but I remember that there were several places where this had to be changed and I was afraid that I would forget something).
Thanks. Is it just style or there is something more about it? Having one liner is easier and more clear to present syntax on benchmark plot. We have a double liner example for python datatable join where first a key to be set on a table before joining: https://h2oai.github.io/db-benchmark/join/J1_1e7_NA_0_0_basic.png whenever possible I think it is better to use one liners.
The issue is that you are executing statements in top-level code (global scope). Normally you write performance sensitive code in Julia within a functions (this affects performance a bit as it is related to how compilation is handled).
I have updated the proposal to use "one liner syntax" as originally, but wrapped in the function. Would this work for you (I am not sure how you are generating the labels for the plots).
@nalimilan - could you just double-check please if I have not made a typo? Thank you!
I would prefer to have this function created inside @elapsed
block, but then the whole change would be probably pointless.
t = @elapsed(range_fun(x) = combine(groupby(x, :id3), [:v1, :v2] => ((v1, v2) -> maximum(v1)-minimum(v2))) => :range_v1_v2); ANS = range_fun(x)); println(size(ANS)); flush(stdout);
Yes - the point is not to compile the same thing twice (which happens in global scope). Anyway - if it is not feasible then just please close this PR. This is not something major.
Also - DataFrames.jl 0.22.0 release is out but I assume package versions are upgraded automatically - right?
Yes, julia is upgrading automatically. For current moment cron job to run benchmark is disabled, once it will be enabled julia should get upgraded automatically on the next run.
In Julia a recommended style is to define a function that is used for aggregation. I hope such a change is OK with the standards these benchmarks want to follow.