JuliaPlots / RecipesPipeline.jl

Utilities for processing recipes
http://juliaplots.org/RecipesPipeline.jl/dev/
MIT License

Warning: You created n=28657 groups... Is that intended #92

Closed: mdkeehan closed this issue 3 years ago

mdkeehan commented 3 years ago

I received this warning while plotting. While waiting for the plots, I decided to investigate the code.

RecipesPipeline ~/.julia/packages/RecipesPipeline/BY2Dd/src/group.jl:14
┌ Warning: You created n=28657 groups... Is that intended?

I have 28657 groups for 99845 series. Think of little particles moving in a time stream... I know my requirement is a bit extreme...

Is line 16 in group.jl

group_indices = Vector{Int}[filter(i -> v[i] == glab, eachindex(v)) for glab in group_labels]

an O(G × S) operation, i.e. something that will require 28657 × 99845 (roughly 2.9 billion) comparisons to complete? I wonder if it could be rewritten.

mdkeehan commented 3 years ago

I have done my homework and would like to contribute a rewritten loop.

using BenchmarkTools
#
# this is line 16 of group.jl
#
function extract_group_attributes(v)
    group_labels = collect(unique(sort(v)))
    n = length(group_labels)
    if n > 100
        @warn("You created n=$n groups... Is that intended?")
    end
    group_indices = Vector{Int}[filter(i -> v[i] == glab, eachindex(v)) for glab in group_labels]
    # parts omitted...
end
#
# here is a redesigned loop.
#
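function extract_group_attributes2(v)
    # map each label to the indices where it occurs, in a single pass over v
    res = Dict{eltype(v),Vector{Int}}()
    for (i, label) in enumerate(v)
        # get! creates the empty index vector the first time a label is seen
        push!(get!(() -> Int[], res, label), i)
    end
    # return the index vectors ordered by sorted label, as in the original
    group_indices = [res[label] for label in sort(collect(keys(res)))]
end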

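# small example: 9 entries in 4 groups
# ($d1 interpolates the global so that global-variable access is not timed)
d1 = ["C", "C", "C", "A", "A", "A", "B", "B", "D"]
res1 = @benchmark extract_group_attributes($d1);
res2 = @benchmark extract_group_attributes2($d1);

# larger example: 10000 entries in 599 groups
# (599 > 100, so extract_group_attributes also logs its warning on every call)
d1 = ["xx" * "$(i % 599)" for i in 1:10000]
res1 = @benchmark extract_group_attributes($d1);
res2 = @benchmark extract_group_attributes2($d1);
m1 = median(res1)
m2 = median(res2)
judge(m2, m1)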
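
The benchmark only compares speed, so it may be worth checking that both approaches return the same index vectors as well. Below is a minimal sketch of such a check. The names group_indices_filter and group_indices_dict are hypothetical stand-ins for the two loops above, with the warning and the omitted parts of group.jl left out, so this is an illustration under those assumptions and not the package's actual code.

```julia
# Hypothetical stand-in for the original loop: one filter pass per label,
# so roughly G * S comparisons for G labels and S entries.
function group_indices_filter(v)
    group_labels = unique(sort(v))
    return Vector{Int}[filter(i -> v[i] == glab, eachindex(v)) for glab in group_labels]
end

# Hypothetical stand-in for the rewritten loop: a single pass that
# collects the indices of each label in a Dict, then orders by sorted label.
function group_indices_dict(v)
    res = Dict{eltype(v),Vector{Int}}()
    for (i, label) in pairs(v)
        push!(get!(() -> Int[], res, label), i)
    end
    return [res[label] for label in sort!(collect(keys(res)))]
end

small = ["C", "C", "C", "A", "A", "A", "B", "B", "D"]
large = ["xx" * string(i % 599) for i in 1:10000]

# Both approaches should agree on the grouping and on its order.
@assert group_indices_filter(small) == group_indices_dict(small)
@assert group_indices_filter(large) == group_indices_dict(large)
```

For the small input both functions give [[4, 5, 6], [7, 8], [1, 2, 3], [9]], i.e. the indices of "A", "B", "C" and "D" in sorted label order.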