shashi / FileTrees.jl

Parallel computing with a tree of files metaphor
http://shashi.biz/FileTrees.jl

merge reorders tree in alphabetical order #44

Closed DrChainsaw closed 3 years ago

DrChainsaw commented 3 years ago

Maybe this is intended as it does not seem to be stated anywhere that FileTrees shall have any particular order.

julia> tree = maketree("root" => ["b" => ["x"], "a" => ["y"]])
root/
├─ b/
│  └─ x
└─ a/
   └─ y

julia> merge(tree, tree)
root/
├─ a/
│  └─ y
└─ b/
   └─ x

It surfaced as a minor inconvenience for me when plotting data from the tree after merging with mv, since it is hard to control the order in which things appear in the plot. Imagine that a and b each have multiple files and I want to plot one series with the values under a and b concatenated.

shashi commented 3 years ago

The sorting is to help de-duplication... in general I did not think maintaining order would be important...

What would be a better way to deal with this?

shashi commented 3 years ago

I mean, what would be a way to keep the original order while allowing fast de-duplication? Maybe we can maintain a vector of children and a set of their names too?
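A minimal sketch of the vector-plus-set idea, for illustration only. The `OrderedChildren` type and `addchild!` function are hypothetical names and not part of FileTrees.jl; the sketch only shows how a Vector can keep insertion order while a Set answers "is this name already present?" in constant time.

```julia
# Hypothetical container, not part of FileTrees.jl: children keep insertion
# order in a Vector, while a Set of names gives O(1) duplicate checks.
struct OrderedChildren
    children::Vector{Pair{String,Any}}   # name => payload, in insertion order
    names::Set{String}                   # names already present
end

OrderedChildren() = OrderedChildren(Pair{String,Any}[], Set{String}())

# Add a child unless its name is already present; return true if it was added.
function addchild!(oc::OrderedChildren, name::AbstractString, payload)
    name in oc.names && return false
    push!(oc.names, name)
    push!(oc.children, String(name) => payload)
    return true
end

oc = OrderedChildren()
addchild!(oc, "b", 1)
addchild!(oc, "a", 2)
addchild!(oc, "b", 3)                # duplicate name, skipped
childnames = first.(oc.children)     # names in insertion order
```

A Set only detects duplicates. Merging a duplicate into the existing entry also requires knowing where that entry sits in the vector, which is what a name-to-index Dict would provide.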

DrChainsaw commented 3 years ago

I haven't thought much about it since I wasn't sure if this was intended. Seems nicer if we don't reorganize the children, but maybe it will be a tall promise to make in the long run.

function _combine(cs, combine)
    if !issorted(cs, by=name)
        sort!(cs, by=name)
    end
    i = 0
    prev = nothing
    out = []
    for c in cs
        if prev == name(c)
            out[end] = apply_combine(combine, c, out[end])
        else
            push!(out, c)
        end
        prev = name(c)
    end
    map(identity, out)
end

Is this the place where it happens?

I think this can be rewritten using a Dict instead of relying on sort!, so that out retains the order of cs. Want me to give it a shot?
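A rough sketch of what such a Dict-based rewrite might look like, not the actual fix that landed in the package. `Node`, and the `name` and `apply_combine` definitions below, are stand-ins invented so the example runs on its own; `_combine_ordered` is likewise a hypothetical name. The sketch assumes names are Strings and keeps the argument order of the original `apply_combine` call (new child first, existing entry second).

```julia
# Stand-ins for illustration only; in FileTrees.jl, `name` and `apply_combine`
# operate on the library's own node types.
struct Node
    name::String
    value::Int
end
name(n::Node) = n.name
apply_combine(combine, a, b) = combine(a, b)

# Order-preserving variant: a Dict maps each name to its index in `out`,
# so duplicates are merged in place and first-seen order is kept.
function _combine_ordered(cs, combine)
    index = Dict{String,Int}()
    out = Any[]
    for c in cs
        i = get(index, name(c), 0)       # 0 means "name not seen yet"
        if i == 0
            push!(out, c)
            index[name(c)] = length(out)
        else
            out[i] = apply_combine(combine, c, out[i])
        end
    end
    map(identity, out)   # narrow the element type, as the original does
end

cs = [Node("b", 1), Node("a", 2), Node("b", 10)]
merged = _combine_ordered(cs, (x, y) -> Node(x.name, x.value + y.value))
```

Unlike the sorted version, this does not need duplicates to be adjacent: each child is looked up by name, so the cost stays linear in the number of children and `cs` is never reordered.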