JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License
45.46k stars 5.46k forks source link

alist-like construct #8674

Closed tpapp closed 6 years ago

tpapp commented 9 years ago

With pull request #8521 my understanding is that the => has become the syntax for pairs (or similar) and has been decoupled from Dict per se, so I am wondering if there is going to be support for lightweight associative lists, eg of the kind

(:a => 1, :b => 2)

I guess I do not not need to argue very much that this is useful since the construct is present in all LISP languages (eg), but for me the most compelling argument is when I want to return multiple values but don't want to remember/hardcode the order of values, nor define a new type, and want something lighter than Dict (AFAIK hash tables of a few elements are not optimal from a performance point of view in most languages, but maybe Julia is different).

An skeleton of an example could be

function my_iterative_algorithm(start, tolerance, other_arguments...)
  ## algorithm goes here
  (:root => root, :converged_p => convergence, :number_of_iterations => n, :precision => prec)
end

solution = my_iterative_algorithm(start, tol)

if solution[:converged_p]
  do_something(solution[:root])
else
  try_something_else()
end

An advantage of Julia over LISP is that tuples are a separate type, so [] could dispatch on that and there is no need for a separate accessor function.

(Sorry for the noise if this is already planned, I searched the archives and already asked a related question on julia-users but I was not aware of this new feature then.)

iamed2 commented 9 years ago

This seems similar to namedtuples in Python (in practice, not in implementation). In Python, they create a class with properties, but some sort of anonymous type seems like a treacherous road to go down. Perhaps the pair-tuple syntax could create a type that can be indexed numerically like a tuple but also accessed by field (overloading getfield). Could be implemented as an immutable type containing a tuple of values and a parallel tuple of symbols that can be searched.

Illustration:

function my_iterative_algorithm(start, tolerance, other_arguments...)
  ## algorithm goes here
  (:root => root, :converged_p => convergence, :number_of_iterations => n, :precision => prec)
end

solution = my_iterative_algorithm(start, tol)
@assert solution.converged_p == solution[2]

if solution.converged_p:
    do_something(solution.root)
else
    try_something_else()
end

As the other thread says, though, this seems like a job for composite types.

quinnj commented 9 years ago

How about just using

function getindex{T}(t::(Pair{Symbol,T}...), key::Symbol)
  for pair in t
    pair.first == key && return pair.second
  end
end

? Simple, quick (for small tuples), easy. This seems like something you could easily add to whatever package you're working on if you'd like to interface with tuples this way, but I'm not sure of the real utility in Base.

In  [2]: t = (:a=>1,:b=>2)

Out [2]: (:a=>1,:b=>2)

In  [3]: t[:a]

Out [3]: 1
tpapp commented 9 years ago

@iamed2: I agree that mature code should use composite types, but for quick prototyping I find them inflexible.

@quinnj: Neat, thanks. The rationale for including this in Base, conditional on this being the right idiom, would be to standardize style. But as a Julia newbie I don't have enough experience to say if alists/named tuples are really needed, or if I am simply trying to Write Lisp in Any Language.

nalimilan commented 9 years ago

This could be considered as part of a larger body of features offered by https://github.com/davidavdav/NamedArrays (or whatever they end up being called): arrays and tuples with names. I think this is a pretty basic feature, but it does not necessarily need to live in Base (maybe in a future set of basic packages).

vchuravy commented 9 years ago

Also tuples of Pairs are faster than dict for small n (< 20) at least on 0.3.1

Benchmark: https://gist.github.com/vchuravy/890a162ae3201cebf19d

simonster commented 9 years ago

Here's another approach that requires keys to be known at compile time, but can avoid runtime key lookup and gives type information even for alists with heterogeneous values. Unfortunately for optimal performance this approach requires use of the @aidx macro, at least until we can rerun type inference after inlining.

JeffBezanson commented 9 years ago

see also #4916

stevengj commented 9 years ago

As @JeffBezanson mentions, we've discussed a related need for a very lightweight Associative{Symbol,V} type that just uses linear search instead of a hash table. This could be used for keyword arguments (replacing the Array of pairs that is currently passed), and would also be useful for MIME parameters as mentioned in #7959.

jakebolewski commented 9 years ago

If the keys are known at compile time, wouldn't something like perfect hashing be a better alternative?

simonster commented 9 years ago

If the keys for both construction and lookup are known at compile time, you can just store the values in an array or tuple and map keys to indices at compile time (which is what my gist does). Minimal perfect hashing only seems useful if you have compile-time knowledge of the keys for construction but not lookup. Both approaches have the potential disadvantage of needing code specialized on different sets of keys.

PythonNut commented 9 years ago

How would this be different than, say collections.OrderedDict in Python? I take it the proposal does not allow random access?

I really like the way the Python OrderedDict does comparisons (ordered when compared to other ordered Dicts, but unordered when compared to a regular Dict).

kmsquire commented 9 years ago

@PythonNut, for dictionaries with only a few elements (say, less than 20), something like the ideas presented here should be more efficient than creating a hash table.

DataStructures.jl has an OrderedDict type. Unless I missed something, It doesn't do comparisons the way you're describing, but a pull request would interesting if you're up to it.

tpapp commented 9 years ago

Now that @JeffBezanson mentioned #4916, the parallels with LISP are even more apparent. In CL it is a common idiom to return a plist, which can be used with cl:apply for functions that take a &key argument. The applications of the idiom are optimized very aggressively in most Lisp implementations.

My experience with language/compiler design is nonexistent, but I think that Julia may need something like that. Eg there could be an abstract type KeyValuePairs, of which Dict is a concrete subtype, and so is something else that we could call Alist for now (even though the name makes little sense in Julia, maybe AssocTuple?). Dict has fast access, fast insertion/deletion, but it has a large overhead (comparatively), so these advantages kick in after 10-20 key/value pairs. Alist would have slow insertion/deletion, and possibly O(N) lookup, but as N is small this would not matter since the constant cost would be low. The compiler could optimize this very aggressively, along the lines suggested by @simonster.

Both would be interchangeable in terms of interface, except for performance, depending on the size. Alist would be the recommended structure for splatting, as in #4916. Furthermore, especially if this helps with performance, keys of Alist could be restricted to be symbols, or be immutable.

stevengj commented 9 years ago

@tpapp, Julia already has an Associative abstract type of which Dict is a concrete subtype. As I mentioned above, what is being discussed is precisely a new Associative subtype that is optimized for small tables.

ararslan commented 6 years ago

This seems to have been addressed by the addition of NamedTuples. These preserve order, and key-value pairs can be obtained from them using pairs(::NamedTuple), which returns an AbstractDict. I'm going to close this as it seems no longer relevant, but feel free to reopen if there's something here I've overlooked.