Add hashmap, hashset, treemap, and treeset macros

rust-highfive commented 9 years ago

Issue by gsingh93 Saturday Jun 07, 2014 at 16:49 GMT

For earlier discussion, see https://github.com/rust-lang/rust/issues/14726

This issue was labelled with: A-syntaxext in the Rust repository

I wanted to create an issue first asking about this before submitting a pull request.

Can I go ahead and implement hashmap!(), hashset!(), treemap!(), and treeset!() macros for constructing those collections with the given arguments? The syntax would be:

let h = hashmap!("foo" => "bar");
let s = hashset!("foo");

I already have these macros implemented in my own projects, so I'd just have to add them to macros.rs.

If I can add these, is there a process for testing macros? Or would I just replace all occurrences of hash{map,set} and tree{map,set} creation in the tests by the macros?

diwic commented 9 years ago

For my own part, my preference is for : over =>. Colons are used in struct initialisations, arrows in pattern matching. And we're initializing something similar to a struct (somewhat over-simplified, a map is a struct with dynamic keys); we're not pattern matching it. So colons are a better fit with the rest of the language IMO.

Also, I would vote for just getting something ergonomic in for 1.0 and think about optimisations later. So given this syntax:

    hashmap!("foo": 5, "bar": 8)

Just something that expands to

    {
        let mut a = HashMap::new();
        a.insert("foo", 5);
        a.insert("bar", 8);
        a
    }

...would be fine.

m13253 commented 9 years ago

My preference is : above =>, and use {} instead of ().

For example:

hashmap! {"foo": 42, "bar": 64}

This is similar to JSON and Python, and has the same "theme" as vec!, which uses [].

The JSON similarity makes it easier to directly copy a JSON snippet into Rust code.

And : is also used in structure initialization. => is something used in match and emphasizes the "result" of a pattern.

Gankra commented 9 years ago

I believe { is only supposed to be used for macros that define top-level items like structs, functions, etc.

blaenk commented 9 years ago

Yeah we should really use :.

aspcartman commented 9 years ago

So is there any movement?

reem commented 9 years ago

@aspcartman Not really - a more general abstraction is needed for std, but you could easily land these in a cargo crate.

sbeckeriv commented 9 years ago

Very sorry. I am new to Rust. I wanted to create a HashSet with some default values. I found the vec->to_iter->collect pattern and thought it could be nicer. Then I found this thread.

Are macros favored over functions? I would love to see a function on structures like new that took a default object to convert from. Is there something like this?

http://is.gd/I0xsy5

Since you can't overload a method, maybe it can use trait bounds? And each struct in std can define its conversion from each other. If someone wants to have a new conversion, they define it for their object?
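
For readers following along, the vec -> iter -> collect pattern mentioned above, and a function-based alternative using trait bounds, might look like the sketch below. The helper name `set_of` is hypothetical and is not a std API; it relies only on `IntoIterator` and `FromIterator`, which std does provide.

```rust
use std::collections::HashSet;
use std::hash::Hash;

// Hypothetical helper (not part of std): builds a set from anything iterable.
fn set_of<T, I>(items: I) -> HashSet<T>
where
    T: Hash + Eq,
    I: IntoIterator<Item = T>,
{
    // `collect` uses HashSet's FromIterator implementation.
    items.into_iter().collect()
}

fn main() {
    // The pattern from the comment: build a Vec, iterate by value, collect.
    let a: HashSet<&str> = vec!["foo", "bar"].into_iter().collect();
    // The same construction through the helper function.
    let b = set_of(vec!["foo", "bar"]);
    assert_eq!(a, b);
    println!("{}", a.len());
}
```

Note that the helper still needs an intermediate collection (here a Vec) to convert from.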

bluss commented 9 years ago

Macros are never favoured as such; the rule is simply that if you can solve it with a function, you should. Macros may be the only way to provide this functionality with the features we want, for example avoiding any intermediate extra allocation.

sbeckeriv commented 9 years ago

Thanks @bluss . I now understand that using a function with trait bounds requires a second object to convert from where macros do not. I don't understand macros well enough. At some point they turn into real code. Can I view what that code is doing?

In this example http://rustbyexample.com/macros/repeat.html how does $y store all the rest of the arguments? Can I write a function that does the same thing? I haven't found an example or documentation for it.

thanks

bluss commented 9 years ago

Sure, you can use rustc -Z unstable-options --pretty expanded SOURCEFILE where SOURCEFILE is one of your files, it will expand uses of macros into code.

$y in that example is inside $( ... ),+ which means it's a repeated variable (separated by comma). Macros use this to support arbitrarily many arguments, something Rust functions don't.
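
A minimal sketch of the `$( ... ),+` repetition described above. The macro name `sum_all` is made up for illustration and does not come from the linked example.

```rust
// Hypothetical macro for illustration: sums one or more comma-separated expressions.
macro_rules! sum_all {
    // `$( $x:expr ),+` matches one or more expressions separated by commas;
    // `$x` is bound once per repetition.
    ( $( $x:expr ),+ ) => {
        // `$( + $x )+` emits the fragment `+ $x` once for every matched expression.
        0 $( + $x )+
    };
}

fn total() -> i32 {
    // Expands to `0 + 1 + 2 + 3`.
    sum_all!(1, 2, 3)
}

fn main() {
    println!("{}", total());
}
```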

bluss commented 9 years ago

I published a crate maplit on crates.io with simple type-specific hashmap!{ } and similar macros so that they are finally available.

The : separator is not available to regular macro_rules! macros after an expr, so maplit uses => as key-value separator.
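
To illustrate the restriction: macro_rules! only accepts `=>`, `,` or `;` directly after an `expr` fragment, which is why `$key:expr : $value:expr` is rejected. The following is a rough sketch of a type-specific macro with `=>` as the separator. It is written for this thread and should not be read as the source of any published crate.

```rust
use std::collections::HashMap;

// Illustrative sketch only; the name and shape are assumptions.
macro_rules! hashmap {
    // Zero or more `key => value` pairs, with an optional trailing comma.
    ( $( $key:expr => $value:expr ),* $(,)? ) => {{
        // A full path is used so the macro does not depend on the caller's imports.
        let mut map = ::std::collections::HashMap::new();
        // One `insert` call is emitted per pair.
        $( map.insert($key, $value); )*
        map
    }};
}

fn example() -> HashMap<&'static str, i32> {
    hashmap! { "foo" => 5, "bar" => 8 }
}

fn main() {
    println!("{:?}", example().get("foo"));
}
```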

Prior art in this business includes generic container literals in literator, generic container literals in grabbag_macros, and an older crate with generic container literals called construct.

The generic solutions all have some drawbacks, so I think it's nice to have the type-specific macros out there as well.

BlacklightShining commented 9 years ago

Personally, I would use brackets (since braces are apparently reserved for other things, as Gankro said). Swift uses brackets for array and mapping literals, and I'm sure it's not the only one.

Also, I wonder if this could be generalized to support all Vec-like and mapping types, rather than having separate macros for each one. Something like let hash_map: HashMap<_, _> = mapping![key1: value1, key2: value2, ...];…maybe even adding syntactic sugar so we don't need a macro?

nagisa commented 9 years ago

> Personally, I would use brackets

This is something the user, not implementor, decides. For example all these work and are equivalent:

    let a = vec!{1, 3, 5};
    let b = vec![1, 3, 5];
    let c = vec!(1, 3, 5);

ticki commented 9 years ago

@nagisa, right but there should be a convention.

JelteF commented 7 years ago

Is there any movement on this? Because it sounds really nice to have. I agree with using the : syntax.

sunjay commented 7 years ago

@jeltef I believe the recommendation is to use the maplit crate.

gsingh93 commented 7 years ago

It would be nice if this was in the standard library. Could we get maplit added to the Rust nursery?

timbess commented 6 years ago

@JelteF Unfortunately : syntax isn't possible. They were talking about it above.

+1 I believe this should be in the std library as well.

Arignir commented 6 years ago

Hi,

I believe this should be in the std library for a couple of reasons:

Homogeneity with vec!: It's weird to have only one macro to initialize a Vec and nothing for other containers.
Reducing boilerplate: it's always a few lines saved when creating a new project.
Security: Reducing the number of 'unofficial' crates (I consider crates in the nursery official) reduces the probability that someone, one day, adds malicious code to their crate and impacts every crate that depends on it. Of course, this problem is unfixable, but it can be mitigated by merging the most common crates into std or the nursery.

(You can see this story for more information. It's a fake, but the point still stands. https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5 )

stevenroose commented 6 years ago

If there are different types of maps, map!{} would still be possible as the type could be inferred with an (existing or new?) trait that gives ::new() and ::insert().

So that it becomes

let map: HashMap<_, _> = map!{"key" => "value"};

Centril commented 6 years ago

@stevenroose You can always try it out with different operations on the playground... One less great aspect of doing it that way is that you are sort-of hoping that the right operations exist on the map type... If we had some way of being polymorphic over maps in the type system (we need to be able to quantify over traits, i.e. fn foo<trait Bar>(...) and then have associated traits...), then it would work better I think.

burdges commented 6 years ago

I'd suggest map!{"key1" => "value1", ...} being sugar for [("key1", "value1"), ...].iter().collect() and set!{"key1", ...} being sugar for ["key1", ...].iter().collect() because those are basically optimal for any probabilistic data structure.. or else simply show that .iter().collect() is idiomatic under Iterator::collect() and HashMap. I think treemap!{...} etc should be reserved for if you can build the tree at compile time via dependent types or whatever.

JohnBSmith commented 5 years ago

What do you think of the following approach? It seems quite pleasant to me.

Recall that Into<T> for T is implemented, i.e. Into<T> is implemented for all types T.

#![allow(unused_macros)]
#![allow(unused_imports)]

use std::collections::HashMap;
use std::collections::BTreeMap;

trait MapLiteral<K,V> {
    fn new() -> Self;
    fn with_capacity(n: usize) -> Self;
    fn insert(m: &mut Self, k: K, v: V);
}

impl<K,V> MapLiteral<K,V> for HashMap<K,V>
where K: std::cmp::Eq+std::hash::Hash
{
    fn new() -> Self {HashMap::new()}
    fn with_capacity(n: usize) -> Self {HashMap::with_capacity(n)}
    fn insert(m: &mut Self, k: K, v: V){m.insert(k,v);}
}

impl<K,V> MapLiteral<K,V> for BTreeMap<K,V>
where K: std::cmp::Ord
{
    fn new() -> Self {BTreeMap::new()}
    fn with_capacity(_n: usize) -> Self {BTreeMap::new()}
    fn insert(m: &mut Self, k: K, v: V){m.insert(k,v);}
}

// replace ,* by ,* $(,)?
macro_rules! map {
    ($( $key:tt: $value:expr ),*) => {{
        let mut _temp_map = MapLiteral::new();
        $(MapLiteral::insert(&mut _temp_map,$key.into(),$value.into());)*
        _temp_map
    }};
    ({$init:expr} {$( $key:tt: $value:expr ),*}) => {{
        let mut _temp_map = $init;
        $(MapLiteral::insert(&mut _temp_map, $key.into(),$value.into());)*
        _temp_map
    }};
    ({$init:expr; $tk:expr, $tv:expr} {$( $key:tt: $value:expr ),*}) => {{
        let mut _temp_map = $init;
        $(MapLiteral::insert(&mut _temp_map,$tk($key),$tv($value));)*
        _temp_map
    }};
    ({$tk:expr, $tv:expr} {$( $key:tt: $value:expr ),*}) => {{
        let mut _temp_map = MapLiteral::new();
        $(MapLiteral::insert(&mut _temp_map,$tk($key),$tv($value));)*
        _temp_map
    }};
}

Let's try it out.

fn main() {
    let m: HashMap<i32,i32> = map!{};
    println!("{:?}",m);

    let m: HashMap<i32,i32> = map!{1: 2, 2: 1};
    println!("{:?}",m);

    let m: HashMap<Box<str>,i32> = map!{"x": 1, "y": 2};
    println!("{:?}",m);

    let m: HashMap<&str,i32> = map!{"x": 1, "y": 2};
    println!("{:?}",m);

    let m: HashMap<String,i32> = map!{"x": 1, "y": 2};
    println!("{:?}",m);

    let m: HashMap<Box<str>,i32> = map!{"x": 1, "y": 2};
    println!("{:?}",m);

    let m: HashMap<i32,Option<i32>> = map!{1: 2, 2: None};
    println!("{:?}",m);

    let m: HashMap<(i32,i32),i32> = map!{(1,1): 1, (1,2): 2};
    println!("{:?}",m);

    let m: HashMap<[i32;2],i32> = map!{[1,1]: 1, [1,2]: 2};
    println!("{:?}",m);

    let m: HashMap<Vec<i32>,i32> = map!{
        (vec![1,2]): 1,
        (vec![1,2,3]): 2
    };
    println!("{:?}",m);

    let m: BTreeMap<String,HashMap<i32,i32>> = map!{
        "x": HashMap::from(map!{1: 1, 2: 2}),
        "y": HashMap::from(map!{1: 2, 2: 1})
    };
    println!("{:?}",m);

    let m = map!{{HashMap::<i32,i32>::with_capacity(100)}{
        1: 2, 2: 1
    }};
    println!("{:?}",m);

    let m: HashMap<Box<String>,i32> = map!{{
        |x| Box::new(String::from(x)), |x| x
    }{
        "x": 1, "y": 2
    }};
    println!("{:?}",m);

    let m: HashMap<String,Box<[i32]>> = map!{{
        String::from, |x| Box::new(x) as Box<[i32]>
    }{
        "x": [1,1], "y": [1,2]
    }};
    println!("{:?}",m);

    let m: HashMap<String,Box<dyn std::any::Any>> = map!{{
        String::from, |x| Box::new(x) as Box<dyn std::any::Any>
    }{
        "x": 1, "y": "a", "z": [1,2]
    }};
    println!("{:?}",m);
}

burdges commented 5 years ago

How is that any better than map!{ a => b, ... } being sugar for [(a,b), ...].iter().collect()? In fact, I'd expect the MapLiteral design always creates unrolled loops, which bloats the code for zero performance improvement.

RustyYato commented 5 years ago

Well, the iter solution must clone the elements because it yields references, where unrolling the loop doesn't have to.
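
The difference can be seen in a small sketch. The function names `by_reference` and `by_value` are hypothetical and only serve the illustration; a Vec stands in for the array because `Vec::into_iter` has always iterated by value.

```rust
use std::collections::HashMap;

// Borrowing iteration: `iter()` yields `&(K, V)`, so each pair must be cloned
// before it can be stored in a map that owns its keys and values.
fn by_reference(pairs: &[(String, i32)]) -> HashMap<String, i32> {
    pairs.iter().cloned().collect()
}

// Consuming iteration: `into_iter()` on a Vec yields `(K, V)` by value, so
// the pairs are moved into the map without any clone.
fn by_value(pairs: Vec<(String, i32)>) -> HashMap<String, i32> {
    pairs.into_iter().collect()
}

fn main() {
    let pairs = vec![("a".to_string(), 1), ("b".to_string(), 2)];
    let cloned = by_reference(&pairs);
    let moved = by_value(pairs);
    assert_eq!(cloned, moved);
    println!("{}", moved.len());
}
```

The first variant only compiles when the element type implements Clone, while a sequence of insert calls, like the second variant, moves each value exactly once.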

burdges commented 5 years ago

I see. In that case, maybe map! should avoid the clone because doing that lacks any convenient solution, but.. Anyone who cares about performance should still write out [(a,b), ...].iter().cloned().collect() manually. Rust's std should probably favor the performant solution though, which suggests map! should be provided by an external crate, not std itself.

RustyYato commented 5 years ago

But clones could be more expensive to do and not all types support clones, so that doesn't work. We could use ArrayVec, but I don't see how that's more efficient than unrolling the loop.

proninyaroslav commented 4 years ago

I think that macros for hash collections should definitely be in std. insert() reminds me of the times of Java when you had no way to initialize a map without resorting to the put() method; Kotlin finally added the hashMapOf() and mapOf() functions, cheers! [(a, b), ...].iter().cloned().collect() seems to me an awkward and not very elegant trick. Similar tricks are used in languages that have neither syntactic sugar for maps nor macros. This trick should not be the only option in a modern language with macros, like Rust. It makes me sad.

lightclient commented 4 years ago

What exactly is blocking progress on this? Are people still conflicted as to whether this should be in the standard library? I think consistency with vec! is a good aspiration and we should determine a performant way of achieving that.

varkor commented 4 years ago

The only thing blocking this issue is someone creating an RFC. As far as I'm concerned, maplit already has appropriate definitions and could be pulled into the standard library, although there are some outstanding issues that might be worth resolving first.

kennytm commented 4 years ago

Since it's a simple library addition, one could just file a rust-lang/rust PR directly.

burdges commented 4 years ago

We've discussed doing this with either collect or insert ala maplit, which permit mutability and seed from RandomState, but incur some performance hit over compile time construction. It's worth mentioning however that one could do "true literal" variants that build the data structure at compile time:

In principle, btreemap! could return a &'static BTreeMap in which the internal allocations point into the data segment, but you cannot free them due to the &'static.

If otoh you attempt this with HashMap then HashMap: Clone clones the RandomState instance, which enables DDoS attacks. We've discussed that HashMap: Clone represents a larger DDoS risk before, not just in this context. It's clear HashMap needs some fast_clone method that ignores DDoS resistance. As .iter().cloned().collect() provides the secure variant already, I think the Clone trait remains a valid choice for this fast_clone method, but only if we add some lint against using HashMap: Clone. This is not the only place where one needs a trait impl but also needs to lint against using it. Anyways, I suppose pub struct ConstHashMap<K,V>(HashMap<K,V,CompileTimeRandomState>); could provide similar inherent methods, but implement clone differently.
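
The claim that cloning a HashMap also copies its RandomState can be checked with a short sketch. It assumes a toolchain where `BuildHasher::hash_one` is stable (Rust 1.71 or later); the helper `same_hash` is hypothetical.

```rust
use std::collections::HashMap;
use std::hash::BuildHasher;

// Hypothetical helper: true if both maps hash `value` to the same u64,
// which is what happens when their hashers were built from the same keys.
fn same_hash(a: &HashMap<u32, u32>, b: &HashMap<u32, u32>, value: u32) -> bool {
    a.hasher().hash_one(value) == b.hasher().hash_one(value)
}

fn main() {
    let mut original: HashMap<u32, u32> = HashMap::new();
    original.insert(1, 10);

    // `clone` copies the RandomState along with the entries.
    let copy = original.clone();
    // Collecting builds a new map with a freshly constructed RandomState.
    let rebuilt: HashMap<u32, u32> = original.iter().map(|(k, v)| (*k, *v)).collect();

    println!("clone shares hasher keys: {}", same_hash(&original, &copy, 42));
    println!("rebuilt has equal contents: {}", rebuilt == original);
}
```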

The above approaches are not really optimal because they do not require allocation, but depend upon alloc for the HashMap or BTreeMap types. Instead I think an optimal solution resembles SmallVec, meaning:

SmallBTreeMap is mostly BTreeMap but does allocations like SmallVec and supports const operation.
SmallHashMap is mostly HashMap but does allocations like SmallVec and supports const operation, provided you use some CompileTimeRandomState for a const BuildHasher.

Ain't clear core needs Small* types of course, but seemed worth mentioning.

varkor commented 4 years ago

> Since it's a simple library addition, one could just file a rust-lang/rust PR directly.

I think there may be enough design decisions open for discussion to warrant an RFC, but one could always submit a pull request and see what the libs team thinks.

LeSeulArtichaut commented 4 years ago

If you need someone to write an RFC, I'm up! As those macros are mostly for convenience and have been awaited for quite a long time, I propose that we simply implement four macros, hashmap!, hashset!, treemap!, and treeset!, which expand into what we would write by hand today (manually creating and filling the structure), as opposed to a generic solution like the seq! macro proposed in #207. If the solution above meets the expectations of this issue, I can start writing the RFC. Please tell me if I should @varkor

shepmaster commented 4 years ago

It's possible to write a version that's applicable to multiple types of collections using the by-value iterator for arrays:

macro_rules! collection {
    // map-like
    ($($k:expr => $v:expr),* $(,)?) => {{
        use std::iter::{Iterator, IntoIterator};
        Iterator::collect(IntoIterator::into_iter([$(($k, $v),)*]))
    }};
    // set-like
    ($($v:expr),* $(,)?) => {{
        use std::iter::{Iterator, IntoIterator};
        Iterator::collect(IntoIterator::into_iter([$($v,)*]))
    }};
}

use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

fn main() {
    let s: Vec<_> = collection![1, 2, 3];
    println!("{:?}", s);
    let s: BTreeSet<_> = collection! { 1, 2, 3 };
    println!("{:?}", s);
    let s: HashSet<_> = collection! { 1, 2, 3 };
    println!("{:?}", s);

    let s: BTreeMap<_, _> = collection! { 1 => 2, 3 => 4 };
    println!("{:?}", s);
    let s: HashMap<_, _> = collection! { 1 => 2, 3 => 4 };
    println!("{:?}", s);
}
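
For comparison, and assuming a toolchain of Rust 1.56 or later, the standard collections also implement `From` for fixed-size arrays, so a similar by-value construction can be written without any macro. The function names below are hypothetical and only wrap the std calls.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

// Builds a map from an array of (key, value) pairs, moving each pair in.
fn sample_map() -> HashMap<&'static str, i32> {
    HashMap::from([("foo", 5), ("bar", 8)])
}

// A BTreeMap built the same way iterates its keys in sorted order.
fn sample_sorted_keys() -> Vec<i32> {
    let m = BTreeMap::from([(3, 4), (1, 2)]);
    m.keys().copied().collect()
}

fn main() {
    let hs = HashSet::from([1, 2, 3]);
    let bs = BTreeSet::from([3, 1, 2]);
    println!("{:?} {} {:?}", sample_map().get("foo"), hs.len(), bs);
    println!("{:?}", sample_sorted_keys());
}
```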

Previous version

This uses the (currently unstable) by-value iterator for arrays and `FromIterator`. I believe, but have not tested, that this is also relatively efficient:

```rust
#![feature(array_value_iter)]

macro_rules! seq {
    // Sequences
    ($($v:expr,)*) => {
        std::array::IntoIter::new([$($v,)*]).collect()
    };
    ($($v:expr),*) => {
        std::array::IntoIter::new([$($v,)*]).collect()
    };
    // Maps
    ($($k:expr => $v:expr,)*) => {
        std::array::IntoIter::new([$(($k, $v),)*]).collect()
    };
    ($($k:expr => $v:expr),*) => {
        std::array::IntoIter::new([$(($k, $v),)*]).collect()
    };
}

use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};

fn main() {
    let s: Vec<_> = seq!(1, 2, 3);
    println!("{:?}", s);
    let s: BTreeSet<_> = seq!(1, 2, 3);
    println!("{:?}", s);
    let s: HashSet<_> = seq!(1, 2, 3);
    println!("{:?}", s);
    let s: BTreeMap<_, _> = seq!(1 => 2, 3 => 4);
    println!("{:?}", s);
    let s: HashMap<_, _> = seq!(1 => 2, 3 => 4);
    println!("{:?}", s);
}
```

https://play.integer32.com/?version=nightly&mode=debug&edition=2018&gist=aab805bd7b61918dd68c8be2bed76764

rust-lang / rfcs

Add hashmap, hashset, treemap, and treeset macros #542