Open rust-highfive opened 9 years ago
For my own part, my preference is for : above =>.
- colons are used in struct initialisations, arrows in pattern matching. And we're initializing something similar to a struct (somewhat over-simplified, a map is a struct with dynamic keys), we're not pattern matching it. So colons are a better fit with the rest of the language IMO.
Also, I would vote for just getting something ergonomic in for 1.0 and thinking about optimisations later. So given this syntax:
hashmap!("foo": 5, "bar": 8)
Just something that expands to
{
    let mut a = HashMap::new();
    a.insert("foo", 5);
    a.insert("bar", 8);
    a
}
...would be fine.
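A minimal sketch of what such an expansion could look like with macro_rules!, assuming every key is a single token tree (a literal or a parenthesised expression) so that : is allowed to follow it. The macro body is illustrative only, not a proposed std implementation.

```rust
// Hypothetical sketch: expands to `HashMap::new()` plus one `insert` per pair.
macro_rules! hashmap {
    ($($key:tt : $value:expr),* $(,)?) => {{
        // `mut` is unused when the macro is called with no pairs.
        #[allow(unused_mut)]
        let mut map = ::std::collections::HashMap::new();
        $( map.insert($key, $value); )*
        map
    }};
}

fn main() {
    let m = hashmap!("foo": 5, "bar": 8);
    assert_eq!(m.get("foo"), Some(&5));
    assert_eq!(m.get("bar"), Some(&8));
    assert_eq!(m.len(), 2);
}
```

Keys that are not a single token, such as vec![1, 2], would have to be wrapped in parentheses under this assumption.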
My preference is : above =>, and use {} instead of ().
For example:
hashmap! {"foo": 42, "bar": 64}
This is similar to JSON and Python, and has the same "theme" as vec!, which uses [].
The JSON similarity makes it easier to directly copy a JSON snippet into Rust code.
And : is also used in structure initialization. => is something used in match and emphasizes the "result" of a pattern.
I believe { is only supposed to be used for macros that define top-level items like structs, functions, etc.
Yeah we should really use :.
So is there any movement?
@aspcartman Not really - a more general abstraction is needed for std, but you could easily land these in a cargo crate.
Very sorry. I am new to Rust. I wanted to create a HashSet with some default values. I found the vec->to_iter->collect pattern and thought it could be nicer. Then I found this thread.
Are macros favored over functions? I would love to see a function on structures like new that took a default object to convert from. Is there something like this?
Since you can't overload a method, maybe it can use trait bounds? And each struct in std can define its conversion for each other. If someone wants to have a new conversion they define it for their object?
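With a current standard library (arrays implement IntoIterator by value since Rust 1.53), the trait-bound idea can be sketched as a plain generic function. collection_from is a hypothetical name chosen for illustration, not an existing std function.

```rust
use std::collections::{BTreeSet, HashSet};
use std::iter::FromIterator;

// Hypothetical helper: builds any collection that implements
// `FromIterator<T>` from a fixed-size array of values.
fn collection_from<C, T, const N: usize>(items: [T; N]) -> C
where
    C: FromIterator<T>,
{
    // `IntoIterator::into_iter` on an array yields its elements by value.
    IntoIterator::into_iter(items).collect()
}

fn main() {
    // The target collection is picked by the type annotation.
    let set: HashSet<&str> = collection_from(["a", "b", "a"]);
    assert_eq!(set.len(), 2);

    let sorted: BTreeSet<i32> = collection_from([3, 1, 2]);
    assert_eq!(sorted.into_iter().collect::<Vec<_>>(), vec![1, 2, 3]);
}
```

The array plays the role of the "default object to convert from", and the trait bound lets each collection type define its own conversion.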
Macros are never favoured as such; the rule is simply that if you can solve it with a function, you should. Macros may be the only way to provide this functionality with the features we want, for example avoiding any intermediate extra allocation.
Thanks @bluss. I now understand that using a function with trait bounds requires a second object to convert from where macros do not. I don't understand macros well enough. At some point they turn into real code. Can I view what that code is doing?
In this example http://rustbyexample.com/macros/repeat.html how does $y store all the rest of the arguments? Can I write a function that does the same thing? I haven't found an example or documentation for it.
thanks
Sure, you can use rustc -Z unstable-options --pretty expanded SOURCEFILE where SOURCEFILE is one of your files; it will expand uses of macros into code.
$y in that example is inside $( ... ),+ which means it's a repeated variable (separated by commas). Macros use this to support arbitrarily many arguments, something Rust functions don't.
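As a small illustration of the $( ... ),+ repetition, here is a sketch of a variadic macro. The name sum! is made up for this example and is not taken from the linked page.

```rust
// Hypothetical `sum!` macro: `$y` is bound once per comma-separated argument.
macro_rules! sum {
    ($($y:expr),+) => {
        // Expands to `0 + a + b + ...`, with one `+ $y` per captured argument.
        0 $( + $y )+
    };
}

fn main() {
    assert_eq!(sum!(1), 1);
    assert_eq!(sum!(1, 2, 3), 6);
}
```

A plain function cannot take a variable number of arguments like this; the closest equivalent would accept a slice or an iterator instead.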
I published a crate maplit on crates.io with simple type-specific hashmap!{ } and similar macros so that they are finally available.
The : separator is not available to regular macro_rules! macros after an expr, so maplit uses => as key-value separator.
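To illustrate the restriction, here is a sketch in the style described above. It is not maplit's actual source; it only shows that => may follow an expr fragment, so keys can be arbitrary expressions.

```rust
// Illustrative only: a `=>`-separated map literal written with macro_rules!.
macro_rules! hashmap {
    ($($key:expr => $value:expr),* $(,)?) => {{
        // `mut` is unused when the macro is called with no pairs.
        #[allow(unused_mut)]
        let mut map = ::std::collections::HashMap::new();
        $( map.insert($key, $value); )*
        map
    }};
    // A rule written as `$key:expr : $value:expr` would be rejected by the
    // compiler, because `:` is not in the set of tokens allowed after `expr`.
}

fn main() {
    // Keys are full expressions here, not just literals.
    let m = hashmap! { 1 + 1 => "two", 3 => "three" };
    assert_eq!(m[&2], "two");
    assert_eq!(m[&3], "three");
}
```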
Prior art in this business includes generic container literals in literator, generic container literals in grabbag_macros, and an older crate with generic container literals called construct.
The generic solutions all have some drawbacks, so I think it's nice to have the type-specific macros out there as well.
Personally, I would use brackets (since braces are apparently reserved for other things, as Gankro said). Swift uses brackets for array and mapping literals, and I'm sure it's not the only one.
Also, I wonder if this could be generalized to support all Vec-like and mapping types, rather than having separate macros for each one. Something like let hash_map: HashMap<_> = mapping![key1: value1, key2: value2, ...];
…maybe even adding syntactic sugar so we don't need a macro?
Personally, I would use brackets
This is something the user, not implementor, decides. For example all these work and are equivalent:
let a = vec!{1, 3, 5};
let b = vec![1, 3, 5];
let c = vec!(1, 3, 5);
@nagisa, right but there should be a convention.
Is there any movement on this? Because it sounds really nice to have. I agree with using the : syntax.
It would be nice if this was in the standard library. Could we get maplit added to the Rust nursery?
@JelteF Unfortunately : syntax isn't possible. They were talking about it above.
+1 I believe this should be in the std library as well.
Hi,
I believe this should be in the std library for a couple of reasons:
- vec!: It's weird to have only one macro to initialize a Vec and nothing for other containers. (You can see this story for more information. It's a fake, but the point still stands. https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5 )
If there are different types of maps, map!{} would still be possible as the type could be inferred with an (existing or new?) trait that gives ::new() and ::insert().
So that it becomes
let map: HashMap<_, _> = map!{"key" => "value"};
@stevenroose You can always try it out with different operations on the playground... One less great aspect of doing it that way is that you are sort-of hoping that the right operations exist on the map type... If we had some way of being polymorphic over maps in the type system (we need to be able to quantify over traits, i.e. fn foo<trait Bar>(...) and then have associated traits...), then it would work better I think.
I'd suggest map!{"key1" => "value1", ...} being sugar for [("key1", "value1"), ...].iter().collect() and set!{"key1", ...} being sugar for ["key1", ...].iter().collect() because those are basically optimal for any probabilistic data structure.. or else simply show that .iter().collect() is idiomatic under Iterator::collect() and HashMap. I think treemap!{...} etc should be reserved for if you can build the tree at compile time via dependent types or whatever.
What do you think of the following approach? It seems quite pleasant to me.
Recall that Into<T> for T is implemented, i.e. Into<T> is implemented for all types T.
#![allow(unused_macros)]
#![allow(unused_imports)]
use std::collections::HashMap;
use std::collections::BTreeMap;
trait MapLiteral<K,V> {
fn new() -> Self;
fn with_capacity(n: usize) -> Self;
fn insert(m: &mut Self, k: K, v: V);
}
impl<K,V> MapLiteral<K,V> for HashMap<K,V>
where K: std::cmp::Eq+std::hash::Hash
{
fn new() -> Self {HashMap::new()}
fn with_capacity(n: usize) -> Self {HashMap::with_capacity(n)}
fn insert(m: &mut Self, k: K, v: V){m.insert(k,v);}
}
impl<K,V> MapLiteral<K,V> for BTreeMap<K,V>
where K: std::cmp::Ord
{
fn new() -> Self {BTreeMap::new()}
fn with_capacity(_n: usize) -> Self {BTreeMap::new()}
fn insert(m: &mut Self, k: K, v: V){m.insert(k,v);}
}
// replace ,* by ,* $(,)?
macro_rules! map {
($( $key:tt: $value:expr ),*) => {{
let mut _temp_map = MapLiteral::new();
$(MapLiteral::insert(&mut _temp_map,$key.into(),$value.into());)*
_temp_map
}};
({$init:expr} {$( $key:tt: $value:expr ),*}) => {{
let mut _temp_map = $init;
$(MapLiteral::insert(&mut _temp_map, $key.into(),$value.into());)*
_temp_map
}};
({$init:expr; $tk:expr, $tv:expr} {$( $key:tt: $value:expr ),*}) => {{
let mut _temp_map = $init;
$(MapLiteral::insert(&mut _temp_map,$tk($key),$tv($value));)*
_temp_map
}};
({$tk:expr, $tv:expr} {$( $key:tt: $value:expr ),*}) => {{
let mut _temp_map = MapLiteral::new();
$(MapLiteral::insert(&mut _temp_map,$tk($key),$tv($value));)*
_temp_map
}};
}
Let's try it out.
fn main() {
let m: HashMap<i32,i32> = map!{};
println!("{:?}",m);
let m: HashMap<i32,i32> = map!{1: 2, 2: 1};
println!("{:?}",m);
let m: HashMap<Box<str>,i32> = map!{"x": 1, "y": 2};
println!("{:?}",m);
let m: HashMap<&str,i32> = map!{"x": 1, "y": 2};
println!("{:?}",m);
let m: HashMap<String,i32> = map!{"x": 1, "y": 2};
println!("{:?}",m);
let m: HashMap<Box<str>,i32> = map!{"x": 1, "y": 2};
println!("{:?}",m);
let m: HashMap<i32,Option<i32>> = map!{1: 2, 2: None};
println!("{:?}",m);
let m: HashMap<(i32,i32),i32> = map!{(1,1): 1, (1,2): 2};
println!("{:?}",m);
let m: HashMap<[i32;2],i32> = map!{[1,1]: 1, [1,2]: 2};
println!("{:?}",m);
let m: HashMap<Vec<i32>,i32> = map!{
(vec![1,2]): 1,
(vec![1,2,3]): 2
};
println!("{:?}",m);
let m: BTreeMap<String,HashMap<i32,i32>> = map!{
"x": HashMap::from(map!{1: 1, 2: 2}),
"y": HashMap::from(map!{1: 2, 2: 1})
};
println!("{:?}",m);
let m = map!{{HashMap::<i32,i32>::with_capacity(100)}{
1: 2, 2: 1
}};
println!("{:?}",m);
let m: HashMap<Box<String>,i32> = map!{{
|x| Box::new(String::from(x)), |x| x
}{
"x": 1, "y": 2
}};
println!("{:?}",m);
let m: HashMap<String,Box<[i32]>> = map!{{
String::from, |x| Box::new(x) as Box<[i32]>
}{
"x": [1,1], "y": [1,2]
}};
println!("{:?}",m);
let m: HashMap<String,Box<dyn std::any::Any>> = map!{{
String::from, |x| Box::new(x) as Box<dyn std::any::Any>
}{
"x": 1, "y": "a", "z": [1,2]
}};
println!("{:?}",m);
}
How is that any better than map!{ a => b, ... } being sugar for [(a,b), ...].iter().collect()? In fact, I'd expect the MapLiteral design always creates unrolled loops, which bloats the code for zero performance improvement.
Well, the iter solution must clone the elements because it yields references, where unrolling the loop doesn't have to.
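The difference can be sketched with two small helper functions. The names from_refs and from_values are hypothetical, and String keys are assumed so that the clone is visible in the types.

```rust
use std::collections::HashMap;

// Borrowing: `.iter()` yields `&(String, i32)`, so building an owning map
// requires cloning every element.
fn from_refs(pairs: &[(String, i32)]) -> HashMap<String, i32> {
    pairs.iter().cloned().collect()
}

// By value: elements are moved into the map, which is what an unrolled
// sequence of `insert` calls would do.
fn from_values(pairs: Vec<(String, i32)>) -> HashMap<String, i32> {
    let mut map = HashMap::new();
    for (k, v) in pairs {
        map.insert(k, v);
    }
    map
}

fn main() {
    let pairs = vec![(String::from("a"), 1), (String::from("b"), 2)];
    let cloned = from_refs(&pairs);
    let moved = from_values(pairs);
    assert_eq!(cloned, moved);
}
```

Both produce the same map; the first needs the element types to implement Clone, the second does not.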
I see. In that case, maybe map! should avoid the clone because doing that lacks any convenient solution, but.. Anyone who cares about performance should still write out [(a,b), ...].iter().cloned().collect() manually. Rust's std should probably favor the performant solution though, which suggests map! should be provided by an external crate, not std itself.
But clones could be more expensive to do and not all types support clones, so that doesn't work. We could use ArrayVec, but I don't see how that's more efficient than unrolling the loop.
I think that macros for hash collections should definitely be in std. insert() reminds me of the times of Java when you have no way to initialize a map without resorting to the put() method, and in Kotlin, finally they made hashMapOf() and mapOf() functions, cheers!
[(a, b), ...].iter().cloned().collect() seems to me a tricky and not very beautiful trick. Similar tricks are used in languages without syntactic sugar for map, and which don't have macros. This trick should not be the only option in a modern language with macros, like Rust. It makes me sad.
What exactly is blocking progress on this? Are people still conflicted as to whether this should be in the standard library? I think consistency with vec! is a good aspiration and we should determine a performant way of achieving that.
The only thing blocking this issue is someone creating an RFC. As far as I'm concerned, maplit already has appropriate definitions and could be pulled into the standard library, although there are some outstanding issues that might be worth resolving first.
Since it's a simple library addition, one could just file a rust-lang/rust PR directly.
We've discussed doing this with either collect or insert à la maplit, which permit mutability and seed from RandomState, but incur some performance hit over compile-time construction. It's worth mentioning however that one could do "true literal" variants that build the data structure at compile time:
- In principle, btreemap! could return a &'static BTreeMap in which the internal allocations point into the data segment, but you cannot free them due to the &'static.
- If otoh you attempt this with HashMap then HashMap: Clone clones the RandomState instance, which enables DDoS attacks. We've discussed that HashMap: Clone represents a larger DDoS risk before, not just in this context. It's clear HashMap needs some fast_clone method that ignores DDoS resistance. As .iter().cloned().collect() provides the secure variant already, I think the Clone trait remains a valid choice for this fast_clone method, but only if we add some lint against using HashMap: Clone. This is not the only place where one needs a trait impl but also needs to lint against using it. Anyways, I suppose pub struct ConstHashMap<K,V>(HashMap<K,V,CompileTimeRandomState>); could provide similar inherent methods, but implement clone differently.
The above approaches are not really optimal because they do not require allocation, but depend upon alloc for the HashMap or BTreeMap types. Instead I think an optimal solution resembles SmallVec, meaning:
- SmallBTreeMap is mostly BTreeMap but does allocations like SmallVec and supports const operation.
- SmallHashMap is mostly HashMap but does allocations like SmallVec and supports const operation, provided you use some CompileTimeRandomState for a const BuildHasher.
Ain't clear core needs Small* types of course, but seemed worth mentioning.
Since it's a simple library addition, one could just file a rust-lang/rust PR directly.
I think there may be enough design decisions open for discussion to warrant an RFC, but one could always submit a pull request and see what the libs team thinks.
If you need someone to write an RFC, I'm up!
As those macros are mostly for convenience and have been awaited for quite a long time, I propose that we simply implement four macros, hashmap!, hashset!, treemap!, and treeset!, which expand into what we would write in code today (manually creating and filling the structure), as opposed to a generic solution like the seq! macro proposed in #207.
If the solution above meets the expectations of this issue, I can start working on writing the RFC. Please tell me if I should @varkor
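A rough sketch of what two of these type-specific macros might expand to, assuming the simplest "create and fill" strategy. The bodies are illustrative and do not come from any RFC text.

```rust
// Hypothetical `hashset!`: creates a HashSet and inserts each value.
macro_rules! hashset {
    ($($value:expr),* $(,)?) => {{
        #[allow(unused_mut)]
        let mut set = ::std::collections::HashSet::new();
        $( set.insert($value); )*
        set
    }};
}

// Hypothetical `treeset!`: same shape, backed by a BTreeSet.
macro_rules! treeset {
    ($($value:expr),* $(,)?) => {{
        #[allow(unused_mut)]
        let mut set = ::std::collections::BTreeSet::new();
        $( set.insert($value); )*
        set
    }};
}

fn main() {
    // Duplicates collapse, as with manual `insert` calls.
    let h = hashset! {"a", "b", "a"};
    assert_eq!(h.len(), 2);

    // A BTreeSet iterates in sorted order.
    let t = treeset! {3, 1, 2};
    assert_eq!(t.into_iter().collect::<Vec<i32>>(), vec![1, 2, 3]);
}
```

The map variants would follow the same pattern with a key-value separator and a two-argument insert.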
It's possible to write a version that's applicable to multiple types of collections using the by-value iterator for arrays:
macro_rules! collection {
// map-like
($($k:expr => $v:expr),* $(,)?) => {{
use std::iter::{Iterator, IntoIterator};
Iterator::collect(IntoIterator::into_iter([$(($k, $v),)*]))
}};
// set-like
($($v:expr),* $(,)?) => {{
use std::iter::{Iterator, IntoIterator};
Iterator::collect(IntoIterator::into_iter([$($v,)*]))
}};
}
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};
fn main() {
let s: Vec<_> = collection![1, 2, 3];
println!("{:?}", s);
let s: BTreeSet<_> = collection! { 1, 2, 3 };
println!("{:?}", s);
let s: HashSet<_> = collection! { 1, 2, 3 };
println!("{:?}", s);
let s: BTreeMap<_, _> = collection! { 1 => 2, 3 => 4 };
println!("{:?}", s);
let s: HashMap<_, _> = collection! { 1 => 2, 3 => 4 };
println!("{:?}", s);
}
See also the macro-free version.
@shepmaster: this is very neat. I'd be in favour of a generic initialisation macro like this if it was demonstrated to be efficient.
@shepmaster This is a better, more general and extensible macro, which does not rely on a new trait being created. +1
That seems really cool, yeah. However, does it really make sense to think of BTreeMap and HashMap as sequences? I'd rather have two separate macros, seq! and map!.
Also like others wrote before, I think we should consider using : instead of =>, because => is only used for pattern matching otherwise. That's a departure from what maplit does, but more consistent with the language IMHO.
@jplatte unfortunately we can't use : if the keys need to be arbitrary :expr because of type ascription.
@kennytm what would the possible workarounds for that be? IMHO requiring parentheses for type ascription in that position would be the best solution but I understand that that's probably not that easy to implement, right?
: is heavily overloaded in Rust, which has already caused problems with features like type ascription; I think we should avoid creating new meaning for it where possible (even in macros).
=> does seem a little ad hoc at first, but the construction does act somewhat like matching:
let hash_map: HashMap<Key, Value> = map!("a" => 1, "b" => 2, "c" => 3);
// …is analogous to a function acting as a map…
let hash_map: impl Fn(Key) -> Value = |key| match key {
"a" => 1,
"b" => 2,
"c" => 3,
};
map!(..) can then just (be thought of as) a shorthand for the boilerplate |key| match key { .. }, which makes the syntax seem less offensive.
@jplatte If we implement map! as a proc macro (compiler built-in) it would be easy.
We could also do it like serde_json::json! by :tt-munching, but the recursion_limit would be like O(total length of keys).
macro_rules! map {
(@ A=[$($array:tt)*], K=[], R=[]) => {
std::array::IntoIter::new([$($array)*]).collect()
};
(@ A=[$($array:tt)*], K=[$($key:tt)+], R=[: $value:expr, $($rest:tt)*]) => {
map!(@ A=[$($array)* ($($key)+, $value),], K=[], R=[$($rest)*])
};
(@ A=[$($array:tt)*], K=[$($key:tt)+], R=[: $value:expr]) => {
map!(@ A=[$($array)* ($($key)+, $value),], K=[], R=[])
};
(@ A=[$($array:tt)*], K=[$($key:tt)*], R=[$k:tt $($rest:tt)*]) => {
map!(@ A=[$($array)*], K=[$($key)* $k], R=[$($rest)*])
};
(@ A=[$($array:tt)*], K=[$($key:tt)*], R=[]) => {
compile_error!(concat!("missing value for key ", stringify!($($key)*)))
};
($($x:tt)*) => {
map!(@ A=[], K=[], R=[$($x)*])
};
}
It was pointed out to me that there's a macro-free solution, now that we directly implement IntoIterator for arrays:
use std::collections::{BTreeMap, BTreeSet, HashMap, HashSet};
use std::iter::FromIterator;
fn main() {
// Rust 1.53
let s = Vec::from_iter([1, 2, 3]);
println!("{:?}", s);
let s = BTreeSet::from_iter([1, 2, 3]);
println!("{:?}", s);
let s = HashSet::<_>::from_iter([1, 2, 3]);
println!("{:?}", s);
let s = BTreeMap::from_iter([(1, 2), (3, 4)]);
println!("{:?}", s);
let s = HashMap::<_, _>::from_iter([(1, 2), (3, 4)]);
println!("{:?}", s);
}
See also the macro version.
Some competition on the external crate front: https://docs.rs/velcro
It uses : for the separator:
use velcro::hash_map;
let map = hash_map! {
    "foo": 1,
    "bar": 2,
};
And since owned String keys are often necessary, there's a version that does Into conversions:
use std::collections::HashMap;
use velcro::hash_map_from;
let map: HashMap<String, i32> = hash_map_from! {
    "foo": 1, // str literals but the map keys are String
    "bar": 2,
};
Oh, and the .. operator for impls of IntoIterator:
use velcro::hash_map;
let map = hash_map! {
..('0'..='9'): "digit",
..('a'..='z'): "lower",
..('A'..='Z'): "upper",
'.': "punctuation",
',': "punctuation",
};
Any solution involving an intermediate array, like the two that @shepmaster posted, runs the risk of the optimizer not optimizing the array out. And if the array remains, that also leads to values being read and memmoved from the array to their final place. This means wasted space for an array that is immediately consumed, as well as unnecessary byte copying, both of which matter for containers with lots of values or large values.
Eg https://rust.godbolt.org/z/9eoWEa - foo creates an unnecessary [u128; 20] at rsp[24] and then moves each element to the parameter register for each call to insert(). bar does not create the array and directly populates the value into the parameter register for each call to insert().
Of course anyone using [].iter().copied().collect() today already has this same problem. But a macro doesn't.
I used to really want this macro, but now that rust-lang/rust#84111 is merged, creating a HashMap, HashSet, BTreeMap, or BTreeSet is very clean and simple:
let map = HashMap::from([
("a", 1),
("b", 2),
("c", 3),
]);
let tree = BTreeMap::from([
("a", 1),
("b", 2),
("c", 3),
]);
let hash_set = HashSet::from([1, 2, 3, 4, 5, 6]);
let tree_set = BTreeSet::from([1, 2, 3, 4, 5, 6]);
I don't really see a macro as a huge improvement. Formatting is not as nice when the tuple elements are put on separate lines though:
let map = HashMap::from([
(
"a",
some_really_long_method_chain_that_is_actually_very_looooong(),
),
("b", foo()),
("c", bar()),
]);
let map: HashMap<_, _> = collection! {
"a" => some_really_long_method_chain_that_is_actually_very_looooong(),
"b" => foo(),
"c" => bar(),
};
And it doesn't look as nice with nested or complex maps:
let popular_tech = HashMap::from([
(
"languages",
HashMap::from([
("rust-lang/rust", 58_300_000),
("golang/go", 89_000_000),
("apple/swift", 57_100_000),
]),
),
(
"web-frameworks",
HashMap::from([
("actix/actix-web", 12_000_000),
("gin-gonic/gin", 51_000_000),
("vapor/vapor", 20_600_000),
]),
),
]);
let popular_tech = hashmap! {
"languages" => hashmap! {
"rust-lang/rust" => 58_300_000,
"golang/go" => 89_000_000,
"apple/swift" => 57_100_000,
},
"web-frameworks" => hashmap! {
"actix/actix-web" => 12_000_000,
"gin-gonic/gin" => 51_000_000,
"vapor/vapor" => 20_600_000,
},
};
But for the simple case, which is probably the most common, HashMap::from works very well. Maybe the solution for the previous example is to change rustfmt to keep the keys and the start of the value on the same line?
let popular_tech = HashMap::from([
("languages", HashMap::from([
("rust-lang/rust", 58_300_000),
("golang/go", 89_000_000),
("apple/swift", 57_100_000),
])),
("web-frameworks", HashMap::from([
("actix/actix-web", 12_000_000),
("gin-gonic/gin", 51_000_000),
("vapor/vapor", 20_600_000),
])),
]);
I do really like the visual mapping of key to value that the arrow syntax provides, but the above looks really nice as well.
Note that the stabilized HashMap::from is only implemented for the default RandomState, because of https://github.com/rust-lang/rust/pull/84111#issuecomment-821896963.
If you have a non-RandomState HashMap, you could still use the from_iter solution in https://github.com/rust-lang/rfcs/issues/542#issuecomment-576354291.
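A short sketch of that fallback, assuming a fixed-state hasher built from std's BuildHasherDefault and DefaultHasher as the non-RandomState example. The FixedState alias and fixed_map function are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::BuildHasherDefault;

// Hypothetical alias: a hasher state that is the same on every run,
// unlike the default RandomState.
type FixedState = BuildHasherDefault<DefaultHasher>;

// `HashMap::from([...])` is not available for this hasher type, but
// collecting works for any `S: BuildHasher + Default`.
fn fixed_map() -> HashMap<&'static str, i32, FixedState> {
    IntoIterator::into_iter([("a", 1), ("b", 2), ("c", 3)]).collect()
}

fn main() {
    let m = fixed_map();
    assert_eq!(m.get("b"), Some(&2));
    assert_eq!(m.len(), 3);
}
```

Giving up the random seed also gives up the hash-flooding resistance discussed earlier in the thread, so this is only a sketch of the mechanics.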
Issue by gsingh93 Saturday Jun 07, 2014 at 16:49 GMT
For earlier discussion, see https://github.com/rust-lang/rust/issues/14726
This issue was labelled with: A-syntaxext in the Rust repository
I wanted to create an issue first asking about this before submitting a pull request.
Can I go ahead and implement hashmap!(), hashset!(), treemap!(), and treeset!() macros for constructing those collections with the given arguments? The syntax would be:
I already have these macros implemented in my own projects, so I'd just have to add them to macros.rs.
If I can add these, is there a process for testing macros? Or would I just replace all occurrences of hash{map,set} and tree{map,set} creation in the tests by the macros?