Closed masak closed 5 years ago
Before we address this issue, we should do some thinking about how we want the syntax to look. The property shorthands like varname
and method() {}
are useful for object literals; less so for dictionary literals.
Worse, the object literal syntax builds on object properties having string keys (which they do, and still will), but for dictionaries we will need something that doesn't assume string keys.
This, in the end, is why Python doesn't have autoquoted keys, I suspect.
I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.)
This is easier than it might first seem. We don't have all that many hashable built-in types:
Val::Bool
Val::Int
Val::Str
Val::Type
Val::NoneType
)Hm, Val::Sub
, Val::Macro
, Val::Regex
and Val::Exception
are sort of on the fence. We could perhaps make them hashable, but it might not be in their best interest to be hashable, so let's conservatively make them not-hashable for now.
The biggest thing to watch out for is basically that 5
won't hash the same way as "5"
.
Just as a data point...
In Ruby, hash keys can be anything. However you might need to manually rehash if you change properties of the key:
a = [ "a", "b" ]
c = [ "c", "d" ]
h = { a => 100, c => 300 }
h[a] #=> 100
a[0] = "z"
h[a] #=> nil
h.rehash #=> {["z", "b"]=>100, ["c", "d"]=>300}
h[a] #=> 100
In Ruby, hash keys can be anything. However you might need to manually rehash
Interesting. Perl 6 could essentially also have a .rehash
method if it wanted, I guess.
Putting my language designer hat on, I think 007 has enough Python in it to want to disallow up front rather than go the route of re-hashing when things mutate. I grant that the question comes down mostly to taste.
I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.
Vide this tweet and this tweet.
Here's a summary of this issue, and an attempt to enumerate the forces that apply to the design. If you feel cognitive dissonance reading the below criteria, it's because they're mutually opposing in many cases.
{}
) and colons (:
) in both, because that's what people expect. And we either want to autoquote everywhere, or nowhere.varname
and method () {}
shorthands (a) only really make sense on objects, and (b) kind of half-rely on autoquoting because that's when the sugar comes out nice.Set
and Bag
. (Those can be implemented with just string-key hashes, too, but the complexity of hashing other things would bubble up in the implementation of those data structures. The same problem would need to be solved over and over again.)Looking at the above, the path forward looks quite straightforward after all:
Also, it would mean the proposed syntax is now:
my obj = new Q::Literal::Int {
value: 42
}; # exactly the same as before, including auto-quoted keys if you want
my dict = {
"value": 42
}; # the quotes are mandatory here
hash = obj.Dictionary(); # proposed conversion method syntax
I started out calling the above type Hash
, but the conversion method .Hash()
ends up looking very much like the hashing method .hash()
, which isn't good. So Python ends up bequeathing the name "dictionary"/"dict" to 007 as well. Can't say I care much for that name, but the one upside is that it'll be easier to motivate why 007 dictionaries are so similar to Python dictionaries.
The .update
and .extend
methods would move over to dictionaries. Well, .update
could keep existing on objects too, but I don't know if there's a strong use case for it. Maybe converting-to-dictionary, then updating, then converting back is in some sense clearer and a better separation of concerns.
What should looping over a dictionary do?
In Python, it loops over the keys:
$ python3 -c'
d = { "a":1, "b":2, "c": 3 }
for k in d:
print(k)'
c
a
b
In Perl 6, it loops over the pairs:
$ perl6 -e'my %h = a => 1, b => 2, c => 3; .say for %h'
a => 1
c => 3
b => 2
In Perl 5, it loops over a flattened list of keys and values:
$ perl -Mstrict -wE'my %h = (a => 1, b => 2, c => 3); say for %h'
b
2
c
3
a
1
Finally in Java, you can't loop over a Map
directly, but you instead have to specify whether to loop over keys, values, or entries (pairs).
Since the designs are all over the place, I'm left to my own judgement on this. I kinda like Perl 6's approach, because it gives you both key and value without judging. Together with some kind of destructuring, that's quite enough. But I hadn't planned on giving 007 pairs, at least not at this point. Tuples we may get, but also not quite yet. (Update: Got 'em.)
So I think I'll go with the Java approach, which has the advantage of being the most boring and conservative one. At least then it'll be easy to upgrade to something nicer later.
Converting from hash to object would also be possible, provided the hash only had string keys.
I suddenly realized that the main use case for this would be the .create
method; see #208. For example, runtime.007
wants to do this when implementing Q::Term::Object.eval
, because its task is exactly "I have this bunch of key/value mappings (a dictionary), make me an object from that".
I think dictionaries (like Val::Object
and like Python dictionaries, but unlike Perl 5/6 hashes) should throw an exception if lookup fails. But I'm open to having one of them get() methods (with the second default being non-optional in 007's case, because we don't have optional parameters by default).
I added the "needs-more-design" label to this, because when I picture the {}
and the new Q {}
syntax co-existing, both with braces but each with its own rules about key quoting and stuff, it feels... hard to use. Program authors would trip up on it. I would trip up on it.
So we probably need two slightly more different syntaxes to distinguish them. Needs more design.
In completely unrelated news, back in November and December I started a branch for this issue, and got more or less stuck — basically the kind of stuck brought on by overwhelming changes without a really clear direction.
I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.
I think this early requirement might have been a mistake. Even though using any hashable type for keys is nice, language-wise, the cost for it is pretty big.
Notably, the syntax
{ foo: 42 }
in a language that only supports strings as keys is "free" to assume that foo
is a string key here, like so:
{ "foo": 42 }
(Python, on the other hand, can't.)
So firstly, it's nice to be able to skip quotes on dictionary keys.
Secondly, the above syntax is strangely consistent with the short-hand:
{ foo }
For
{ foo: foo }
That is
{ "foo": foo }
There is some other syntactic bit here that's also bugging me about not being able to to assume that keys are strings... related to object creation. But I can't quite put it into words now.
Why do I want that? Because experience shows that hashes are central to many other things one might want to build, and allowing anything-hashable makes some code a lot nicer. Sets, bags, graphs come to mind. And I know more than one reason to want to key on tuples/enums, once we get them.
I mean, this is all still true, it's just that the language design cost is higher than I expected.
Oh, hey — add Object
(after #144) as a hashable type, too. Trivially hashable, even. It's counter to how people will probably use Object
instances most of the time... but they are hashable and if someone wants to, they could.
Oh, hey — add
Object
(after #144) as a hashable type, too.
This... does not seem like a good idea. Not sure what I was thinking. :smile: If we make Object
hashable, then everything is suddenly hashable.
So we probably need two slightly more different syntaxes to distinguish them. Needs more design.
This seems to be resolving itself now, since we're moving in the direction of #307 and #308. Sure, with #299 in the mix we'll still have the colon in both syntaxes, but they'll be differentiated enough I think by the different contexts they occur in.
Removing the needs-more-design label.
It just occurred to me that we might be able to steal Perl 6's :{}
syntax directly for hashes/dictionaries that take any hashable key, not just strings. Just a thought.
It's possible that this syntax need not be on my default. Maybe one can import it via a module.
As fell out of the discussion in #144.
I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.
Why do I want that? Because experience shows that hashes are central to many other things one might want to build, and allowing anything-hashable makes some code a lot nicer. Sets, bags, graphs come to mind. And I know more than one reason to want to key on tuples/enums, once we get them. So the high road it is.
What should "hashable" mean for the purpose of this issue? Let's go with "has a
hash
method". Maybe some requirements should be placed on that method, but that can probably wait. Also, once 007 has interfaces, thenHashable
should probably be one. But all in due time.A number of methods that currently sit on
Val::Object
(fsvo "sit") ought to move to and/or be rethought forVal::Hash
:get
keys
has
update
/extend
Some of these should maybe be available in some form on
Val::Object
too? For example,Q::Postfix::Property
inruntime.007
is currently implemented as a hash indexing operation:This is clearly leaning on the "objects-as-hashes" confusion we're about to erase — in other words, there's already a relevant use case in user code that we don't want to erase completely.
I don't feel ready to propose a solution yet. Just noting that sometimes it's useful to treat an object as something slightly hash-like.