masak / alma

ALgoloid with MAcros -- a language with Algol-family syntax where macros take center stage
Artistic License 2.0
137 stars 15 forks source link

Introduce a dictionary type #184

Closed masak closed 5 years ago

masak commented 8 years ago

As fell out of the discussion in #144.

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.

Why do I want that? Because experience shows that hashes are central to many other things one might want to build, and allowing anything-hashable makes some code a lot nicer. Sets, bags, graphs come to mind. And I know more than one reason to want to key on tuples/enums, once we get them. So the high road it is.

What should "hashable" mean for the purpose of this issue? Let's go with "has a hash method". Maybe some requirements should be placed on that method, but that can probably wait. Also, once 007 has interfaces, then Hashable should probably be one. But all in due time.

A number of methods that currently sit on Val::Object (fsvo "sit") ought to move to and/or be rethought for Val::Hash:

Some of these should maybe be available in some form on Val::Object too? For example, Q::Postfix::Property in runtime.007 is currently implemented as a hash indexing operation:

        Q::Postfix::Property(op) {
            return eval(op.operand)[eval(op.property)];
        },

This is clearly leaning on the "objects-as-hashes" confusion we're about to erase — in other words, there's already a relevant use case in user code that we don't want to erase completely.

I don't feel ready to propose a solution yet. Just noting that sometimes it's useful to treat an object as something slightly hash-like.

masak commented 8 years ago

Before we address this issue, we should do some thinking about how we want the syntax to look. The property shorthands like varname and method() {} are useful for object literals; less so for dictionary literals.

Worse, the object literal syntax builds on object properties having string keys (which they do, and still will), but for dictionaries we will need something that doesn't assume string keys.

This, in the end, is why Python doesn't have autoquoted keys, I suspect.

masak commented 8 years ago

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.)

This is easier than it might first seem. We don't have all that many hashable built-in types:

Hm, Val::Sub, Val::Macro, Val::Regex and Val::Exception are sort of on the fence. We could perhaps make them hashable, but it might not be in their best interest to be hashable, so let's conservatively make them not-hashable for now.

The biggest thing to watch out for is basically that 5 won't hash the same way as "5".

vendethiel commented 8 years ago

Just as a data point...

In Ruby, hash keys can be anything. However you might need to manually rehash if you change properties of the key:

a = [ "a", "b" ]
c = [ "c", "d" ]
h = { a => 100, c => 300 }
h[a]       #=> 100
a[0] = "z"
h[a]       #=> nil
h.rehash   #=> {["z", "b"]=>100, ["c", "d"]=>300}
h[a]       #=> 100
masak commented 8 years ago

In Ruby, hash keys can be anything. However you might need to manually rehash

Interesting. Perl 6 could essentially also have a .rehash method if it wanted, I guess.

Putting my language designer hat on, I think 007 has enough Python in it to want to disallow up front rather than go the route of re-hashing when things mutate. I grant that the question comes down mostly to taste.

masak commented 7 years ago

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.

Vide this tweet and this tweet.

masak commented 7 years ago

Here's a summary of this issue, and an attempt to enumerate the forces that apply to the design. If you feel cognitive dissonance reading the below criteria, it's because they're mutually opposing in many cases.

Looking at the above, the path forward looks quite straightforward after all:

Also, it would mean the proposed syntax is now:

my obj = new Q::Literal::Int {
    value: 42
};    # exactly the same as before, including auto-quoted keys if you want

my dict = {
    "value": 42
};    # the quotes are mandatory here
hash = obj.Dictionary();    # proposed conversion method syntax

I started out calling the above type Hash, but the conversion method .Hash() ends up looking very much like the hashing method .hash(), which isn't good. So Python ends up bequeathing the name "dictionary"/"dict" to 007 as well. Can't say I care much for that name, but the one upside is that it'll be easier to motivate why 007 dictionaries are so similar to Python dictionaries.

masak commented 7 years ago

The .update and .extend methods would move over to dictionaries. Well, .update could keep existing on objects too, but I don't know if there's a strong use case for it. Maybe converting-to-dictionary, then updating, then converting back is in some sense clearer and a better separation of concerns.

masak commented 7 years ago

What should looping over a dictionary do?

In Python, it loops over the keys:

$ python3 -c'
d = { "a":1, "b":2, "c": 3 }
for k in d:
    print(k)'
c
a
b

In Perl 6, it loops over the pairs:

$ perl6 -e'my %h = a => 1, b => 2, c => 3; .say for %h'
a => 1
c => 3
b => 2

In Perl 5, it loops over a flattened list of keys and values:

$ perl -Mstrict -wE'my %h = (a => 1, b => 2, c => 3); say for %h'
b
2
c
3
a
1

Finally in Java, you can't loop over a Map directly, but you instead have to specify whether to loop over keys, values, or entries (pairs).

Since the designs are all over the place, I'm left to my own judgement on this. I kinda like Perl 6's approach, because it gives you both key and value without judging. Together with some kind of destructuring, that's quite enough. But I hadn't planned on giving 007 pairs, at least not at this point. Tuples we may get, but also not quite yet. (Update: Got 'em.)

So I think I'll go with the Java approach, which has the advantage of being the most boring and conservative one. At least then it'll be easy to upgrade to something nicer later.

masak commented 7 years ago

Converting from hash to object would also be possible, provided the hash only had string keys.

I suddenly realized that the main use case for this would be the .create method; see #208. For example, runtime.007 wants to do this when implementing Q::Term::Object.eval, because its task is exactly "I have this bunch of key/value mappings (a dictionary), make me an object from that".

masak commented 7 years ago

I think dictionaries (like Val::Object and like Python dictionaries, but unlike Perl 5/6 hashes) should throw an exception if lookup fails. But I'm open to having one of them get() methods (with the second default being non-optional in 007's case, because we don't have optional parameters by default).

masak commented 7 years ago

I added the "needs-more-design" label to this, because when I picture the {} and the new Q {} syntax co-existing, both with braces but each with its own rules about key quoting and stuff, it feels... hard to use. Program authors would trip up on it. I would trip up on it.

So we probably need two slightly more different syntaxes to distinguish them. Needs more design.

In completely unrelated news, back in November and December I started a branch for this issue, and got more or less stuck — basically the kind of stuck brought on by overwhelming changes without a really clear direction.

masak commented 7 years ago

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.

I think this early requirement might have been a mistake. Even though using any hashable type for keys is nice, language-wise, the cost for it is pretty big.

Notably, the syntax

{ foo: 42 }

in a language that only supports strings as keys is "free" to assume that foo is a string key here, like so:

{ "foo": 42 }

(Python, on the other hand, can't.)

So firstly, it's nice to be able to skip quotes on dictionary keys.

Secondly, the above syntax is strangely consistent with the short-hand:

{ foo }

For

{ foo: foo }

That is

{ "foo": foo }

There is some other syntactic bit here that's also bugging me about not being able to to assume that keys are strings... related to object creation. But I can't quite put it into words now.

masak commented 7 years ago

Why do I want that? Because experience shows that hashes are central to many other things one might want to build, and allowing anything-hashable makes some code a lot nicer. Sets, bags, graphs come to mind. And I know more than one reason to want to key on tuples/enums, once we get them.

I mean, this is all still true, it's just that the language design cost is higher than I expected.

masak commented 7 years ago

Oh, hey — add Object (after #144) as a hashable type, too. Trivially hashable, even. It's counter to how people will probably use Object instances most of the time... but they are hashable and if someone wants to, they could.

masak commented 6 years ago

Oh, hey — add Object (after #144) as a hashable type, too.

This... does not seem like a good idea. Not sure what I was thinking. :smile: If we make Object hashable, then everything is suddenly hashable.

masak commented 6 years ago

So we probably need two slightly more different syntaxes to distinguish them. Needs more design.

This seems to be resolving itself now, since we're moving in the direction of #307 and #308. Sure, with #299 in the mix we'll still have the colon in both syntaxes, but they'll be differentiated enough I think by the different contexts they occur in.

Removing the needs-more-design label.

masak commented 5 years ago

It just occurred to me that we might be able to steal Perl 6's :{} syntax directly for hashes/dictionaries that take any hashable key, not just strings. Just a thought.

It's possible that this syntax need not be on my default. Maybe one can import it via a module.