Raku / doc

🦋 Raku documentation
https://docs.raku.org/
Artistic License 2.0
286 stars 293 forks source link

Agreeing on doc, and doc contribution style guide, about hash construction; {...}, %(...), hash(), ... #2117

Closed Juerd closed 6 years ago

Juerd commented 6 years ago

The problem

The style guide suggests using {} for empty hashes and %() for non-empty hashes. This is inconsistent.

Suggestions

Since {} is considered confusing (although technically it's supposed to be straight forward) with blocks, and empty %() is considered a trap (see https://github.com/rakudo/rakudo/issues/1946), why not just use hash instead? According to jnthn in https://github.com/rakudo/rakudo/issues/1946#issuecomment-399396270 these are a "sure bet".

Suggested: avoid both %() (empty construct is a trap) and {} (possible confusion with blocks) for constructing hashes, and always use hash instead:

my $foo = hash;  # empty hash
my $bar = hash a => 'b', c => 'd';  # non-empty hash

Note: these can be used with or without parentheses. Since that goes for most expressions, that could probably use its own recommendation in the style guide.

JJ commented 6 years ago

El vie., 22 jun. 2018 a las 17:26, Juerd Waalboer (notifications@github.com) escribió:

The problem

The style guide https://github.com/perl6/doc/blob/master/writing-docs/STYLEGUIDE.md suggests using {} for empty hashes and %() for non-empty hashes. This is inconsistent.

It's not inconsistent, it's simply true...

Suggestions

Since {} is considered confusing (although technically it's supposed to be straight forward) with blocks, and empty %() is considered a trap (see rakudo/rakudo#1946 https://github.com/rakudo/rakudo/issues/1946), why not just use hash instead? According to jnthn in rakudo/rakudo#1946 (comment) https://github.com/rakudo/rakudo/issues/1946#issuecomment-399396270 these are a "sure bet".

Suggested: avoid both %() (empty construct is a trap) and {} (possible confusion with blocks) for constructing hashes, and always use hash instead:

my $foo = hash; # empty hash my $bar = hash a => 'b', c => 'd'; # non-empty hash

Well, this is the style guide. The main problem I see is that if we change it now we might have to change a whole lot of stuff to make it consistent with it. As a matter of fact, this change was made just today and we'll have to go through what is there now to check that we're not initializing using %().

On the other hand, well, this is the style guide. If we don't enforce consistent style in it, such as the one you suggest, I don't know where else could be...

I guess I'll have a look and see how much needs to be changed. And anyway, thanks for the suggestion...

rafaelschipiura commented 6 years ago

I'm completely against this. You'll simply not see code like that in the wild. People should be exposed to code somewhat like they will encounter in the ecosystem .

And the {} %(...) distinction is fine and, besides a WAT at first, works just fine in practice.

Juerd commented 6 years ago

You'll simply not see code like that in the wild. People should be exposed to code somewhat like they will encounter in the ecosystem.

Well, is the documentation to be descriptive or prescriptive? The existence of a style guide, and the exclusion of { ... } for hashes because it can be confused with blocks, both strongly suggest prescriptivism. But your argument is based on descriptivism. That's a fair point of view, and maybe the example code in the documentation should indeed look like the code in the ecosystem, but in that case the documentation should not shy away from using { ... } for non-empty hashes.

Juerd commented 6 years ago

By the way, if the style guide does want to prescribe the use of %(...) for creating hashes, instead of hash, the following uses of %(...) with an Iterable of zero items seem to reliably return a new empty hash, unlike empty %():

%(())
%([])
%({})  # lol
%(Empty)
%{}

And then there are%(Nil), %(Any), and even %(Int). I don't know why Any.hash(Any:U:) exists, but it enables a lot of new and confusing ways to create an empty hash. How about %(X::Promise::CauseOnlyValidOnBroken)? ;-)

Juerd commented 6 years ago

There are some interesting performance differences, too!

Benchmark:

use Test;

my $foo;
my %subs := {
    'hash'     => sub { $foo = hash },
    '%(Nil)'   => sub { $foo = %(Nil) },
    '{}'       => sub { $foo = {} },
    '%(Empty)' => sub { $foo = %(Empty) },
    '%({})'    => sub { $foo = %({}) },
    'my %'     => sub { $foo = my % },
};

for %subs.kv -> $k, $v {
    say "Testing $k";
    $foo = "bad";
    $v();
    isa-ok($foo, Hash); 
    is($foo.elems, 0);
};

Bench.new.timethese: 1000000, %subs;

Output:

Testing {}
ok 1 - The object is-a 'Hash'
ok 2 - 
Testing %(Nil)
ok 3 - The object is-a 'Hash'
ok 4 - 
Testing hash
ok 5 - The object is-a 'Hash'
ok 6 - 
Testing my %
ok 7 - The object is-a 'Hash'
ok 8 - 
Testing %({})
ok 9 - The object is-a 'Hash'
ok 10 - 
Testing %(Empty)
ok 11 - The object is-a 'Hash'
ok 12 - 
Benchmark: 
Timing 1000000 iterations of %(Empty), %(Nil), %({}), hash, my %, {}...
  %(Empty): 2.790 wallclock secs (2.797 usr 0.000 sys 2.792 cpu) @ 358382.348/s (n=1000000)
    %(Nil): 1.777 wallclock secs (1.779 usr 0.000 sys 1.779 cpu) @ 562757.602/s (n=1000000)
     %({}): 3.194 wallclock secs (3.194 usr 0.000 sys 3.194 cpu) @ 313061.358/s (n=1000000)
      hash: 0.139 wallclock secs (0.150 usr 0.004 sys 0.154 cpu) @ 7211365.111/s (n=1000000)
      my %: 0.112 wallclock secs (0.110 usr 0.004 sys 0.114 cpu) @ 8926817.946/s (n=1000000)
        {}: 3.263 wallclock secs (3.270 usr 0.000 sys 3.270 cpu) @ 306425.967/s (n=1000000)

Going by the benchmark, &hash is the obvious winner, if you ignore the more obscure my %, which is basically how &hash is implemented. Does anyone understand why it is so much faster, though?

labster commented 6 years ago

You forgot Hash.new and Hash(Empty) and Hash(()). Hash() returns Hash(Any) which is not what we want. TIMTOWTDI!!!

Benchmark (sorted), because it's interesting:

      %( ): 0.102 wallclock secs (0.099 usr 0.000 sys 0.099 cpu) @ 9844166.839/s (n=1000000)
      my %: 0.104 wallclock secs (0.101 usr 0.000 sys 0.101 cpu) @ 9642364.694/s (n=1000000)
      hash: 0.126 wallclock secs (0.136 usr 0.000 sys 0.136 cpu) @ 7956588.851/s (n=1000000)
    %(Nil): 2.418 wallclock secs (2.408 usr 0.003 sys 2.411 cpu) @ 413502.511/s (n=1000000)
  %(Empty): 2.718 wallclock secs (2.720 usr 0.006 sys 2.726 cpu) @ 367961.179/s (n=1000000)
  Any.hash: 2.720 wallclock secs (2.668 usr 0.000 sys 2.657 cpu) @ 367586.650/s (n=1000000)
  Hash.new: 2.793 wallclock secs (2.797 usr 0.021 sys 2.818 cpu) @ 357985.530/s (n=1000000)
        {}: 4.083 wallclock secs (4.065 usr 0.009 sys 4.074 cpu) @ 244889.523/s (n=1000000)
     %({}): 4.150 wallclock secs (4.157 usr 0.007 sys 4.164 cpu) @ 240949.921/s (n=1000000)
  Hash(()): 4.641 wallclock secs (4.639 usr 0.016 sys 4.655 cpu) @ 215458.872/s (n=1000000)
Hash(Empty): 5.027 wallclock secs (4.991 usr 0.003 sys 4.994 cpu) @ 198930.114/s (n=1000000)
JJ commented 6 years ago

El sáb., 23 jun. 2018 a las 5:20, Juerd Waalboer (notifications@github.com) escribió:

You'll simply not see code like that in the wild. People should be exposed to code somewhat like they will encounter in the ecosystem.

Well, is the documentation to be descriptive https://en.wikipedia.org/wiki/Linguistic_description or prescriptive https://en.wikipedia.org/wiki/Linguistic_prescription? The existence of a style guide, and the exclusion of { ... } for hashes because it can be confused with blocks, both strongly suggest prescriptivism. But your argument is based on descriptivism. That's a fair point of view, and maybe the example code in the documentation should indeed look like the code in the ecosystem, but in that case the documentation should not shy away from using { ... } for non-empty hashes.

I would say it's prescriptive. In most cases, there's no code "in the wild" to draw from.

JJ commented 6 years ago

El sáb., 23 jun. 2018 a las 7:53, Brent Laabs (notifications@github.com) escribió:

You forgot Hash.new and Hash(Empty) and Hash(()). Hash() returns Hash(Any) which is not what we want. TIMTOWTDI!!!

Benchmark (sorted), because it's interesting:

  my %: 0.104 wallclock secs (0.101 usr 0.000 sys 0.101 cpu) @ 9642364.694/s (n=1000000)

I would say this one is a winner. Maybe use this?

labster commented 6 years ago

I just did at talk at TPC about PHP, so I've been talking about that language a bit lately. One of the things I discussed with people afterwards is that PHP is a better language now, because (among other things) you can use square brackets [] for arrays now, instead of the old array(). So let's recommend everyone use hash() in Perl 6? Um, no.

Everything suggested here is a move backwards in readability and style, including my own previous suggestions. %() is a misfeature, but %( ) works, and it is in the same speed area as the entirely unreadable my %. %( ): 0.102 wallclock secs (0.099 usr 0.000 sys 0.099 cpu) @ 9844166.839/s (n=1000000) The anonymous variables are not really beginner material IMO, and the anon hash is easily confused with modulo.

So I propose {} and %( ) are the best recommendations. Use the space, Luke.

raiph commented 6 years ago

While hash is a sure bet, I think we need to realize that it was always just poor doc that turned this into a problem. I urge folk to consider a return to respectful acceptance of the simple and natural nature of Larry's original DWIM and a recognition that it has successfully taken root.

Please note my latest StackOverflow attempt at formulating the precise {} hash vs block DWIM rule.

I'm going on a trip for a week, but if everyone is happy with it I will be happy to integrate it into the doc a couple weeks from now.

Imo the right way forward for the docs is to show respect for Larry's DWIM. This is consistent with jnthn's perspective and the majority of existing community usage and, I think, preferences, though I acknowledge some strong opinions against it. (I return to this aspect at the end of this comment.)

Perhaps the WAT side of this DWIM belongs on the traps page but I'm unsure. Please read the latest version of my Reddit discussion of this DWIM/WAT that complements my SO answer.

Imo the right way forward for the docs contributor style guide is to recommend use of the DWIM in all cases except where documenting the %(...) alternative and the hash routines. The current guide text contains multiple factual errors and points in the wrong direction imo:

Prefer the %() form of declaring hashes

Imo this should be "Prefer the {} form of declaring hashes".

my %hash := { this => "is", a => "hash" }; # Correct, but BAD my %hash := %( this => "is", a => "hash" ); # GOOD

Imo this is inappropriately "disrespectful" to Larry's DWIM. (I get that the "disrespect" isn't emotive but just trying to be clear.)

Using the second form is more idiomatic

Idiom is natural to a native speaker. I'd argue that's much more {} than %().

and avoids confusion with blocks.

As we know, it unfortunately introduces confusion with %() which won't go away for a few years.

See also my discussion of strange consistency with hash subscripts and ugliness in pair literals in my reddit comment linked above.

This does not apply to empty hashes, which should be declared using {}:

Switching guidance to be use of the DWIM will eliminate this exception.

I understand that samcv and alex and others got burned over this dwim. I think that was down to poor doc. Perhaps my poor original SO answer didn't help. (Hopefully it's not so bad now.) I understand that what I'm saying almost certainly won't resonate for them. Getting burned a few times before the right mental model clicks leaves a lasting impression, especially if the right mental model still hasn't clicked.

But I'm hopeful that Larry's view (which is that the DWIM made sense), and the majority newbie view (which is that the DWIM is fine if no one tells them about it but rather just exposes them to it), and the view of some old hands (jnthn finds the DWIM simple and natural, as do many others including me), and the evident widespread use of this DWIM, and the prospect that improved doc and community understanding will be in place, are persuasive that disrespecting Larry's DWIM isn't the way to go but rather embracing it, at least for now, until we perhaps revisit this in another few years once %() has completed a deprecation cycle.

raiph commented 6 years ago

I posted my prior comment before reading the latest comments about speed. Those look dramatic and perhaps trump everything else. Perhaps {...} can be sped up? How is speed for :key{:a,:b} vs :key(%(:a,:b))?

I still think it makes sense to generally use {} as a hash constructor in docs, when an explicit constructor is needed, because it reads very well. We can document %(...) as being way faster and avoiding the {} hash/block DWIM/WAT though having the %() WAT.

JJ commented 6 years ago

But really, what is WAT? I kind of agree with using {} so I guess that's an actionable thing that can be done right now. Do you still want this issue to be assigned to you to complete it?

AlexDaniel commented 6 years ago

But really, what is WAT?

http://knowyourmeme.com/memes/wat

AlexDaniel commented 6 years ago

Getting burned a few times before the right mental model clicks leaves a lasting impression, especially if the right mental model still hasn't clicked.

I'm not so sure. Looking at your SO answer, it's a huge explanation on when { } creates a block and when it's a hash… I wish we simply didn't need to go into all these details to explain such a basic thing (it should be obvious, not just understandable). With %() that's much easier when it comes to creating hashes, and in the long run the issue with empty %() will be resolved.

That said, most of the cases when I was burned were other way around – a block that I for some reason wanted empty, but got a hash instead ({}). So using %() (which I already do) won't help me.

rafaelschipiura commented 6 years ago

My point is prescriptive, there's no real problem with using {...} for both Blocks and Hashes.

It's a problem that only exists on the minds of people that had their perception warped by computer science. I mean no offense by this, everyone changes their perception because of their background.

This discussion came up some time back and it was decided the current situation was fine, therefore I'm closing the bug with WONTFIX again.

AlexDaniel commented 6 years ago

This discussion came up some time back and it was decided the current situation was fine, therefore I'm closing the bug with WONTFIX again.

Link?

rafaelschipiura commented 6 years ago

@AlexDaniel Sorry, I can't find a link. The only thing I can say is that any seasoned Perl 6 developer is already burned out from this discussion.

So, let me summarize what I think we already have a rough consensus about (if not, just tell):

Hash.new() in the docs will never gonna happen. It's not idiomatic. The {...} Block/Hash constructor DWIM/WAT is fine, no need to change. %(), @(), $() is an anti-feature and will be removed in the future, so the docs should completely avoid using it at all, except to warn people about it.

Now, should the docs use {...} or %(...)? I think the former to expose people to it, which takes away a bit of surprise it might have. There's also no need to go into how it works extensively, I think. Just show it being used and tell people how to get the other result in case the compiler decides to do a WAT. We also don't go into how self-clocking works when explaining terms and operators, we don't go into how Grammars tie into the OO system when explaining it, don't go into Grammars when explaining regexes, and we also don't wonder into the MOP when explaining how to write objects, among many other examples.

rafaelschipiura commented 6 years ago

And if the chosen form is slower, that's a problem for rakudo developers to solve, we should torture the implementation in behalf of the user.

There's no motive for the different forms to take different amounts of time, the compiler is smart enough to know they mean the same thing and to choose the faster implementation.

JJ commented 6 years ago

So, I would say we have several options. I call a vote:

  1. Prescript the {} form for all hashes: use :+1:
  2. Eliminate all kind of prescriptions in the style guide: use :-1:
  3. Leave it just like it is: use :tada:
  4. Use hash as proposed initially: use :heart:

I'll leave this open for 48 hours, tally votes, and do whatever simple majority decides. Any other thing you want to propose, comment here.

Altai-man commented 6 years ago

On regard of my vote, I have spend some time following the conversation(s), so will try to summarize things a little(maybe more for myself than for others though): 1)Magic behavior of %() is set to be removed, yet I have no strong feelings toward using it everywhere: in my humble preferences, {} is just visually neater. 2)I, indeed, had situations before where I was somewhat confused about Block vs Hash situation, but that is not the most horrible thing I've encountered and it took not so much time to fix things. 3)In my opinion, part of us is concerned about user experience in language grammar/semantic, to do it intuitive, yet while it is a great goal to strive for, I'd like to note that such small things exist in every language and it is not the things people usually use to decide on whether to use programming language foo or not. How much is it popular, how many jobs are there, how great ecosystem is, how fast is it, how stable is it, et cetera - you get the idea. Intuitive is a plus for sure, but well, it is IT, people are used to deal with stuff.

Considering the above, my vote is for: 1)Remove prescription for hashes only(I am not sure if it was the indention behind "Eliminate all kind of prescriptions" - this is a bit too extreme imho). As there are quite a lot options to create a hash, I think using most common/plain {} and %() depending on situation/author's style is the best. 2)I don't think we need so much consistency in docs to the extend of "use only One True Form", it is against TimToady. 3)Also, I don't think that performance of some construct should be considered as a rule here. Firstly, it is core team thing to care about various syntax forms for basically same semantics to be equal performance-wise. Secondly, we are describing language, not current implementation. If it happens that tomorrow user Foo will bring out a patch that makes my %foo = WeIrD-way (two spaces) before mah hash: one => 1, two => 2 fastest way to create a hash, lets say 100x faster, will we change all docs to use that form? I doubt that. However, we surely have page named "Performance" - if some folks want more speed, it is up to them to go to that page and read various ways to uglify code to make it faster.

zoffixznet commented 6 years ago

Since {} is considered confusing

By whom exactly? It's a very simple rule and if there's confusion around it, perhaps the rule should be reworded to be clearer.

There are some interesting performance differences, too!

FWIW, there's definitely a bug in there and that performance landscape will change once R#1951 is resolved.

%(), @(), $() is an anti-feature and will be removed in the future

I'm not aware of it being removed. I think you're confusing the coercers with all the magic shortcuts for $/ access; only those will get removed.


The style guide suggests using {} for empty hashes and %() for non-empty hashes.

Why does it need to suggest anything? The docs are written by many programmers, each of whom use different styles, and the language's motto is There's More Than One Way To Do It. It's not overly awful for readers of the docs to be exposed to many features, rather than for the docs to mandate a particular style chosen as the "recommended" one by the 7 people in this discussion.

rafaelschipiura commented 6 years ago

@zoffixznet I meant the shortcuts for $/ will be removed, I wasn't clear.

And in 6.e I think %() should create an empty Hash.

abraxxa commented 6 years ago

Why should it make sense to have so many different forms of defining an empty hash? Perl 5 left the golf era and people realized that readable code is more important than using some super cool, undocumented (mis)feature of the language. Telling newbies that there are over ten ways to do such a basic thing in Perl 6 won't make it popular or loved.

rafaelschipiura commented 6 years ago

@abraxxa No need to tell them.

Juerd commented 6 years ago

@abraxxa That discussion is off topic here. We're discussing the documentation, not Perl 6 itself.

abraxxa commented 6 years ago

Not giving a recommendation is not a good solution.

JJ commented 6 years ago

Well, it's a recommendation for documentation writers. We can still recommend, or not, anything on the place where we talk about hash definition itself. Feel free to check it out and create an issue if it does not look right.