Raku / problem-solving

🦋 Problem Solving, a repo for handling problems that require review, deliberation and possibly debate
Artistic License 2.0

Feature Request - method `.hammer` #431

Closed librasteve closed 4 months ago

librasteve commented 5 months ago

I would like to test initial response to a new possible method, namely

.hammer and its relative .hammermap(&block)

This method recursively hammers flat nested lists (and forcibly flattens any itemized lists and Array elements) and is guaranteed to produce a flat list.

Does anyone think this is a good idea?

I think this is needed because, from time to time on the raku-beginner channel, someone asks "why doesn't my list go flat when I use @l.flat?". While Raku is intended to protect nested structure by default, so that the structure survives being passed along a chain of functions, this is not always the desired behaviour. Indeed, the docs mention that this default can irk users of data.

In contrast to @l.flat, @l.hammer will give the desired "simplistic" behaviour of "just give it to me flat". This slightly lowers the difficulty of departing from the default, while a slightly comedic name hopefully retains the hint that this is less desirable than keeping the structure.

The documents currently say:

This can irk users of data you provide if you have deeply nested Arrays where they want flat data. Currently they have to deeply map the structure by hand to undo the nesting:

say gather [0, [(1, 2), [3, 4]], $(5, 6)].deepmap: *.take; # OUTPUT: «(0 1 2 3 4 5 6)␤»
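
To make the contrast concrete, here is a minimal sketch that wraps the deepmap idiom quoted above in a sub. The name hammer-by-hand and the variable @l are made up for illustration; this is not the proposed method itself, only the by-hand workaround the docs describe.

```raku
# Hypothetical helper name; the body is the deepmap idiom quoted from the docs.
sub hammer-by-hand(\data) {
    gather data.deepmap: *.take
}

my @l = 0, [(1, 2), [3, 4]], $(5, 6);

# .flat treats the nested Array and the itemized list as single elements,
# so fewer than seven elements come back here.
say @l.flat.elems;

# The deepmap idiom visits every leaf, whatever the nesting or itemization.
say hammer-by-hand(@l);   # (0 1 2 3 4 5 6)
```

The point of the request is that a beginner should not need to know this idiom in order to get the second result.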

lizmat commented 5 months ago

I think .hammer is a good idea.

I'm not sure .hammermap is a good idea: is that the equivalent of .hammer.map(&block), or of .map(&block).hammer (as .flatmap(&block) currently has the semantics of .map(&block).flat)?

lizmat commented 5 months ago

FWIW, I've cobbled an implementation of .hammer together:

    method hammer() {
        my class HammerIterator does Iterator {
            has $!iterator;
            has $!next;

            method new($iterator) {
                my $self := nqp::create(self);
                nqp::bindattr($self,HammerIterator,'$!next',nqp::list);
                nqp::p6bindattrinvres(
                  $self,HammerIterator,'$!iterator',$iterator
                )
            }

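            method pull-one() {
                # While the current iterator is exhausted, resume the most
                # recently suspended outer iterator, or finish if none is left.
                nqp::while(
                  nqp::eqaddr((my $pulled := $!iterator.pull-one),IterationEnd),
                  nqp::if(
                    nqp::elems($!next),
                    ($!iterator := nqp::pop($!next)),
                    (return IterationEnd)
                  )
                );

                # Descend into anything Iterable, remembering the current
                # iterator so it can be resumed later; otherwise return the value.
                nqp::if(
                  nqp::istype($pulled,Iterable),
                  nqp::stmts(
                    nqp::push($!next,$!iterator),
                    ($!iterator := $pulled.iterator),
                    self.pull-one
                  ),
                  $pulled
                )
            }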
        }
        Seq.new: HammerIterator.new(self.iterator)
    }

lizmat commented 5 months ago

Alternately, maybe .flat should get a :levels argument, which would allow you to specify the number of levels deep you'd want to flatten. With :levels(*) and :levels(Inf) to get .hammer semantics.

lizmat commented 5 months ago

UPDATE: actually, that would be a bad idea, as .flat doesn't flatten anything that has been itemized.

So maybe we should give .flat two more things:

  1. a positional indicating the number of levels (defaulting to *)
  2. a named argument :hammer that would disregard containers on flattening
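
As a rough sketch of how the two additions listed above might look at the call site: both call shapes below are assumptions taken from this proposal, not a confirmed API, and a Rakudo release without the corresponding change will reject the arguments. The variable @l is made up for illustration.

```raku
my @l = 0, [(1, 2), [3, 4]], $(5, 6);

# Assumed call shape from the proposal: a named :hammer argument that
# disregards containers while flattening.
say @l.flat(:hammer);

# Assumed call shape from the proposal: a positional number of levels.
say @l.flat(1);
```

Keeping both options on .flat means a user who already reaches for .flat finds the stronger behaviour in the same place.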

ab5tract commented 5 months ago

I think @lizmat has the right approach here.

I agree with @librasteve that the functionality is clearly useful but I believe the right place for a user to be looking for it is in 'flat'.

I don't like the idea of adding a new 'map' though. It seems to me that it only saves a single period and is also ambiguous, as has been pointed out.

lizmat commented 5 months ago

Now as a PR: https://github.com/rakudo/rakudo/pull/5594

librasteve commented 5 months ago

that's perfect - brilliant!

lizmat commented 5 months ago

Re-opening as the discussion is not over yet!

lizmat commented 5 months ago

Discussion on #raku-dev: https://irclogs.raku.org/raku-dev/2024-06-11.html#13:07

librasteve commented 5 months ago

OK - I see the concerns, which I think deserve an evidence-based response:

from Discord channels...

[16:19] Raku bridge: [4] > (@a.List Z @b.List).map: *.flat
[16:19] Raku bridge: ((0 1 8) (2 3 9)) ## ok, but whaa?

[00:19] Nemokosch: oh damn, not this again...
[00:19] Nemokosch: why even have flat if it "respects" containers, and why do containers "itemize"...
[00:20] shadow: >>.List.flat seems to work

[16:44] Nemokosch: unfortunately, if you look into the output of (1...3, (5...7)), the result of (5...7) is so much treated as an element that it will disobey even .flat

Admittedly this is only a couple of folks (plus me) - but it is not nothing ;-)

ab5tract commented 5 months ago

I recall that this has come up on the channel quite often, actually. We can't know how many people have bounced off of Raku solely as a result of this, but it seems realistic to assert that the number is non-zero.

Regarding the objections raised on IRC: The idea that a user would easily or intuitively reach for a combination of 'tree', 'xx', and 'flat' feels relatively outlandish to me.

Why shouldn't flat have optional arguments to accomplish a variety of flattenings?

jubilatious1 commented 4 months ago

Sorry, I see this as an extension of indexing/flattening problems noted here:

https://github.com/rakudo/rakudo/issues/1966#issuecomment-2140809337

https://github.com/Raku/problem-solving/issues/407

Not sure a newbie is going to reach for a .hammer if that's not a common programming term, i.e. used in multiple different programming languages?

Additionally/alternatively, is there a symbolic notation that accomplishes the same thing, for example, "double-squarebracket" notation?

[0] > my @a = 1,3,5
[1 3 5]
[1] > my @b = 2,4,6
[2 4 6]
[2] > (@a.List Z @b.List).map: *.flat
((1 2) (3 4) (5 6))

[3] > my @c = 10...20
[4] > say @c[ (@a.List Z @b.List).map: *.flat ] #really cool result
((11 12) (13 14) (15 16))
[5] > say  @c[ (@a.List Z @b.List) >>->> 1 ].Array #really cool result
[(10 11) (12 13) (14 15)]

[5] > # right now this works like so:
[5]  > say  @c[[ (@a.List Z @b.List) >>->> 1 ]].Array
[12 12 12]

[6] > # but what if we could get it to work (i.e. flatten) like this instead?
[6] > say @c[[ (@a.List Z @b.List) >>->> 1 ]].Array #wished for result
[10 11 12 13 14 15]

raiph commented 4 months ago

I would much rather see all the elements of the relevant original design completed unless there's consensus that some elements are the wrong way to go.

In this case that means implementing [**] as the hammer being described in this issue.

I see multiple merits to that design, including it being nicely consistent with the existing N level hammer syntax (eg [*;*;*] for N=3).

raiph commented 4 months ago

@jubilatious1

is there a symbolic notation that ["hammers"]?

Yes. A consensus was established around 2 decades ago for a [**] subscript to fully flatten (hammer) any positional left hand side. For example:

say @c[ (@a Z @b) >>->> 1 ][**]; # [10 11 12 13 14 15]            # "should" work right now

Other ways that do work right now in current Rakudo include:

say [@c[ flat (@a Z @b) >>->> 1 ]]; #[10 11 12 13 14 15]                # right now 
say [@c[(@a Z @b) >>->> 1 ][*;*]]; #[10 11 12 13 14 15]                # right now 

What [**] brings to the table is "infinite" flattening (aka "hammering").

lizmat commented 4 months ago

FWIW, I hadn't realized this. This should be trivial to implement now as a shortcut to .flat(:hammer), in the 2024.08 release.

jubilatious1 commented 4 months ago

Sounds pretty amazing!

I'm still on ancient Rakudo™ v2023.05, and I see:

[4] > say @c[ (@a Z @b) >>->> 1 ][**];
HyperWhatever in array index not yet implemented. Sorry.
  in block <unit> at <unknown file> line 1

The other ones work just as @raiph has stated:

[4] > say [@c[ flat (@a Z @b) >>->> 1 ]];
[10 11 12 13 14 15]
[4] > say [@c[(@a Z @b) >>->> 1 ][*;*]];
[10 11 12 13 14 15]
[4] >

lizmat commented 4 months ago

@raiph could you provide a link to the consensus re [**] ?

See also https://github.com/rakudo/rakudo/pull/5609

raiph commented 3 months ago

@lizmat

First, thanks for acting on what I raised / creating the PR.

Something I hope is at least slightly helpful for tonight is current (old) design doc verbiage:

The HyperWhatever Type

A variant of * is the ** term, which is of type HyperWhatever. It is generally understood to be a multidimension form of * when that makes sense. [...] Therefore @array[^**] represents @array[{ map { ^* }, @_ }], that is to say, every element of the array, no matter how many dimensions. (However, @array[**] means the same thing because ... the subscript operator will interpret bare ** as meaning all the subscripts [...]

In general a Whatever should be interpreted as maximizing the degrees of freedom in a dwimmy way, not as a nihilistic "don't care anymore--just shoot me".

(See also discussion of * indexing in S09, and think about the above in that context.)

The above verbiage, which arrived sometime during the 2000s, documents where the design ended up rather than any discussion preceding that point. Presuming the earlier discussion is (more) important, I'll have to dig another time, because I'm recalling it from memory and my initial quick and rough searches of the most likely mailing list, perl.perl6.language (eg "[**]" site:https://www.nntp.perl.org/group/perl.perl6.language.data), look like a bust (at least in the time I've allocated for tonight; I'm traveling and have another trip tomorrow and will likely be busy for at least the next few days). That said, as far as I know the nntp archives are complete, so I'm confident that, given time, I will be able to find some of the discussions I recall.

(There may well have been discussions on IRC too that were about use of flattening. But I suspect those would have come after what I think was a clear consensus on the need for / desirability of a sledgehammer, on a subscript sledgehammer being a thing to have, and on ** being the sledgehammer symbol.)

raiph commented 3 months ago

(BTW, I don't recall anyone calling it hammer or sledgehammer. I'm just sticking with steve's nomenclature. In the past I've called it a steamroller or bulldozer. I recall thinking of suggesting the mnemonic that [**] looks a bit like a steamroller or bulldozer, though I also recall thinking it was a weak idea so perhaps didn't actually suggest it.)