hoaproject / Central

Hoa is a modular, extensible, and structured set of PHP libraries.
https://hoa-project.net/

Enhanced Hoa\Iterator API #55

Open Hywan opened 7 years ago

Hywan commented 7 years ago

Hello fellow @hoaproject/hoackers and users!

This RFC aims at enhancing the Hoa\Iterator API.

Introduction

PHP has a lot of iterators, but from my point of view their API is quite limited for daily usage. Other languages, like Rust, define a nice and powerful API (see the trait std::iter::Iterator).

Goals

Let's start with an example:

$hasEven = false;

foreach ($data as $datum) {
    if (0 === $datum % 2) {
        $hasEven = true;

        break;
    }
}

can be rewritten:

$hasEven = (new Hoa\Iterator\Map($data))->any(function ($item) { return 0 === $item % 2; });

With the PHP RFC about short function notation, we will have something like:

$hasEven = (new Hoa\Iterator\Map($data))->any($item => 0 === $item % 2);

The foreach version is harder to read and to understand, and it is much more error-prone. Moreover, it is harder to chain with other operations, like a filter or a map just before the any.
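To make the chaining point concrete, here is a minimal sketch using plain generator functions. The names filter_items and any_item are hypothetical helpers written for this example; they are not part of Hoa\Iterator or of the API proposed here.

```php
<?php
// Hypothetical helper: lazily keeps the items accepted by $predicate.
function filter_items(iterable $items, callable $predicate): Generator
{
    foreach ($items as $item) {
        if ($predicate($item)) {
            yield $item;
        }
    }
}

// Hypothetical helper: stops at the first item that satisfies $predicate.
function any_item(iterable $items, callable $predicate): bool
{
    foreach ($items as $item) {
        if ($predicate($item)) {
            return true;
        }
    }

    return false;
}

$data = [1, 3, 4, 7, 10];

// "Is there an even number among the items greater than 3?"
$hasEven = any_item(
    filter_items($data, function ($item) { return $item > 3; }),
    function ($item) { return 0 === $item % 2; }
);

var_dump($hasEven); // bool(true), because 4 passes both tests
```

Because the filter is a generator, any_item stops pulling items as soon as it sees 4; the items 7 and 10 are never tested.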

Goals are:

  1. Less error-prone,
  2. Easier to read and to write,
  3. Avoid iterator invalidations as much as possible,
  4. Allow safe iterator mutation if needed,
  5. Uniform and simple API to replace most of the (our) foreach loops,
  6. Better performance than foreach.

Vocabulary

It is important to agree on a vocabulary. An iterator iterates over a collection of items.

In the case of PHP, the items are heterogeneous, i.e. they can have different types. This is defined by the collection type.

Proposed API

Basis:

In a perfect world, next would be defined as next(): Option<mixed>, and thus valid could be dropped, but let's fit in the current PHP API.
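As an illustration of the next(): Option<mixed> idea, here is a rough sketch on top of the current PHP Iterator interface. PHP has no built-in Option type, so the Option class and the next_option function below are assumptions made for this example only.

```php
<?php
// Hypothetical minimal Option type; PHP has no built-in equivalent.
final class Option
{
    private $isSome;
    private $value;

    private function __construct(bool $isSome, $value)
    {
        $this->isSome = $isSome;
        $this->value  = $value;
    }

    public static function some($value): self
    {
        return new self(true, $value);
    }

    public static function none(): self
    {
        return new self(false, null);
    }

    public function isSome(): bool
    {
        return $this->isSome;
    }

    public function unwrap()
    {
        if (!$this->isSome) {
            throw new LogicException('Called unwrap() on None.');
        }

        return $this->value;
    }
}

// Hypothetical adapter: folds valid(), current() and next() into one call.
function next_option(Iterator $iterator): Option
{
    if (!$iterator->valid()) {
        return Option::none();
    }

    $value = $iterator->current();
    $iterator->next();

    return Option::some($value);
}

$iterator = new ArrayIterator([null, 2]);
$iterator->rewind();

$out = [];

while (($item = next_option($iterator))->isSome()) {
    $out[] = $item->unwrap();
}

var_dump($out); // holds null and 2: a null item is not mistaken for the end
```

With this shape, the caller never needs valid(): the end of the iteration and a legitimate null item are told apart by the Option itself.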

More:

Bounded vs. unbounded iterators

What to do if the iterator is unbounded?
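To make the question concrete, here is a small sketch of an unbounded iterator and of one possible way to bound it before consuming it. The naturals and take functions are hypothetical names chosen for this example; the RFC does not settle on an answer.

```php
<?php
// An unbounded producer: yields 0, 1, 2, and so on, forever.
function naturals(): Generator
{
    for ($i = 0; ; ++$i) {
        yield $i;
    }
}

// Hypothetical adapter: yields at most $limit items, then stops pulling.
function take(iterable $items, int $limit): Generator
{
    if ($limit <= 0) {
        return;
    }

    foreach ($items as $item) {
        yield $item;

        if (0 === --$limit) {
            return;
        }
    }
}

// A consumer such as iterator_count(naturals()) would never return.
// Bounding the iterator first makes consumption safe:
$firstFive = iterator_to_array(take(naturals(), 5), false);

var_dump($firstFive); // the integers 0 to 4
```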

Producer-consumer model

In other words, the whole API is lazy. It means that:

$iterator->map(…)->filter(…)->collect();

will not be equivalent to:

$mapped = $iterator->map(…);
$filtered = (new Iterator\…($mapped))->filter(…);
$out = (new Iterator\…($filtered))->collect();

But it will be much more like this:

foreach ($iterator as $item) {
    $mappedItem = $mapCallable($item);

    if (true !== $filterCallable($mappedItem)) {
        continue;
    }

    $out[] = $mappedItem;
}

So, when we describe the map API as map(Callable): Iterator, this is wrong. It would be more accurate to describe it as map(Callable): Map, where Map is a special Iterator.

A nice effect is that:

$iterator->map(…)->filter(…);

will execute nothing. Why? Because map and filter are producers, not consumers. However, count, collect, fold etc. are consumers.

Another name for this pattern is the “adapter pattern”.
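The producer/consumer split can be sketched with generators. The Lazy class below, along with its map, filter and collect methods, is a hypothetical illustration and not the classes proposed in this RFC. The log shows that producers run nothing until a consumer pulls items through the chain.

```php
<?php
// Hypothetical class for illustration only; not the API proposed in this RFC.
final class Lazy
{
    private $source;

    public function __construct(iterable $source)
    {
        $this->source = $source;
    }

    // Producer: returns a new pipeline stage, runs nothing yet.
    public function map(callable $f): self
    {
        $source = $this->source;

        return new self((static function () use ($source, $f) {
            foreach ($source as $item) {
                yield $f($item);
            }
        })());
    }

    // Producer: keeps only the items accepted by $predicate.
    public function filter(callable $predicate): self
    {
        $source = $this->source;

        return new self((static function () use ($source, $predicate) {
            foreach ($source as $item) {
                if (true === $predicate($item)) {
                    yield $item;
                }
            }
        })());
    }

    // Consumer: pulls every item through the whole chain.
    public function collect(): array
    {
        $out = [];

        foreach ($this->source as $item) {
            $out[] = $item;
        }

        return $out;
    }
}

$log = [];

$pipeline = (new Lazy([1, 2, 3, 4]))
    ->map(function ($x) use (&$log) { $log[] = "map $x"; return $x * 10; })
    ->filter(function ($x) use (&$log) { $log[] = "filter $x"; return $x > 20; });

$callsBeforeCollect = count($log); // 0: only producers have been called

$out = $pipeline->collect();       // [30, 40]
// $log is now interleaved: "map 1", "filter 10", "map 2", "filter 20", …
```

Each item travels through map and then filter before the next item is read, which matches the single foreach loop shown earlier rather than two separate passes.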

Outro

Most of the API can re-use the work we did with the existing Hoa\Iterator classes. Most of them extend the PHP SPL. However, we can re-implement everything from scratch with generators, thanks to generator delegation. That's my strategy.
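For readers unfamiliar with generator delegation, here is a minimal sketch of how yield from (available since PHP 7.0) lets one generator hand over to other iterables. The chain function is a hypothetical name for this example, not an existing Hoa\Iterator function.

```php
<?php
// Hypothetical chain(): delegates to each iterable in turn with `yield from`.
function chain(iterable ...$iterables): Generator
{
    foreach ($iterables as $iterable) {
        yield from $iterable;
    }
}

$chained = chain(
    [1, 2],
    new ArrayIterator([3]),
    (function () { yield 4; })()
);

// `yield from` keeps the keys of the inner iterables, so keys can collide;
// passing false discards them when collecting.
$items = iterator_to_array($chained, false);

var_dump($items); // the items 1, 2, 3, 4 in order
```

Arrays, SPL iterators and other generators can all be delegated to in the same way, which is what makes a generator-only implementation plausible.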

Thoughts?

Edits:

Hywan commented 7 years ago

Note: This is very similar to nikic/iter. cc @nikic, if you have feedback about your library (would you do something differently, for instance? any performance issues?).

The main difference is that we are going to be fully object-oriented instead of functional.

mathroc commented 7 years ago

this seems nice, it's always a PITA to come back to working with iterators in PHP after Rust (or JavaScript with lodash)

shouldn't those return Options ? (not really sure about the last 2)

Hywan commented 7 years ago

@mathroc I guess sum and product must return 0 if the collection is empty. But max and min must return an option, correct! Same for find & co. I am fixing it. Thanks for the detailed look!

mathroc commented 7 years ago

I'm ok for sum returning 0, however I'm not sure about product, I think it should return 1 (the identity element)

so that $a->product() * $b->product() === $a->chain($b)->product() is always true (as it is for sum)
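A quick sketch of this identity-element argument, using plain functions with hypothetical names (sum_items, product_items, chain_items) rather than the proposed methods:

```php
<?php
// Hypothetical helper: 0 is the identity element of addition.
function sum_items(iterable $items)
{
    $acc = 0;

    foreach ($items as $item) {
        $acc += $item;
    }

    return $acc;
}

// Hypothetical helper: 1 is the identity element of multiplication.
function product_items(iterable $items)
{
    $acc = 1;

    foreach ($items as $item) {
        $acc *= $item;
    }

    return $acc;
}

// Hypothetical helper: yields the items of each iterable in turn.
function chain_items(iterable ...$iterables): Generator
{
    foreach ($iterables as $iterable) {
        yield from $iterable;
    }
}

$a = [2, 3];
$b = []; // the empty collection is the interesting case

var_dump(product_items($a) * product_items($b) === product_items(chain_items($a, $b))); // bool(true): 6 * 1 === 6
var_dump(sum_items($a) + sum_items($b) === sum_items(chain_items($a, $b)));             // bool(true): 5 + 0 === 5
```

If product_items returned 0 for an empty collection, the first comparison would be 6 * 0 against 6 and would fail.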

Hywan commented 7 years ago

You're correct! I reckon I should sleep more…