hoaproject / Central

Hoa is a modular, extensible, and structured set of PHP libraries.
https://hoa-project.net/
Other
104 stars 8 forks source link

New library: Hoa\Option #56

Closed Hywan closed 7 years ago

Hywan commented 7 years ago

Hello @hoaproject/hoackers,

I am proposing a new RFC to introduce a new library, called Hoa\Option.

Introduction

The Option sum type is very powerful. In some languages, like Rust or Haskell, it replaces the null value, and thus it increases the safety of the language itself.

The definition of Option is the following:

enum Option<T> {
    Some(T),
    None
}

You must read it like: Either we have some value T, or we have nothing, aka null.

This new library will be very helpful if designed correctly. The PHP type system is yet poor and fragile, but we can still provide more safety with such a type. For instance, in the Hoa\Socket library, we have to deal with a lot of null value, in expected or unexpected locations. Having an Option in PHP 7 (so with declared types) would provide a better safety. How? First by having a real type forcing us to deal with unexpected situation, and second by automatically panicing/throwing a RuntimeException. It can be caught elegantly if it happens, unlike a null value that will just be silent until a bad usage of it.

The pros of an Option in PHP is the API it provides to deal with null value. Not only to replace null and is_null(), but to really chain operations on it without taking care if the value is null or not.

Design of the API

In PHP, we do not have the same expressiveness that we can have in Rust, Haskell, or Caml for instance. Nevertheless, we have powerful mechanisms like autoloaders, import etc., to have our own approach.

Having the value wrapped in a type (Option) and inaccessible from the outside will force us to write the code differently. We will no longer pass values directly, but we will extract them from the option before passing them to another function. So for instance, let's pretend f is a function that can return a nullable value, then this actual code:

$x = f();
$y = g($x);

could be rewritten as:

$x = f();

if ($x->isSome()) {
    $y = g($x->unwrap());
}

What to do if $x->isSome() is false? We should initialize $y to null? It sounds like we are postponing the problem. The Option::expect API is helpful in this particular case: Unwrap the value is some, else “panic” i.e. throw a RuntimeException for instance, thus:

$x = f();
$y = g($x->expect('Damn, x is null!'));

Write code differently

This library will definitively force us to write our code differently. We will use less variables, more edge cases/error cases will be handled correctly and very well identified.

Proposed API

I really like the API from Rust: std::option::Option:

Cloning

Whether Option will be clonable or not must be strictly defined.

Serializing

Whether Option will be serializable or not must strictly defined.

Iterating

It might be possible to iterate over a value. It will introduce a strong dependency to Hoa\Iterator. It would be possible to compute a simple \Iterator object, but we will loose some benefits when writing code (need to instanciate new Hoa\Iterator objects manually for instance, less clear code). Maybe we should do nothing for a first step.

Magic methods

__isset(): bool could be an alias to isSome(): bool.

Mutability

Option will be immutable. No possibility to get a mutable reference with the current API. Could be introduce later with smart mechanisms.

Instanciating

The most powerful API I have in mind yet is the following:

$x = Option::some($value);
$y = Option::none();

We can automatically import some and none as function alias to respectively Option::some and Option::none in the global scope, so:

$x = some($value);
$y = none();

Just writing none instead of none() implies that none is a constant, so all instances will be identical. Since an option is immutable, it can make sense. It will save some memory allocations, which will be good.

Nullable type

PHP has nullable type, like ?Foo, which means either the null value, or a value of kind Foo. This is somewhat similar to Option<Foo>, but with no API, except is_null(mixed $value): bool to see if $value is null or not, which is equivalent to Option::isSome(): bool. However, Option has a much richer API, with unwrap, and, or, except, map etc. This is “null with superpowers”. Unfortunately, ?Foo is a type, while Option<Foo> cannot be expressed in PHP yet, so we could not have such a type.

Concrete implementation

Not completely designed yet. First idea: Having an Option class, with 2 interfaces: None and Some. That would allow to write such code:

if ($x instanceof Some) {
    …
}

will be strictly equivalent to:

if ($x->isSome()) {
    …
}

Note we could not change the implemented interface on-the-fly, except with Reflection, but it comes with a cost. This might not be a good idea.

Second idea, just a single Option class, with the mentionned API. It must have as few attributes as possible to reduce the size of Option. This library will add a runtime cost, it must be as much insignificant as possible.

Conclusion

What do you think? I assume you will have a lot of questions about this RFC. I cannot address all of them ahead of time. I could add a FAQ, but I prefer to get fresh and unoriented feedbacks first. A FAQ can be added in the future.

Thanks!

Pierozi commented 7 years ago

I like the idea, it's really close to immutable ValueObject concept. In this case you have two Vo, Some and None.

I will think a bit about the implementation and give you a feedback.

Hywan commented 7 years ago

@Pierozi Thanks for the feedback. What about the concept? Do you think it will improve safety? What about public API: Do you think users will know how to deal with Hoa\Option instead of a simple nullable value? Not to mention that public API must not return a nullable value as much as possible…

jubianchi commented 7 years ago

I really like everything you propose except for one thing: I don't think integrating the iterator in Option is good.

We want to deal with the presence (or absence) of valie not how we want to iterate over it. To me, constructing thz iterator, if necessary, should be delegated to the option consumer.

Moreover how would you choose which kind of iterator to produce?

Hywan commented 7 years ago

@jubianchi Exactly. We agree on this point. This is a feature we could introduce in the future with a new RFC I guess.

Hywan commented 7 years ago

I got a very concrete example today. A collection owns several products, but, new thing, a product can have no collection. So $product->collection() can return null. So calling $product->collection()->title() is now unsafe. Two solutions; first without an Option:

$title = $product->hasCollection() ? $product->collection()->title() : null;

$title can be a string or null.

Or, with an Option returned by collection():

$title = $product->collection()->map(function (Collection $collection) { return $collection->title(); });

$title is an Option.

If https://wiki.php.net/rfc/arrow_functions or another RFC of the same kind gets accepted, we could write it like this:

$title = $product->collection()->map(|Collection $collection| { return $collection->title(); });

If we omit the type-hint for Collection, we get:

$title = $product->collection()->map(|$collection| { return $collection->title });

Much better from my point of view! The null/empty case is correctly handled all the way long! No need for an extra hasCollection method; it can be replaced by $product->collection()->isSome(), the API is much simpler and safer.

jubianchi commented 7 years ago

@Hywan great example! 👍

With short arrow function syntax, the code would be very clear :

$title = $product->collection()->map($collection => $collection->title);

Too bad we don't have them in PHP...

But because PHP will always allow us (developers) to do weird/wrong things, I would keep the type-hint in the closure to make everything even safer:

$title = $product->collection()->map(function (Collection $collection) { return $collection->title(); });

IMHO this is the way to go. Properties are not strictly typed so anyone extending the product class can break the collection() method, making it return something inappropriate (OK, with PHP7 we should enable strict_types everywhere, but it can be disabled 😢 ). The type-hint will guarantee we actually get a Collection ensuring the call to Collection::title() is right.

vonglasow commented 7 years ago

LGTM 👍

Pierozi commented 7 years ago

My only concern of this design is to lead us to have Option object everywhere who embed the real value. and this Option object hide the real type of value behind where you are not able anymore to typehint.

I would prefer to go only with an Interface free to implement instead of generic Option object.

Hywan commented 7 years ago

@Pierozi The goal is to avoid having null value I think. But yes, it will hide the type of the output, until PHP got generic (so Option<MyClass> will work). What is your proposal with an interface?

mathroc commented 7 years ago

I'm with @Pierozi on this one, I would love to have Option in php but not having generics is a major drawback:

the only way (as awful as it might be) to mitigate this I can think of would be to autogenerate classes based on the name, eg \Hoa\Option\for\Psr\Http\Message\ResponseInterface that would be the equivalent of \Hoa\Option<Psr\Http\Message\ResponseInterface>

It might help catch error earlier but I don't think this can help IDEs or static analysers (might be worst in fact 😕)

With the existing generics RFC, if \Hoa\Option happens before generics. when/if php get generics, would there be an easy path forward ? I guess it would work if we can declare a class like this:

namespace Hoa;
class Option<T = mixed> {
  public function unwrapOr(T $fallback): T {/* ... */}
}
Pierozi commented 7 years ago

@Hywan as quick example, i've something like this in my mind. https://gist.github.com/Pierozi/78ccb81525ea5a29729a8518d63903f0

Hywan commented 7 years ago

@nikic Hello :-). Do you think we have a chance to have generics inside PHP soon? Thanks!

Hywan commented 7 years ago

@Pierozi The problem with an interface is that we have to implement all methods, or provides a trait… hmm, with a trait we will reduce the amount of bugs, that could be a nice alternative, but we still can't write:

function f(): MyObject + Option {
}

But OK, if MyObject implements an Option, then the IDE will be able to infer the methods. But this is not very clear :-/.

Thought?

Pierozi commented 7 years ago

Trait only work when you know the definition of object, in this case the property value, but that's depend of the object. And yes i recognize Interface are not very helpful just help to provide standardization of the Some/None in this case.

php limitation :/

Hywan commented 7 years ago

@Pierozi You can do this:

class MyObject implements Option
{
    use OptionTrait;
}

And you have an option 😄.

Pierozi commented 7 years ago

Not really because your trait define arbitrary a property to use and the assertions to valid a data. in my example the local property are named value and the assertion is true if property are not null and not empty string.

In my point of view, trait should only be a pure function, not deal with object property. because an interface cannot expose property, only method

jubianchi commented 7 years ago

IMHO, Option is not something functions/methods should consume through their arguments, especially in PHP where we can't type-hint on generics (yet).

Option will avoid unexpected null usage (aka NPE in java for example). If a function/method expects a nullable argument it has to deal with it explicitly. If it does not expect a null, typehints are here to make things safe.

So, an example to show my POV:

function foo(int a) : Option/*<int>*/
{
    if ($a > 0) return Option::some($a);

    return Option::None;
}

function bar(int $a = null) : int
{
    if (null === $a) return 0;

    return $a * 2;
}

$a = rand(0, PHP_INT_MAX); // $a is an int
$b = foo($a); // $b is an Option<int>
$c = bar($a->unwrapOr(null));

Using this is, again IMHO, the only way of ensuring we get the right arguments at the right place. Without generics, if bar accepted an option, it should do something like:

function bar(Option $a) {
    $aa = $a->unwrapOr(0);

    if (is_int($a) === false) {
        throw new \InvalidArgumentException();
    }

    return $aa;
}

Here we are doing half the work of PHP, i.e checking arguments type. Not mentioning that IDE will never help us in identifying argument requirements because they will all expect an Option.

Another question : how to deal with optional arguments? Using an None option ? no way for me. This will lead to such calls everywhere:

$c = bar(Option::None);

To sum-up my POV: Options are a really good thing and it's a good idea to add them in Hoa but we should not try to force their usage everywhere and convert every null to Options. null is not a good thing, I agree, but it's here and we have to deal with it because PHP is not prepared to work as if null did not exist.

@Hywan you said

until PHP got generic (so Option will work).

If this ever happen, we will have to drop support for several PHP version to be able to use it fully.

My 2 💵 😉

Hywan commented 7 years ago

I agree with @jubianchi. We should not use Option everytime. Also, it will force us to write code differently. For example, an optional argument can accept null, this is correct, but a mandatory argument must not receive null. Most of the time, accepting null is a bad practise. So our code will must be more robust I think.

<?php

function f(int $x): int
{
    return $x << 2;
}

f(2); // 4
f(null); // fails because null is not an integer
f(); // fails because an argument is missing

This illustrates my claim. If an argument does not accept null, it will not accept Option, this is a type error, basta (except with nullable types).

In Rust for instance, this is very rare to accept Option in argument, except for optional argument (with Into etc.). Having an Option as argument is… strange.

The power of Option is when you want to make operations on a value, like:

$product->collection()->map($c => $c->title())->map($t => $t->translate('fr_FR'));

instead of:

$translation = null;
$collection = $product->collection();

if (null !== $collection) {
    $title = $collection->title();

    if (null !== $title) {
        $translation = $title->translate('fr_FR');
    }
}
Hywan commented 7 years ago

Now, the question remains: How to be compatible with IDE? I don't use an IDE, so I can't answer this answer :-(.

jubianchi commented 7 years ago

@Hywan unfortunately, IDEs will probably have problem inferring option values. But is it really a big problem ?

Today, with PHP and IDE, we can't be sure and get hints on collections for example: a function expects a collection (let's say a SplObjectStorage) we can't be sure of the items' type it expects and the one it gets.

The problem exists and we can't deal with it today... no generics support in the language and no generics support in PHPDoc annotation.

I just tested with PHPStorm (2016.2):

<?php

class Option {}

/**
 * @return Option<int>
 */
function foo(Option $a) : Option
{
    return $a;
}

$b = foo(new Option);

PHPStorm tells me $b is an Option, that's all. I does not read the generic part of the @return annotation.

IMHO, IDE are a "fake" problem: we can do everything we want to make the code self-documented and provide IDE with what they need, developers will always have to read the library source code at some point to know what they should provide or what they should expect.

mathroc commented 7 years ago

@Hywan unfortunately, IDEs will probably have problem inferring option values. But is it really a big problem ?

I wouldn't mind about IDEs, it might be a bit more annoying for static analyser (eg: scrutinizer, phan). but maybe they can be taught to read generic types in docblock

Hywan commented 7 years ago

So I guess we can move this RFC as ready?

Hywan commented 7 years ago

Hello fellow @hoaproject/hoackers!

I have created the new library Hoa\Option, https://github.com/hoaproject/Option. Please, review it before releasing the 1.0 version.

If no feedback before the 21st August 2017, I will release the 1.0 version.

Take care of the documentation, the PHP 7.1 requirement, the build matrix on Travis etc. please.

mathroc commented 7 years ago

nice ☺

just a one things after reading the README

is there a difference between $a->mapOr($map, $or) and $a->map($map)->or($or) ? same for all *Or*() methods, and if there’s no differences, are they really useful ?

mathroc commented 7 years ago

note: I half expected something using along the line of Finite-State Machine as a Type System illustrated with a store product with an Option interface and 2 classes Some & None but then I don’t how to prevent someone else to implement the Option interface

Hywan commented 7 years ago

You're correct, mapOr(…) is equivalent to map(…)->or(…) except that we have 2 calls and 1 more allocation with map(…)->or(…). So mapOr(…) is a convenient shortcut. However, wrapOr(…) is not similar to wrap(…)->or(…) because wrap does not return an Option.

I propose to keep the mapOr method, thought?

Hywan commented 7 years ago

About the article you mentionned, this is not applicable here because an option is a monad, all the methods return a new option, there is no particular state (with a PHP implementation).

Hywan commented 7 years ago

https://github.com/hoaproject/Option has been released. I had to update hoa/consistency before that. Now it's time to update every library :-).