nim-lang / RFCs

A repository for your Nim proposals.

RFC: unified quasi-bool handling #95

Closed Bulat-Ziganshin closed 5 years ago

Bulat-Ziganshin commented 6 years ago

We have several PRs (#8358 #8369 #8366) made by @krux02 and @PMunch proposing short-circuit evaluation schemes for Option[T] and other types.

This can eventually lead to a zoo of incompatible decisions and once again make the language irregular and thus harder to learn. I can propose two ways to make this feature more orthogonal:

Approach#1 - implement it in a generic way:

Each type T that declares itself as quasi-bool should implement toBool(x: T): bool.

proc `or`[T](x, y: T): T =
  if toBool(x): x else: y

proc find_first_nonempty[T](c: Container[T]): Option[T] =
  for x in c:
    if toBool(x):
      return some(x)
...

This allows us to define a lot of operations in a generic way, define a hasToBool concept, and optionally redefine if/while/... to accept quasi-bools as conditions. Eventually, by defining the single toBool operation, a type will get access to plenty of operations, including ones defined in application code.

Users will need to learn these operations only once and can then apply them to any type with a "nil value" concept, instead of learning a new set of operations for every new type.
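A minimal sketch of how Approach #1 could look in practice. The names toBool, orElse and firstNonEmpty are hypothetical and used only for illustration; orElse is deliberately not called `or`, so that it stays clear of the existing overloads. Nothing here is taken from the referenced PRs.

```nim
import std/options

# Hypothetical opt-in hook: each "quasi-bool" type supplies its own toBool.
proc toBool(s: string): bool = s.len > 0
proc toBool[T](o: Option[T]): bool = o.isSome

# Written as a template so the second argument is evaluated only when needed.
template orElse[T](x, y: T): T =
  let first = x
  if toBool(first): first else: y

# One generic definition serves every type that provides toBool.
proc firstNonEmpty[T](items: openArray[T]): Option[T] =
  mixin toBool
  for item in items:
    if toBool(item):
      return some(item)
  none(T)

doAssert "".orElse("fallback") == "fallback"
doAssert some(1).orElse(none(int)) == some(1)
doAssert firstNonEmpty(["", "a", "b"]) == some("a")
```

A type that defines only toBool gets both orElse and firstNonEmpty without further work, which is the reuse the approach is after.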

In particular, I would be glad if we copied Haskell's operation names for Maybe :)

Approach#2 - define guidelines

Just define guidelines - what the extended 'or' and other ex-bool operations should do, and what they cannot do. Go with ad-hoc definitions. And maybe later change to the first approach.

mratsim commented 6 years ago

I'm for approach 2, extending NEP-1 or having a NEP-2 on keywords meaning.

Why not approach 1?

If I have the following BigInt and BigUint types

type
  BigUint = object
    foo: seq[uint]

  BigInt = object
    foo: seq[uint]
    isNegative: bool

I want

proc `or`[T: BigInt or BigUint](x, y: T): T =
  ...

to be bitwise or.

The fact that reset and to a lesser measure == work on all objects is already annoying in certain cases. And I've seen complaints about $ https://github.com/nim-lang/Nim/issues/8023 and https://github.com/nim-lang/Nim/issues/8149.

Having generic procs not constrained to specific types (like seq, SomeInteger ...) should be done with care.

Bulat-Ziganshin commented 6 years ago

I just learned that or is already defined in Nim as a bitwise operation for integer types. It's really hard to build on such an irregular base :)

So, I feel that it may be better to reserve the first approach for its own set of operations such as '&&' or andThen. It's just too late to make a generic definition for and/or since they already support multiple use cases - boolean ops, bitwise ops and now we may add quasi-boolean ops.

OTOH, as long as you don't pretend to have toBool(BigInt), defining a generic 'or'[T: hasToBool](x,y:T) is still possible without contradicting 'or'(x,y:BigInt).
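A hedged sketch of that coexistence, assuming a hypothetical HasToBool concept and a simplified BigUint (the limbs field is made up for this example). Since BigUint provides no toBool, the concept-constrained `or` does not apply to it and only the bitwise overload matches.

```nim
type
  BigUint = object
    limbs: seq[uint]   # hypothetical layout, least significant limb first

# Only string opts in to the quasi-bool protocol here.
proc toBool(s: string): bool = s.len > 0

type
  HasToBool = concept x
    toBool(x) is bool

# Generic fallback-style `or`, visible only to types that provide toBool.
# As a proc it evaluates both operands eagerly.
proc `or`[T: HasToBool](x, y: T): T =
  if toBool(x): x else: y

# Type-specific bitwise `or` for BigUint.
proc `or`(x, y: BigUint): BigUint =
  let (longer, shorter) =
    if x.limbs.len >= y.limbs.len: (x, y) else: (y, x)
  result = longer
  for i, limb in shorter.limbs:
    result.limbs[i] = result.limbs[i] or limb

doAssert ("" or "b") == "b"
doAssert (BigUint(limbs: @[0b0101'u]) or BigUint(limbs: @[0b0011'u])).limbs == @[0b0111'u]
```

The separation holds only as long as nobody later adds a toBool for BigUint, which is the kind of fragility mratsim's comment warns about.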

metagn commented 6 years ago

I don't like the idea of toBool, as it bites my ass too often in Groovy, and most other languages that have ||, or, or even ?: only check for nilness/definition. Instead, depending on which one you want more, have a nil coalescing operator like ??, or an "either nil or empty" coalescing operator (empty if it.isEmpty is defined and is true, or if it.len is defined and is 0) like ifempty or whatever.

Bulat-Ziganshin commented 6 years ago

@hlaaftana The point is that Nim already defines

And #8369 will define

So, we have to determine general meaning of and/or (i.e. guidelines) and choose whether we want to implement them in ad-hoc or concept-based way.

OTOH, new operators &&/|| can be defined in a 3rd-party library, so they don't need any discussion.

Bulat-Ziganshin commented 6 years ago

On second thought, the two approaches don't contradict each other. We should describe the guidelines in the form of a possible implementation, and provide this implementation for types conforming to the concept. So, a type may either define toBool and get all the procs implemented automatically, or provide its own implementation. It can also combine both paths, e.g. define toBool(Option[T]) plus a non-standard or(Option[T], T).

So, let's try to define guidelines:

A type T is defined as Nullable if it can define proc isEmpty(x: T): bool. Such a type can define procedures 'or'... which are expected to follow this template:

template `or`[Nullable](x, y: Nullable): Nullable =
  ## When ``x`` isn't empty, then return ``x``, else return
  ## ``y``. Evaluate ``y`` only when necessary.
  let xx = x
  if not xx.isEmpty:
    xx
  else:
    y

Alternatively, it can define proc isEmpty(x: T): bool and import module XXX to get these definitions automatically, as well as the definition of the following concept:

type Nullable = concept x
    x.isEmpty is bool

Araq commented 6 years ago

Bulat-Ziganshin commented 6 years ago

(Not a proposal)

Using default, we can define and/or/iif for all types with default values:

proc iif[T, U: hasDefault](a: T; b, c: U): U =
  # Selects b when a differs from its type's default value, otherwise c.
  if a != default(T):
    b
  else:
    c

template `or`[T: hasDefault](a, b: T): T =
  let x = a
  iif(x, x, b)

proc `and`[T: hasDefault](a, b: T): T =
  iif(a, b, default(T))

timotheecour commented 6 years ago

/cc @Bulat-Ziganshin what is hasDefault? Is there any type where default(T), as proposed here https://github.com/nim-lang/Nim/issues/8485, would not work?

Bulat-Ziganshin commented 6 years ago

@timotheecour

type hasDefault = concept T:
  default(T) is T

I described how a default value can be used to implement and/or for types like Option[T]. But overall, my definition of and/or based on default(T) looks TOO broad.

timotheecour commented 6 years ago

@Bulat-Ziganshin still same question: is there any type T you can think of where default(T) won't work? if not, hasDefault doesn't really do much.

Any limitations to default would be good to know, for my PR https://github.com/nim-lang/Nim/pull/8490 that introduces default

Bulat-Ziganshin commented 6 years ago

I used the hasDefault concept to limit my and/or definitions only to types providing default(T). That's all. So it doesn't show any limits of default(T), only limits of my definition. If you ask because default(T) is supposed to be available for ANY T - sorry, I don't know that.

Araq commented 6 years ago

@timotheecour It's not clear. Nim assumes default(T) exists and yet ref T not nil has no default.

andreaferretti commented 6 years ago

The nice thing about overloading is that you can use it for ad hoc polymorphism (each type can implement an operation in its own way) rather than generic polymorphism (a single definition works for each type). Let each type implement or in the way that makes the most sense for the type itself; generalizations like this are just not useful.

timotheecour commented 6 years ago

is there any type T you can think of where default(T) won't work

@Bulat-Ziganshin to answer my own question, I just found a type T where const a = default(T) doesn't compile, see https://github.com/nim-lang/Nim/issues/8521; however, this is (I hope) a bug that can be fixed.

Araq commented 6 years ago

Just because default(T) exists doesn't imply that const a = default(T) exists.

PMunch commented 6 years ago

I think a lot of the confusion comes from people thinking of or only as the logic operator. But as pointed out here, that isn't true in Nim since it's also the bitwise or (whether or not these are different is another question; bitwise or on two single-bit values would work exactly the same as the logic operator). So we already have a logic operator or and a bitwise or, but those are not all. In the jsffi module we have an or procedure which converts into the JavaScript || operator, which "Returns expr1 if it can be converted to true; otherwise, returns expr2. Thus, when used with Boolean values, || returns true if either operand is true". This is not what either of the previously discussed or operators does. But there is more: or can also be used between typedescs to create a type meta-class. There is also an or in the rstgen module to discard the first of two strings if it is nil. There is even an or in asyncfutures that returns a new future which completes when either of the passed-in futures is completed.

So as you can see Nim already uses the or operator for multiple completely different things. And IMO there is nothing wrong with this. All of these uses follow some logical definition of the word or and with a limited set of operators it's the best we can do. The proposed or in the options module is exactly the same as the || found in bash and can (along with all the previous examples) be explained with the word or just fine.

Nim has a strong type system and it will quickly warn you if you manage to use any of the above ors in a way that doesn't make sense. In my opinion brevity and conciseness are better than being overly verbose; I'd hate to see Nim devolve into C#/Java territory with things like "CombineTwoFuturesInAnOrLikeManner", or "ReturnFirstOptionIfHasValueOtherwiseSecond". As for what to do about the diversity of or (and other operators), I'd say we should just make sure that none of them does something completely illogical, maybe create a nice little table showing the different ones, and let editor support and compiler errors pick up the slack for inexperienced users.
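For reference, three of the meanings of or listed above can be shown with the system module alone. This is only an illustration; IntOrString and describe are hypothetical names.

```nim
# Logical `or` on bool.
doAssert (false or true) == true

# Bitwise `or` on integers.
doAssert (0b0101 or 0b0011) == 0b0111

# `or` between types builds a type class that accepts either type.
type IntOrString = int or string

proc describe(x: IntOrString): string = $x

doAssert describe(7) == "7"
doAssert describe("seven") == "seven"
```

In each case the operand types decide which or is selected, so mixing them up (for example, true or 1) is rejected at compile time.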

drslump commented 6 years ago

So as you can see Nim already uses the or operator for multiple completely different things. And IMO there is nothing wrong with this. All of these uses follow some logical definition of the word or and with a limited set of operators it's the best we can do.

While I get the overall feeling of the comment and I'm well aware of the huge differences between Nim and other less flexible languages, it's not the same to overload operators in a given project or library because they match perfectly with the domain than including those overloads in the language itself. In Nim it's great that we have the option to make very expressive and concise code, but in my opinion that's simply a tool that users should opt-in to use or not, the language itself should strive to be as minimal and regular as possible otherwise its surface is too big and full of pitfalls for newcomers.

I'm really liking the language but in its current state I wouldn't push for its usage at work, not because it doesn't have a 1.0 release but because its cognitive load feels huge and while I'm willing to put the hours to master it I can't impose that at the work place. I think that's a problem for the language adoption in the future, it's not the number of features that matter but how much information you need to have in your head to work with the language confidently.

Araq commented 6 years ago

I think that's a problem for the language adoption in the future, it's not the number of features that matter but how much information you need to have in your head to work with the language confidently.

As I keep asking ... what to remove from Nim?

Bulat-Ziganshin commented 6 years ago

@Araq maybe and/or/xor/not on integer types

But generally speaking, the problem is the non-orthogonality of Nim features, which reveals the absence of preliminary language planning. People just implement small features they need today, making Nim extremely practical, but built out of small independent features - as opposed to e.g. Haskell, which is built from a few larger ones.

Araq commented 6 years ago

Now that's somewhat offensive, Nim definitely was planned but it's always easy to do better in hindsight. And the bitwise operators were taken from Delphi where nobody complained about them. It's just that Nim is now more popular than Delphi, I guess/hope.

drslump commented 6 years ago

... what to remove from Nim?

Everything that is not streamlined. Partially agree with @Bulat-Ziganshin, non-orthogonal features are very hard to explain... But sometimes they are needed! they just should be gated in my opinion so they are not abused and keep the eyes open to see if a new feature could be generalised to support that previous functionality.

A simple thing that would make the language much more accessible is to avoid boasting about its meta programming capabilities. It's really cool what the language can do but I would only expose a limited version of template by default, all the other machinery can be used internally by the compiler/stdlib to build other features but it should be gated so only experienced developers go for it when actually required. On its simpler form that's just documenting those features in a separate manual and not much more.

Another easy thing is to focus on a single way to format strings, that's like the first feature used after a hello world example, there should be a standard way to do it and the rest can be deprecated or gated if they offer some advantage in some cases.

Anyway, I don't want to hijack the thread, let me get more experience with the language and I'll provide a more thorough answer in a while.

andreaferretti commented 6 years ago

@drslump Uh? The metaprogramming features of Nim are the very reason most users are attracted to it in the first place!

drslump commented 6 years ago

@andreaferretti indeed! I'm also in that boat 😄

I'm not saying that they should be removed, they are best in class! Just saying that someone who hasn't done any meta-programming before shouldn't be exposed to it in her first week using Nim, and she should be productive even if not using those features. As a library author (open source or at work), once you've gathered more experience with the language you can safely explore its full meta-programming potential and, more importantly, you'll probably have acquired the knowledge to expose that potential cleanly.

Bulat-Ziganshin commented 6 years ago

@Araq: Nim definitely was planned

So we can blame you :) Anyway, Nim has a lot of small practical features, and IMHO it is the most practical language on the market. But gluing multiple small features together into larger topics requires careful language design work, which you have not performed.

I'm trying to recollect examples of unorthogonality, but can come up with only two small ones. First, different syntax for the same things in proc f[HERE](AND THERE) - I think it should be fixed by allowing an optional "type" in square brackets - proc f[t: type...](). Second, the "seq[int]" vs "ref int" syntax - maybe allow both "seq int" and "ref[int]" too? I.e. allow writing any "Generic[subtype]" as "Generic subtype" too.

Actually I miss the "array[0..9,0..9] of int" syntax, so maybe even allow writing "Generic[x1,x2,x3,subtype]" as "Generic[x1,x2,x3] of subtype" with an optional "of"?


About and/or: AFAIR, their duality was introduced right in the original Pascal. Then Ada replaced boolean ops with "and then" and "or else", emphasizing their short-circuit evaluation.

Nim goes even farther - it short-circuits boolean ops AND supports ops on user-defined types. Now, you can only guess whether errcode1 or errcode2 will be short-circuited or not. So, these two differences compared to the original Pascal make using the same operator name objectionable.
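The difference can be made visible with a small sketch. ErrCode, check and isReady are hypothetical names, and the user-defined `or` is written as an ordinary proc, which is an assumption about how such an overload would typically be declared.

```nim
type ErrCode = distinct int

var evaluations = 0

proc check(code: int): ErrCode =
  inc evaluations
  ErrCode(code)

proc isReady(v: bool): bool =
  inc evaluations
  v

# User-defined `or` as a proc: both operands are evaluated before the call.
proc `or`(a, b: ErrCode): ErrCode =
  if int(a) != 0: a else: b

let first = check(1) or check(2)
doAssert int(first) == 1
doAssert evaluations == 2   # right operand ran although the left was non-zero

evaluations = 0
let ready = isReady(true) or isReady(false)
doAssert ready
doAssert evaluations == 1   # built-in boolean `or` skipped the right operand
```

Declaring the overload as a template instead, as in the Nullable template earlier in the thread, would restore lazy evaluation of the second operand, but the call site looks identical either way.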

Araq commented 6 years ago

But gluing multiple small features together into larger topics requires careful language design work, which you have not performed.

The only reason Nim doesn't disallow "overloading" these operators further is that Nim in general doesn't restrict what system.nim can do either. But sure, let's use this one nitpick edge case to discredit my work.

PMunch commented 6 years ago

Yeah I have to disagree with you here @Bulat-Ziganshin and @drslump. One of the major selling points of Nim is its flexibility. Being able to tailor the language to the task at hand really is one of the key features that pulls people in to using Nim. And macros, as with any powerful tool, must be used with caution. Of course they can cause confusion and seem intimidating to new users. But at the same time the best macros are things you don't even notice. Things like the %* macro in the JSON module, which makes it super easy to work with JSON in Nim. Nim doesn't have to be written explicitly for a certain task since every programmer can use the same tools and build upon the language to create a language that works for them. And for new users it's probably less confusing to have a clean and readable syntax that might do some magic in the background for you, but in the end does what you would expect.

I can agree that the stdlib sometimes has some inconsistencies and quirks. And there are multiple open issues on making things more coherent across the various modules. As for this issue, I think the important thing is just to know that or and and are not always the boolean or and and but could be different things. This could possibly be added somewhere to a manual page, but I think 90% of wrong uses would be picked up by the type system so I don't think it's a big issue.

krux02 commented 5 years ago

I am closing this as rejected. Proposal 1 is rejected, because generics that are not constrained to specific types for common operators will cause problems. They will be matched when they should not be, and error messages that were simple become complicated. Like mratsim said:

Having generic procs not constrained to specific types (like seq, SomeInteger ...) should be done with care.

The guidelines could be added, but since they were never formulated here I don't see a reason to keep this RFC alive.