j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Short-circuiting logical expressions #19

Open certik opened 4 years ago

certik commented 4 years ago

Currently one cannot write things like:

if (i < size(a) .and. a(i) == 0) ...

This fails because the compiler is free to evaluate a(i) first, even if i is out of bounds. This proposal introduces .andthen. and .orelse. operators, which would allow the above code to be written as:

if (i < size(a) .andthen. a(i) == 0) ...

An alternative to this proposal is conditional expressions (#12), which got rejected.

Another option is to give .and. itself the .andthen. semantics. The argument in favor is that it introduces no new syntax; the argument against is that it would take away compilers' current freedom to rearrange logical expressions.

zjibben commented 4 years ago

Another option to consider is adding an optional keyphrase like shortcircuiting on, which might sit next to implicit none. This would require left-to-right short-circuiting behavior in the current scope for all logical operators. The benefit would be we can keep the .and. and .or. already in our codes, as opposed to adding the IMHO verbose .andthen. and .orelse. (which apparently mimics Ada and Pascal). It would also preserve the existing behavior for those who want it.

Personally I'd be happy to go as far as enforcing left-to-right short-circuiting in all logical expressions, which would adopt the behavior of other programming languages like C, C++, Python, Java, Lisp, Perl, JavaScript, Julia, R, Haskell, and so on. My argument would be for more predictable code and better programmer control. And this wouldn't break existing programs, because there currently isn't any standard-defined behavior for those programs to rely on (that I know of).

Of course, as you mentioned this is controversial. Some Fortran programmers like that some compilers might rearrange their logical expressions and short-circuit in some optimal-ish order, though other compilers might evaluate the entire statement regardless. And, breaking troublesome if-statements into multiple lines as below is usually not very disruptive. So there is the question of whether this is really worth rocking the boat over.

if (i < size(a)) then
  if (a(i) == 0) ...
end if
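
The same split also works inside a search loop, where a combined condition such as i <= size(a) .and. a(i) /= 0 is the tempting form. A minimal sketch, assuming a made-up helper find_first_zero and made-up test data, that never references a(i) out of bounds regardless of how the compiler orders evaluation:

```fortran
program safe_search
  implicit none
  integer :: a(5) = [3, 7, 0, 4, 0]

  print *, find_first_zero(a)          ! index of the first zero, or 0 if none
  print *, find_first_zero([1, 2, 3])

contains

  ! Hypothetical helper: returns the index of the first zero in a, or 0.
  pure function find_first_zero(a) result(idx)
    integer, intent(in) :: a(:)
    integer :: idx
    integer :: i

    idx = 0
    i = 1
    do
      ! The bounds check is a separate statement, so a(i) is never
      ! referenced with i > size(a), whatever the compiler reorders.
      if (i > size(a)) exit
      if (a(i) == 0) then
        idx = i
        exit
      end if
      i = i + 1
    end do
  end function find_first_zero

end program safe_search
```

The cost is a few extra lines compared with a single short-circuiting condition, which is the trade-off being weighed here.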

certik commented 4 years ago

The way to move this forward would be to gather use cases from real codes and see how often the compiler optimizes the logical expression well, versus letting the programmer do it explicitly. There might already be a compiler option to turn this on and off, in which case one can benchmark the code both ways. That would answer the question of whether compilers should optimize the logical expression or not.

klausler commented 4 years ago

As an implementor, I think that it would increase portability to our new compiler if we were to always implement short-circuiting in specific cases, viz.: scalar left operand to .AND. and .OR., and scalar selector operand to MERGE when its other two operands have the same rank. Most users are going to be far happier with me if IF (PRESENT(X) .AND. X > 10) doesn't mysteriously crash their code. We are considering documenting some short-circuiting guarantees, but acknowledge that we don't want to encourage the development of less portable Fortran code.
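
For reference, a sketch of the form that is portable under the current standard, where the presence test stands alone and X is only referenced inside the guarded block. The module and routine names (clamp_mod, clamp) are invented for illustration:

```fortran
module clamp_mod
  implicit none
contains
  ! Hypothetical routine: clamps x to 10 only when x was actually passed.
  subroutine clamp(x)
    integer, intent(inout), optional :: x

    ! Test present(x) on its own, then touch x inside the block.
    ! Writing "present(x) .and. x > 10" may reference an absent argument.
    if (present(x)) then
      if (x > 10) x = 10
    end if
  end subroutine clamp
end module clamp_mod

program demo
  use clamp_mod
  implicit none
  integer :: v

  v = 42
  call clamp(v)
  print *, v      ! v has been clamped to 10
  call clamp()    ! absent argument: nothing is referenced
end program demo
```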

If people want to invent explicit short-circuiting logical operators, I should point out that you don't have to necessarily give them their own new levels in the operator precedence hierarchy; e.g., .ANDTHEN. can have the same precedence as .AND., and would be least confusing if it did so.

gronki commented 4 years ago

I think introducing more and more switches like "shortcircuiting on", "implicit save off" etc. is a very dangerous practice and should be avoided. There should be one set of rules.

Is short-circuiting only a matter of introducing a new operator/function? From what I remember, this issue was brought up like 20 times on c.l.f, but the problem was that the language does not guarantee the order, or even the fact, of evaluation of any expression, so this one new operator would have to be the one exception. That makes me think an entirely new construct with a defined execution order must be introduced, to avoid rewriting half of the existing standard. I don't think it should be an operator like .and_then., because when it gets mixed with non-short-circuiting operators by skilled programmer scientists, that will be hell.

I suggest something along the lines of:

! what we want to do
if (present(x) .and. x > 10) x = 10
! warning: requires changing the standard
if (provided(present(x), x > 10)) x = 10
! will crash or not?
if (present(x).and.present(z).and_then.x>z+3) ...

Another option: since 95% of the examples I see involve present and optional arguments, maybe just introduce default values for optional arguments, and most complaints will vanish. (In 99% of other cases the order of evaluation actually doesn't matter.)
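
Until something like that exists, the usual idiom is a local copy that always holds a value. A rough sketch, with invented names (scale_mod, scaled, factor_) and an assumed default of 2.0:

```fortran
module scale_mod
  implicit none
contains
  ! Hypothetical function: multiplies y by factor, which defaults to 2.0.
  function scaled(y, factor) result(r)
    real, intent(in) :: y
    real, intent(in), optional :: factor
    real :: r
    real :: factor_   ! local copy that always has a value

    factor_ = 2.0                          ! assumed default
    if (present(factor)) factor_ = factor

    ! Only factor_ is used from here on, so no presence test has to be
    ! combined with other conditions in a single logical expression.
    if (factor_ > 0.0 .and. y > 0.0) then
      r = y * factor_
    else
      r = 0.0
    end if
  end function scaled
end module scale_mod

program demo_default
  use scale_mod
  implicit none

  print *, scaled(3.0)        ! uses the default factor
  print *, scaled(3.0, 0.5)   ! uses the supplied factor
end program demo_default
```

A language-level default value would remove the need for the extra local variable and the present test.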

aradi commented 4 years ago

I'd also prefer not to have an additional shortcircuiting on keyword. Adding more and more keywords like this would lead to a longish block at the beginning of each module, just to enforce/support 'modern' programming practices:

implicit none(type, external, ...)
shortcircuiting on
implicitsave off
...

I think the message such lines send is disastrous: 'You need a lot of gimmicks to make Fortran feel like a modern language'. When teaching Fortran, I already feel ashamed for having to explain why the implicit none line must be present in each module, and why I would reject student projects that lack it. A modern language should support / enforce modern programming techniques by default. I think we should rather have a collective option to specify the language version that the code in a given unit represents, which could then turn on all the beneficial options at once. (I actually created issue #83 for that.)

sblionel commented 4 years ago

The committee has discussed short-circuiting many times. The sentiment is generally against implicit short-circuiting, as the standard currently allows evaluation of any equivalent expression to any degree of completeness. Requiring short-circuiting would hinder some optimizations.

Instead, WG5 has already approved for the 202X worklist explicit short-circuiting. The current proposal is 18-239.

qolin1 commented 4 years ago

How about a SIF statement? Like IF, but it implements short-circuiting.

marshallward commented 4 years ago

I have recently just fixed many short-circuit issues in our codebase which were dormant for years and were only discovered because we recently enabled more aggressive initialization.

I mention this because people are already writing code as if it short circuits, and default behaviour often does not catch it as an issue.

certik commented 4 years ago

Here are the papers that were planned for the February Meeting #155.

https://j3-fortran.org/doc/year/18/18-274.txt https://j3-fortran.org/doc/year/18/18-239.txt https://j3-fortran.org/doc/year/18/18-152.txt

zjibben commented 4 years ago

Note that the above papers also discuss conditional expressions, such as a ternary operator which, unlike merge, does not evaluate all of its arguments.
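
To illustrate the difference: merge is an ordinary function reference, so both value arguments may be evaluated before the mask selects one. A small sketch with made-up values, showing the if construct that is safe today:

```fortran
program merge_vs_if
  implicit none
  real :: x, y

  x = 0.0

  ! With merge, both 1.0/x and 0.0 may be evaluated before the mask
  ! is applied, so this line could divide by zero:
  !   y = merge(1.0 / x, 0.0, x /= 0.0)

  ! An if construct only executes the selected branch.
  if (x /= 0.0) then
    y = 1.0 / x
  else
    y = 0.0
  end if

  print *, y
end program merge_vs_if
```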

aerosayan commented 9 months ago

"Another option to consider is adding an optional keyphrase like shortcircuiting on, which might sit next to implicit none"

It might become problematic.

Enabling short-circuiting at the top of a subroutine or module pollutes the whole scope, and potentially changes behavior throughout that scope.

I like the proposals of .andthen. and .orelse. because they're short, and limited in scope to a single logical comparison.