Open certik opened 4 years ago
Which of these do people find the most readable and least readable?
My own feeling (from most readable to least readable):
We should also consider more use cases. I also asked at https://fortran-lang.discourse.group/t/202x-feature-conditional-expressions/329 to get more feedback on this feature.
To be honest I find both variants suboptimal (the first one would make an if/else/endif block return a value when being used inline, and simply run code as usual when not, but would be easier to read) and I wonder whether there is really a need for general conditional expressions or whether a ternary operator would be sufficient. In the latter case I would suggest the Python-based syntax (which one could even nest):
CALL SUB(A, B, C, D IF PRESENT(D) ELSE (EPSILON(X) IF (X < 1) ELSE SPACING(X)))
@dev-zero in your opinion, can you rate by readability / preference the three options above plus your proposed Python-based syntax?
The ternary operator (that you presumably like) is the "? syntax" which is one of the proposed ideas, but you also say they are "suboptimal", so I am confused.
From most to least readable:
Caveat: I use 2 in Python a lot so I'm used to it, thus my preference.
FWIW my preference from most to least readable is:
Aside the reduced readability, IMHO I cannot see any practical advantage in the new proposed expressions.
My preference in terms of readability:
As for why I dislike the ?
syntax and the difference to the Python-inspired operator:
A general start and end marker ?
for conditional expression would to me indicate that more complex conditional expressions should be possible at some point, with the only condition that all codepaths inside it have to return a compatible type, but that is with the current proposal not the case (nor should it be). On the other hand does the nesting as shown in the example above actually create (at least visually) an alias for ELSE IF
: the :?
, since the expression should actually be:
CALL SUB(A, B, C, ? (PRESENT(D)) D : ? (X < 1) EPSILON(X) : SPACING(X) ? ?)
correct? With the second to last ?
omitted? Which I would find even less readable.
I think the readability question might slightly be missing the point. The question isn't about how easy is it to see what the code is doing, but how easy is it to see the code's intent.
To me, an if-else
block says the intent of the code is to do different things based on some conditions. The new feature allows one to specify the intent "this variable (or argument) depends on some condition(s)" more clearly.
To use the example from above, if I see
IF (PRESENT(D)) THEN
CALL SUB(A,B,C,D)
ELSE IF (X<1) THEN
CALL SUB(A,B,C,EPSILON(X))
ELSE
CALL SUB(A,B,C,SPACING(X))
END IF
It's not immediately obvious (especially in codes that are more complicated than this example) that we are definitely going to be calling sub
in all cases, and the only difference is the value of the last argument. Whereas with
call SUB(A, B, C, IF (PRESENT(D)) THEN (D) ELSE IF (X < 1) THEN (EPSILON(X)) ELSE (SPACING(X) END IF)
that intent is put front and center without having to compare each branch in an entire if-else
block.
So, my preferences would be for
If the idea is to call SUB
just once, to make the intent obvious, as @everythingfunctional correctly points out, then here is the alternative to the original:
IF (PRESENT(D)) THEN
D2 = D
ELSE IF (X<1) THEN
D2 = EPSILON(X)
ELSE
D2 = SPACING(X)
END IF
CALL SUB(A,B,C,D2)
Which I personally find more readable than:
CALL SUB(A, B, C, IF (PRESENT(D) THEN D ELSE IF (X < 1) THEN EPSILON(X) ELSE SPACING(X) END IF)
But it is true you have to declare an extra variable (although the performance of the code should be identical with a good compiler).
Is a subset of this a good candidate for a stdlib function if_then_else(condition, expression1, expression2)
?
Is a subset of this a good candidate for a stdlib function
if_then_else(condition, expression1, expression2)
?
It would be, but merge
already would be sufficient, and the idea of the proposal is to not evaluate the unused expression (which merge
can't do).
@certik , while you've removed the difficulty of seeing "do we always call SUB
in every case", you've now introduced the ambiguity of "do we always assign to D2
in every case?" You're back to the exact same problem. You're using a multi-statement block to perform a single logical operation. A reader must inspect and evaluate every branch (the block as a whole) to understand it is performing a single logical operation.
My understanding is that this feature was approved by WG5 for inclusion into 202X based on a survey, where people expressed a wish to have conditional expressions in Fortran, but I don't think there were concrete proposals how it would look like, just that the feature would be nice to have. @sblionel is that an accurate statement?
@everythingfunctional explained well the main argument for conditional expressions is that they enforce a single logical operation to be assigned somewhere (as the result of the conditional expression).
I agree that at this level of "requirements" it seems like a good idea and I am not against that (I use it sometimes in Python, although rarely; I never use the ?
notation in C or C++). But when it comes down to syntax and how it would actually look like in Fortran, it seems less readable in practice to many people, and that is would should count. If a feature looks good in abstract terms, but does not look good in concrete (syntax) terms, in my opinion we should not put it in until it looks good both in abstract and concrete terms.
A prior compiler implementation of this would be very helpful, so that we can play with it more, before putting it into the language.
Adopting ?:
for a conditional expression but using a difference syntax from those languages which already use it for the same purpose (i.e. C, C++, Java, Javascript), is a bad idea and will only lead to confusion. As someone who uses multiple languages, I would appreciate it if Fortran did not do anything weird here and go off the beaten path for no apparent reason.
I forgot: why aren't we just fixing MERGE()
to guarantee non-evaluation of the operand that isn't selected?
@klausler the arguments that have been put forth against fixing merge
are:
MERGE is elemental. Conditional expressions are not.
Conditional expressions are a top-level single selection, so we can lift all the dynamic requirements that merge
needs
MERGE requires TSOURCE and FSOURCE to have the "same type and type parameters"
I don't personally understand the arguments, but I think @everythingfunctional does? Brad, can you summarize here why we cannot extend merge
? If we could just extend merge
, that would be the best way forward I think.
My understanding is that this feature was approved by WG5 for inclusion into 202X based on a survey, where people expressed a wish to have conditional expressions in Fortran, but I don't think there were concrete proposals how it would look like, just that the feature would be nice to have. @sblionel is that an accurate statement?
Yes. The way we work is that WG5 outlines the general idea and J3 develops that into a specific proposal.
merge
is elemental, and as such must evaluate both expressions to determine the shape of the result. The example is, if one argument to merge is a scalar and the other is an array, even if the scalar is the selected value, the array expression must still be evaluated to determine the shape of the result, because the result will still be an array with that shape, just with each value equal to the given scalar. The proposed conditional expressions would not be capable of that, because each expression must have the same rank.
@klausler , I actually proposed just that on the J3 discussion board, but Malcolm was able to give me convincing reasons against, along the lines of my previous comment.
@everythingfunctional cannot the shape in merge
be determined at compile time for both arguments?
But MERGE()
could guarantee evaluation of at most one of its first two arguments in the case where the third argument is scalar and the first two arguments have the same rank, yes?
@everythingfunctional cannot the shape in
merge
be determined at compile time for both arguments?
Rank, yes, apart from assumed-rank dummy arguments, but not shape.
But
MERGE()
could guarantee evaluation of at most one of its first two arguments in the case where the third argument is scalar and the first two arguments have the same rank, yes?
I suppose that's true. In fact, one could go a small step further and state that the "unused" argument is evaluated only if necessary to determine the resultant shape. I.e., if the array argument is selected, the scalar argument wouldn't need to be evaluated.
However, that still wouldn't satisfy one of the use cases (although it's not one I find particularly compelling); conditionally supplying an optional argument. I think the proposed conditional expressions provide a clean way to provide that, with a convenient place to put the desired deferred evaluation functionality.
conditionally supplying an optional argument
This use case should by done by #22, that seems like a cleaner solution anyway.
So extending merge
and implementing #22, we might be able to cover this feature.
I'd prefer to see something like
select
case (present(d))
call sub(a,b,c,d)
case (i<n) then case (a(i)==0)
call sub(a,b,c,epsilon(x))
case default
call sub(a,b,c,spacing(x))
end select
Or possibly use switch
instead of select
For example, say if x < 0.1
I don't want to provide an argument to some procedure, how would merge be sufficient? What would you provide as the other argument? I.e.
call sub(a, merge(???, x, x < 0.1))
@everythingfunctional how would that be written using conditional expressions? I don't think that's possible either, or am I missing something?
The proposal specifically allows for the else
part to be omitted in cases of passing to optional arguments. So the example would be (in the case of the keyword syntax)
call sub(a, if (x >= 0.1) then (x))
I see, I didn't realize that. Btw, I think the syntax is:
call sub(a, if (x >= 0.1) then (x) end if)
or
call sub(a, if (x >= 0.1) then x end if)
But adding parentheses around x
actually makes it more readable to me.
Well, we can have some kind of syntax or keyword for merge, such as call sub(a, merge(*, x, x < 0.1))
or something like that, but I am not sure I like it.
The proposal specifically allows for the
else
part to be omitted in cases of passing to optional arguments. So the example would be (in the case of the keyword syntax)call sub(a, if (x >= 0.1) then (x))
The syntax (x)
is an idiomatic way to turn a variable into an expression. Will (x)
above be a variable? If so, would one need ((x))
to pass a copy of x
as an expression?
Maybe the concept of a conditionally-passed optional argument and a general conditional expression should be separated.
Oops, yeah I forgot the end if
in my example.
I believe the parentheses are present in all of the examples in the syntax paper. I think they may be necessary to disambiguate in fixed-form syntax.
Btw. this:
call sub(a, if (x >= 0.1) then (x) end if)
Doesn't look that bad compared to the original:
if (x >= 0.1) then
call sub(a, x)
else
call sub(a)
end if
I still slightly prefer the original in this case, but perhaps for some more complicated case I would sometimes use the conditional expression.
Did not have time to read all the comments so this might have been covered but the closest I can think of using current standard Fortran would be
CALL SUB(A,B,C,MERGE(D, MERGE(EPSILON(X),SPACING(X),X<1.0), PRESENT(D)))
In a hopefully working example that would be
program whatif
implicit none
call testit(0.0)
call testit(1000.0,100.0)
call testit(10.0)
contains
subroutine testit(x,d)
real,intent(in) :: x
real,intent(in),optional :: d
real :: a,b,c
CALL SUB(A,B,C,MERGE(D, MERGE(EPSILON(X),SPACING(X),X<1.0), PRESENT(D)))
write(*,*)a,b,c,merge(d,-99999.0,present(d))
end subroutine testit
end program whatif
subroutine sub(aa,bb,cc,dd)
real,intent(out) :: aa,bb,cc
real,intent(in) :: dd
write(*,*)'SUB SEES=',dd
aa=dd;bb=2*dd;cc=dd**2
end subroutine sub
Results:
SUB SEES= 1.19209290E-07
1.19209290E-07 2.38418579E-07 1.42108547E-14 -99999.0000
SUB SEES= 100.000000
100.000000 200.000000 10000.0000 100.000000
SUB SEES= 9.53674316E-07
9.53674316E-07 1.90734863E-06 9.09494702E-13 -99999.0000
So using that as a comparison and looking at the limitations it has in the new syntax if the fourth parameter were metamorphic or if the routine was generic with different types allowed for D would all the expressions have to be of the same type or not?
Would everything always be evaluated or would it short-circuit? One of the issues with the above statement would be if D/2 instead of D were passed to SUB() and D was not present evaluating D/2 would be problematic, as MERGE is allowed to evaluate all the expressions.
That being said I would prefer something less C-like that short-ciruits with something closer to functional syntax like
call sub(a,b,c, if(present(d) <= d ;x<1.0 <=epsilon(x); spacing(x)) )
where the semi-colon is intentional and means to short-circuit and where it looks like a function called IF that has expressions of the form condition <= expression to use and the last field which has no condition would be the default and if there were no last expression it would be the same as not present. maybe a name of USE() instead of IF() would be good. The C syntax is one of my least-favorite syntaxs in any language.
so my preference would be 1) original 2) something other than either of the proposals
how about
call sub(a,b,c,IF(present(d) USE d; x<1 USE epsilon(x); spacing(x) ) )
in case the parameter does not have INTENT(IN) this could then be used like
myvar=IF(PRESENT(d) USE d; x<1 USE EPSILON(x); SPACING(x) )
call sub(a,b,c,myvar)
I have implemented a prototype of this into LFortran: https://gitlab.com/lfortran/lfortran/-/merge_requests/645/
Example notebook: https://nbviewer.jupyter.org/urls/gitlab.com/lfortran/lfortran/-/raw/477cac615ca420377c7582f16294d05461bf7b8d/share/lfortran/nb/Conditional%20Expressions.ipynb
It doesn't implement the whole proposal, but it implements enough to be able to play with it.
Two more syntax forms have been proposed.
Mixed form:
IF ( logical-expression-1 ) ? expression-1
[ : ( logical-expression-k ) ? expression-k ] ...
: ( expression-n ) ENDIF
examples:
if (i>0 .and. i<=size(a)) ? a(i) : (present(val)) ? val : 0.0 endif
CALL SUB(A, B, C, IF (PRESENT(D) ? D : (X < 1) ? EPSILON(X) : SPACING(X) END IF)
Arrow form:
( predicate -> consequent
[ : predicate -> consequent ] ...
: alternative )
Examples:
( abs(residual)<=tol -> 'ok' : 'did not converge' )
( i>0 .and. i<=size(a) -> a(i) : present(val) -> val : 0.0 )
CALL SUB(A, B, C, ( PRESENT(D) -> D : (X < 1) -> EPSILON(X) : SPACING(X) ))
Another suggestion is to amend the "arrow" form to simply use ?
as in C (motivation: if the syntax is so close to C and C++, we should make is exactly the same, to prevent confusion):
C syntax:
Examples:
( abs(residual)<=tol ? 'ok' : 'did not converge' )
( i>0 .and. i<=size(a) ? a(i) : present(val) ? val : 0.0 )
CALL SUB(A, B, C, ( PRESENT(D) ? D : (X < 1) ? EPSILON(X) : SPACING(X) ))
The committee discussed yesterday why merge
will not work, so I want to summarize it here:
merge
has other uses that would not all work for conditional expression usage, the main one is that merge
is elemental. The solution could be to use a new "function" such as ifthen
.ifthen
using a function like syntax has the following drawbacks:
ifthen(.., ifthen(.., ifthen()), ..)
ifthen
would look like a function, but does not work like a function, as some arguments will not be evaluated. ifthen
at this point becomes a new syntax with new semantics that is not function like. We might as well introduce new syntax that does not look like a function call, such as all the alternatives under consideration above.If the sub-clauses in an if were allowed to return a value - a bit like a lambda function in C++ you could have
call sub(a, b, c, if (present(d)) then d; elseif (x > 0.1); x; else; endif)
Perhaps clearer with different brackets:
call sub(a, b, c, if {present(d)} then d; elseif {x > 0.1}; x; else; endif)
Readability seems to be a problem with all the options so the second is probably preferable.
Did anyone considered the python's ternary syntax?
call sub(a, b, c, d if (present(d)) else (epsilon(x) if (x > 0.1) else spacing(x)))
dropping the parenthesis as they aren't needed in this case:
call sub(a, b, c, d if present(d) else epsilon(x) if x > 0.1 else spacing(x))
conditional-expr: consequent IF predicate ELSE (alternative | conditional-expr)
I pretty sure it has at least been mentioned, but I believe there was some resistance by compiler writers to syntax that requires look-ahead (or back-tracking) parsers. It's also not particularly "Fortranic".
I pretty sure it has at least been mentioned, but I believe there was some resistance by compiler writers to syntax that requires look-ahead (or back-tracking) parsers. It's also not particularly "Fortranic".
It was mentioned here by @dev-zero, indeed, but I don't think it got the deserved attention. It's way more "Fortranic" than '?' operator (from C family) IMHO, without the boilerplate of the original proposal:
then
then
can be omitted if there's only one consequence in if statement, in the current standard it's called logical IF.end if
I can't say anything about the writers preferences, but I must say in a recent experience I've failed to write this kind of nested pattern in order to parse a simple key-value file xD.
<conditional-expr> ::= <then-expr> IF <scalar-logical-expr> ELSE (<else-expr> / <conditional-expr>)
Is XIFLELSE(1.)
a "conditional expression" involving variables X and L or is it a reference to an external function?
Is
XIFLELSE(1.)
a "conditional expression" involving variables X and L or is it a reference to an external function?
So you are saying that the fortran tokenizer get rid of all whitespace... hmpf that would be a problem indeed :/
X IF L ELSE (1.)
turns into XIFLELSE(1.)
which is ambiguous (I think I figured out where I've failed in my VDF now xD)
Is
XIFLELSE(1.)
a "conditional expression" involving variables X and L or is it a reference to an external function?
In my opinion, with fixed-form source having been made obsolescent, at some point we should be allowed to stop worrying about how new features can expressed without significant whitespace. And so, XIFLELSE(1.)
is a function call, and X IF L ELSE (1.)
is a conditional expression. Unless there's some other reason I'm missing.
It was discussed at the October 2020 Fortran call that another good use case for this feature would be for array initializers. In Python you can do (I corrected the syntax, thanks to @ivan-pi's comment below):
[x+1 if x >= 45 else x+5 for x in range(1, N+1)]
So in Fortran you could do:
[ (if (x >= 45) then x+1 else x+5 endif, x = 1, N) ]
Since conditional expressions are just expressions, this should just work.
In Python you can do:
[x+1 for x in range(1,N+1) if x >= 45 else x+5]
This triggers a SyntaxError in Python. The correct way would be:
[x + 1 if x > 45 else x+5 for x in range(1,N+1)]
So the syntax is
expression_if_true if condition else expression_if_false
Edit: following Python if expression syntax, the example from above would be:
CALL SUB(A, B, C, D IF PRESENT(D) ELSE (EPSILON(X) IF X < 1 ELSE SPACING(X)))
It seems fairly nice, you just need to remember the condition is in the middle.
Edit2: Oops, I just noticed the posts from @14NGiestas above.
New syntax paper being proposed at the June 2021 J3 Committee meeting:
New syntax paper being proposed at the June 2021 J3 Committee meeting:
What a mess! Tell me again how this is supposed to be better than a MERGE()
intrinsic with stronger non-evaluation guarantees, please.
@klausler thanks for the feedback. I personally think we should not do this feature at all (i.e. NO on the 21-157 proposal), as the two syntaxes (keyword and ?) seem worse than not doing this (based on the feedback both online above, as well as in private that I got). We can pursue the merge()
idea and see if we can come up with a proposal.
@klausler thanks for the feedback. I personally think we should not do this feature at all (i.e. NO on the 21-157 proposal), as the two syntaxes (keyword and ?) seem worse than not doing this (based on the feedback both online above, as well as in private that I got). We can pursue the
merge()
idea and see if we can come up with a proposal.
I'm sure we've talked about this. In short, strengthen MERGE()
so that when its mask (3rd) argument is scalar, and the TSOURCE&FSOURCE arguments have the same type and rank, it guarantees that exactly one of its TSOURCE/FSOURCE arguments are evaluated, and returns the appropriate value. Or define a new intrinsic function with these guarantees.
The idea of modifying MERGE was discussed several meetings ago (I can't find which one). I liked the idea, but there were complaints that it would slow down MERGE for everyone and it failed.
Relevant papers:
Taking the second example from the
18-274
paper:One proposed syntax is "keyword syntax":
The second proposed syntax is "? syntax":