Open Leonard-Reuter opened 4 years ago
This is a very common request, so we should have an issue for it and document why such ideas were not implemented yet.
The last time I have seen this discussed is with this proposal: https://j3-fortran.org/doc/year/19/19-111r1.txt. It actually has reason why it was rejected.
(The +=
and *=
would also imply to have /=
, but that is already taken for .ne.
. The 19-111r1
paper thus introduces a new operator that could do the same.)
This is a very common request, so we should have an issue for it and document why such ideas were not implemented yet.
The last time I have seen this discussed is with this proposal: https://j3-fortran.org/doc/year/19/19-111r1.txt. It actually has reason why it was rejected.
(The
+=
and*=
would also imply to have/=
, but that is already taken for.ne.
. The19-111r1
paper thus introduces a new operator that could do the same.)
I think, just doing the two 'positive' operators would already help a lot. '-=' and '/=' are much less needed and can be circumvented like this: The hypothetical code:
a = 1.0_real64
do i=1, 10
a -= vector(i)
end do
Can be written as:
a = 1.0_real64
a *= -1
do i=1, 10
a += vector(i)
end do
a *= -1
And analogous for division (maybe in line with the loop statements, to make the purpose clearer?):
a = 1.0_real64
a = 1/a; do i=1, 10
a *= vector(i)
end do; a = 1/a
While this seems messy, it prevents the 'space ship operator' from being valid:
a -=- 1 ! this gives the same as 'a += 1'
The main point however should be, that '-=' and '/=' are much less needed. I would really advise against another operator for '/='. That would be quite counter-intuitive.
PS: since the referred paper was in parts rejected due to a missing use case, I will add one to the original post =)
Good point: the proposal would be to only add +=
and *=
. I would say the +=
is probably the most common in practice. I grepped one of my codes for i = i + 1
and I use just this particular string 7 times. I use j = j + 1
8 times, k = k + 1
3 times, etc., and that's just a small subset of all usages for +=
. So having +=
would simplify dozens of places in my own code for sure.
The important aspect of BCPL's assignment operator(s) is that the left-hand side of the assignment is evaluated only once; e.g., if Fortran were to acquire such a feature, in A(F()) += X
, there would only be one call made to F
.
The important aspect of BCPL's assignment operator(s) is that the left-hand side of the assignment is evaluated only once; e.g., if Fortran were to acquire such a feature, in
A(F()) += X
, there would only be one call made toF
.
I referenced your point in the original post.
Additionally, this feature should specifically state whether the augmented assignment is atomic or reducing or disallowed in some circumstances (shared variable in DO CONCURRENT
, &/or coindexed references).
I'd suggest also adding **=
to have the 'positive' augmented assignment operators complete.
For once I agree with the committee for not adding something. 😃 I think these variants are best left to C-based languages. I don't think they are very "Fortranic". Aesthetically I've always thought they were weird, and I don't think we really gain much by having them (except new confusion about /=
).
@Libavius wrote:
.. The hypothetical code:
a = 1.0_real64 do i=1, 10 a -= vector(i) end do
Can be written as:
a = 1.0_real64 a *= -1 do i=1, 10 a += vector(i) end do a *= -1
The above "hypothetical code" can be written using current Fortran standard as:
a = 1.0_real64 - sum(vector)
In fact you can write it as
a = 1 - sum(vector)
But I think @Libavius's point was to show how it can be used. A typical loop where I use i = i + 1
is:
i = 1
j = 1
do j = 1, size(nl)
n = nl(j)
zeta = zl(j)
l = ll(j)
do m = -l, l
nlist(i) = n
zetalist(i) = zeta
llist(i) = l
mlist(i) = m
i = i + 1
end do
end do
Where it is kept as a running index.
I am leaning towards @jacobwilliams's comment above (https://github.com/j3-fortran/fortran_proposals/issues/113#issuecomment-566687141) that i = i + 1
looks great and even though i += 1
is shorter, it is a new syntax, and so it might not be worth it in this case.
@FortranFan I know, thanks anyway for the hint.
@certik that was my point indeed. I also agree, that i = i + 1
looks good, however this is not always the case.
A more elaborate use case:
type Molecule_t
type(Atom_t), allocatable :: atoms(:)
contains
procedure, pass :: Get_electrons_number => Molecule_t_get_electrons_number
end type Molecule_t
contains
function Molecule_t_get_electrons_number(this) result(electrons_number)
class(Molecule_t), intent(in) :: this
integer :: electrons_number
integer :: i
electrons_number = 0
if (ALLOCATED(this%atoms) then
do i=1, SIZE(this%atoms)
electrons_number = electrons_number + this%atoms(i)%get_electrons_number()
end do
end if
end subroutine Molecule_t_get_electrons_number
The loop would–in my opinion–be easier readable like this:
do i=1, SIZE(this%atoms)
electrons_number += this%atoms(i)%get_electrons_number()
end do
@FortranFan I could write this with the SUM
intrinsic (but this would be slower, since an array would need to be allocated):
electrons_number = SUM([( this%atoms(i)%get_electrons_number(), i=1, SIZE(this%atoms) )])
EDIT: (or easier, but still with allocation, if Get_electrons_number
is elemental
electrons_number = SUM(this%atoms%get_electrons_number() )
This is in addition to the possible reduction of function calls that @klausler mentioned.
@jacobwilliams jup, the confusion with /=
ist quite painful and probably the strongest counter-argument...
@Libavius for your latest use case, I would recommend to change get_electrons_number()
to not be a function, but just a member variable. In fact I would recommend to change Molecule_t
to:
type Molecule_t
real(dp), allocatable :: electrons_number(:)
...
contains
procedure, pass :: Get_electrons_number => Molecule_t_get_electrons_number
end type Molecule_t
And then change the loop to just:
electrons_number = sum(this%electrons_number)
@certik But then, if I change my atoms array I always have to be careful and update electrons_number as well.
The idea in this case would be, that a molecule has atoms but an atom can also exist without a molecule. To avoid code duplication I don't want the molecule to have a member 'electrons_number' or implement atom-related functions again but rather iterate over its atoms.
The way I do this in my electronic structure codes is that I do not have an Atom type at all. So if your molecule has only one atom, then the length of all the arrays will be just 1. There is no code duplication and in fact things run much faster.
@septcolor I've also suggested something like these be included in Fortran (#30 ). I feel they are useful because they allow you to expose derived type components, safe in the knowledge that if you need to refactor the data storage layout you can always add a getter in future. They are thus extermely useful for encapsulation, which @certik's approach breaks.
@cmacmackin indeed, the approach I suggested is flat and exposed. It's not encapsulated. It has the huge advantage that it works well with current Fortran, and typically optimizes really well and runs fast, and it's actually typically simpler in terms of lines of code (as demonstrated above). The disadvantage is that if you change the underlying representation, you need to rework your whole code. So with my approach, one has to design things well, but one can use the current Fortran language. For the encapsulated approach, Fortran does not have the best tools currently.
As for the syntax, what about Espen Myklebusts proposal in #80, using the @
operator representing the LHS of an assignment? So @septcolor s example could be writen as
my_data(idat) % its_sub_obj(iobj) % another_sub_obj(isubobj) = @ + 1
It would not cause any conflicts with current operators, so it could be also used for division (I would find it really strange, if *=
were introduced but in place division won't work...). And it had the benefit, that the @
operator (representing the LHS) could appear even more than once on the RHS, as in
array(i,j)%comp1(k,l)%value1 = abs(@ - sqrt(@))
(example by Espen Myklebust).
@aradi is your idea different to the proposal I linked at https://github.com/j3-fortran/fortran_proposals/issues/113#issuecomment-566223095, which was rejected?
@certik Yes, Espen Myklebusts proposal has in my opinion a considerably lower complexity, as the one you have linked in. The @
operator would always represent the entire LHS, while in your proposal it can also represent a part of it (as in a ( s @ (3*i+1) ) = b(s)
). It does not involve naming, so no conflicts with eventual implicit statements must be feared and there is no need to introduce a new (statement level) scope.
@aradi thanks for the clarification. (For the record the 19-111r1
proposal is not my proposal, I just linked to it.)
As for the syntax, what about Espen Myklebusts proposal in #80, using the
@
operator representing the LHS of an assignment? So @septcolor s example could be writen asmy_data(idat) % its_sub_obj(iobj) % another_sub_obj(isubobj) = @ + 1
It would not cause any conflicts with current operators, so it could be also used for division (I would find it really strange, if
*=
were introduced but in place division won't work...). And it had the benefit, that the@
operator (representing the LHS) could appear even more than once on the RHS, as inarray(i,j)%comp1(k,l)%value1 = abs(@ - sqrt(@))
(example by Espen Myklebust).
I think this'd be superb elegant =)
EDIT: After some time, I'm not so sure anymore. It is also a bit of syntactic overload...
I like the idea of using a symbol in the right-hand side to denote the current value of the left-hand side, but @
is accepted as a legal character in identifiers as an extension in many compilers (as is $
as well). #
would conflict with preprocessing directives if it were used and it appeared at the beginning of a continuation line. '*' would be very Fortran-like but would lead to ambiguity (X=Y*****Z
), as would /
. So that leaves us %
and ?
, I think.
I think the reasons outlined why augmented assignment probably won't work out are true. But I would be in favor of the "pronoun" proposal that @certik mentioned. There are a variety of languages starting to support this kind of feature (most commonly in languages that have a pattern matching feature). One might think of this as basically syntactic sugar for an associate block with only a single line in it.
For example:
v @ x(3*i+1) = v * 42 + b(i)
would be directly equivalent to:
associate(v => x(3*i+1))
v = v * 42 + b(i)
end associate
So this could basically be treated as syntactic sugar, not really a new functionality.
I think the reasons outlined why augmented assignment probably won't work out are true. But I would be in favor of the "pronoun" proposal that @certik mentioned. There are a variety of languages starting to support this kind of feature (most commonly in languages that have a pattern matching feature). One might think of this as basically syntactic sugar for an associate block with only a single line in it.
ASSOCIATE()
could be extended slightly to accommodate a single statement as well as a block.
Perhaps, it is too late to discuss this but somehow I have not seen anyone mentioning another important aspect of the compound/augmented assignment: It allows to perform more efficient in-place operations, such as big_array(:, :, :) += big_array2(:, :, :)
, without the risk of creating temporaries. This makes it more than just a syntactic sugar.
Another similar example is that if we have a type like this
type :: BigObject
real(dp) :: values(10000, 10000)
end type
type(BigObject) :: a, b
and there is a frequent need to do operations like this -- a % values = a % values + b % values
, then a naive overloading of operator (+) would result in an extremely inefficient code. It would be great if one has a possibility to overload something like (+=) [with obvious call semantics] to implement heavy in-place operations.
There is a renewed discussion at the J3 mailinglist about this feature.
I created a prototype implementation in a compiler: https://gitlab.com/lfortran/lfortran/-/merge_requests/1286
ASSOCIATE can't be used to form a mutable association with every variable. Variables with vector-valued subscripts, and coindexed references, become immutable copies of expressions if they appear as "right-hand" sides of selectors in ASSOCIATE & al. If you want to use ASSOCIATE to get a mutable abbreviation of the variable of an assignment-stmt, it won't always work.
I prefer the idea of using a symbol in the expr of assignment-stmt to denote (a copy of) the variable, e.g. "var = @ + expr". It would work with any operation (including those spelled like .AND.
) and it allows repetition, use of the functions (e.g., var=transpose(@)
), and the syntax doesn't look like it implies atomicity. (If @
can't be used, there's other symbols that would work syntactically.)
Ada extends the notion of '@' so that you can do things like this:
c(i,j) = cmplx(real(@) + 1.0d0, aimag(@) * 3.0d0, real64)
The '@' can stand-in for the entity on LHS as many times as needed and as a sub-expression on the RHS.
So, this generalizes what you could achieve with operator assignments like +=.
PS: It's required that the expression that is bound via '@' is evaluated only once, so that it's clear how many times functions and the likes are evaluated then they occur multiple times as part of a '@'.
Thanks @klausler and @mjklemm for the comments!
I do quite like the idea of a symbol for a shorthand for the lhs. I feel almost certain that using @
for that would conflict somehow with its use as the rank-agnostic subscript feature, but I can't quite put my finger on exactly how. Hopefully we can come up with some other agreeable symbol. $
perhaps? I don't believe it's been used other than in format specifications, but I could be mistaken.
Some further thought about the use of a special symbol (say ..
here) to represent the entire previous value of the left-hand side of an assignment-stmt is leading me to think that it would be a worthwhile feature to consider.
+=
, as x = .. + y
.ints(:) = permute(..)
), function reference actual arguments, and even designators.ASSOCIATE
can't handle (viz., vector-valued subscripts & coindexed variables).+=
to imply atomicity to the naive reader.x = (.. * y) + z
).Another discussion of this proposal: https://fortran-lang.discourse.group/t/updating-assignment-operators/2129
First of all, I think that the idea (originally by Espen Myklebusts, I suppose) with a LHS designator instead of dozens new operators is great and offers a syntactically simple and flexible way of enhancing assignments. I could only suggest to use an existing Fortran association syntax to avoid introducing additional symbols. This way, the combined assignment-operation would look like:
(lhs => container % array(:, :, i)) = lhs + fun(lhs)
(c => complex_array(i, j)) = cmplx(real(c) + 1.0_real64, aimag(c) * 3, kind=real64)
etc. This is essentially a more "fortranic" development of the idea by @everythingfunctional .
Such a construction can be considered as a one-line ASSOCIATE block, but, as @klausler has remarked, the restrictions inherent to the ASSOCIATE statement can be considerably relaxed in the new construction. Typical restrictions of the assignment statement should rather be applied.
The advantage of such a syntax is that it does not introduce any new lexer elements and does not seem to create a clash with existing constructions.
To me, the precise goal of the augmented assignment, as discussed here, is still unclear. Should it be simple syntactic sugar to abbreviate code (which would be for sure useful), or should it represent new operators, which one can also extend for user defined types (useful when designing efficient OOP-based abstract calculus concepts)?
To me just +=
and *=
would be the most useful. The extra operators seem quite complicated to see what is going on.
A couple of points (from someone interested in having either +=
style operators or any kind of shorthand notation for assignments of the form lhs = lhs .op. foo
):
+=
, *=
, etc. in other languages, there is an obvious issue with /=
in Fortran that does not seem to be resolvable in an elegant way (introducing another symbol for division or skipping it entirely cannot be considered as elegant). This leads to a well-motivated opposition from the Committee to such operators, qualifying them as "non-fortranic" (Steve Lionel expressed this explicitly on several occasions). Furthermore, introducing new operators is not encouraged because technically Fortran already allows a user to define infinitely many custom operators by means of the dot notation.@opeil @certik
In order to resolve the issue with \=
how about adding a dot in between the operator and "=":
This would give:
+.=
, -.=
, *.=
, /.=
, **.=
Furthermore it can be applied to any custom operator .op.
as .op.=
(dropping one dot)
I don't know, if this is 'fortranic' enough, but it would resolve the issue with /=
and the dot would be borrowing the operator dot notation. On the other hand, it would still be very much intelligible for any reader.
- All constructs considered here can be considered as syntactic sugar.
That is not the case with C's augmented assignment operators, not is it true with the proposals above.
2.. That said, they also change the semantics of the operation, sending a clear message to the compiler that the operation should be done in-place if possible. The usual semantics of Fortran implies that the RHS is first evaluated and then assigned, which by default requires a temporary storage.
This is also not the case. Instances where an augmented assignment statement might require use of a temporary would also be in need of a temporary if written without the augmentation.
- Inasmuch as I myself like to use augmented operators
+=
,*=
, etc. in other languages, there is an obvious issue with/=
in Fortran that does not seem to be resolvable in an elegant way (introducing another symbol for division or skipping it entirely cannot be considered as elegant). This leads to a well-motivated opposition from the Committee to such operators, qualifying them as "non-fortranic" (Steve Lionel expressed this explicitly on several occasions). Furthermore, introducing new operators is not encouraged because technically Fortran already allows a user to define infinitely many custom operators by means of the dot notation.
Fortran allows new user-defined operators but not new assignment statement symbols.
The most general proposal for augmented assignments -- namely, the introduction of a symbol to represent the previous value of the left-hand side of the assignment -- avoids the problems with /=
.
The most general proposal for augmented assignments -- namely, the introduction of a symbol to represent the previous value of the left-hand side of the assignment -- avoids the problems with
/=
.
I like a lot the idea of having a placeholder for the lhs, as this would abbreviate long expressions in many cases. But then, it would be still syntactic sugar only, as programmers won't be able to extend it for their own types in order to provide efficient in-place implementations for them. (As one would be able to do by overriding the +=
operator in C++ or the __iadd__
method in Python...)
The most general proposal for augmented assignments -- namely, the introduction of a symbol to represent the previous value of the left-hand side of the assignment -- avoids the problems with
/=
.I like a lot the idea of having a placeholder for the lhs, as this would abbreviate long expressions in many cases. But then, it would be still syntactic sugar only, as programmers won't be able to extend it for their own types in order to provide efficient in-place implementations for them. (As one would be able to do by overriding the
+=
operator in C++ or the__iadd__
method in Python...)
Neither an augmented variant of assignment (+=
) nor a symbol that refers to the previous value of the left-hand side of an assignment are just "syntactic sugar". They must have semantic guarantees that the expressions on the left-hand side of the assignment are not evaluated more than once, and this property is not one that can be preserved with simple rewriting transformations. For example, rewriting A(IFUNC()) += 1
to A(IFUNC()) = A(IFUNC()) + 1
would probably be a bad idea.
One of the reasons the alterative approach of using a symbol that designates the previous value of the left-hand side is more natural than augmenting variants of assignment(s) is that you can use your user-defined OPERATOR(+)
in place, as it were, without having to also define an ASSIGNMENT(+=)
, if such a thing were possible. This might not be what you'd want if your +=
had semantics that were not the same as your +
in the context of assignment (e.g., +=
is somehow atomic), but for many use cases the "left-hand side symbol" approach will save boilerplate.
Maybe there should be an independent issue proposing the LHS symbol? While its usage has some overlaps with augmented assignment, it is really a different thing as it extends way beyond what augmented assignment can do, but is less intuitive to use and read (especially for people who switch from other languages).
@klausler OK, I see your point. But, then, similar to what have been proposed above, an extension of the current associate
construct may already be enough to enable it? Actually, I think, by introducing the statement form of associate
one could already have it (or do I miss something obvious?)
associate(lhs => A(IFUNC())) lhs = lhs + 1
associate(c => complex_array(i, j)) &
& cmplx(real(c) + 1.0_real64, aimag(c) * 3, kind=real64)
It won't be as compact as other forms proposed, but it would be for sure a very natural extension of the current language.
Actually, augmented assignment operators (+=
, *=
, etc.) only make sense to me, if
+
and *
have augmented assignment equivalents, but -
and /
not) ANDIf any of the two is not given, I'd rather go with the LHS-association. That seems to fit more to the current language and would allow for a much more flexible usage.
ASSOCIATE
doesn't create a modifiable association for designators containing vector-valued subscripts or coindexes today. If your left-hand side "associate" syntax doesn't support those, it's too limited, and if it does, it shouldn't use the same keyword. We also don't need an name for the left-hand side; there's can't be another assignment statement in its scope, so a single left-hand side symbol suffices.
In the renewed discussion mentioned in https://github.com/j3-fortran/fortran_proposals/issues/113#issuecomment-905758722, there appears the idea that
A /= B
might actually work, since assignment isn't allowed in expressions, so there should be little risk of mixing up
IF( A /= B ) THEN ! or
L = A /= B
with
A /= B
as assignment for both compilers and human code readers. While not ideal, I'd take such a solution gladly over lines such as
forceFromExternalSource = scaleFactor * forceFromExternalSource
allow writing
instead of
and
instead of
I think this feature is quite important, since the lack of it misleads programmers towards the use of (too) short variable names: I would write like to write this:
But maybe prefer this
over this abomination of a line =)
This should not end in a discussion whether to use long variable names or not, let's just agree, that they are better sometimes.
This is a second use case
This is a third (and fourth) use case