Open milancurcic opened 4 years ago
Sadly, this would require removing the faulty feature of implicit save (#40), and we already know where this is gonna go.
This has also been requested by some of my colleagues at LANL. It is common to have a subroutine 50 or 100 lines long, with loops, that is not really feasible to split into further subroutines. There it would help to declare variables where they are used: the code is simpler to read if you can see the type of a variable close to where it is used.
Agreed, this would be very helpful. If it is too controversial, perhaps we could allow variables to be declared with local scope at the start of block constructs (if/then statements, loops, etc.). Essentially we'd be saying that each of them is implicitly a block construct.
@gronki Can you explain why? Implicit save would remain regardless of where declaration happens. Maybe I'm missing something.
Being able to declare the variable right where it's given a value is going to make people want to do it all on one line... and then disaster will strike.
I think there is a way to fix #40 that's backwards compatible (several approaches are suggested there), so if there is enough support for this from the Fortran community, we can make this happen.
> Being able to declare the variable right where it's given a value is going to make people want to do it all on one line... and then disaster will strike.
Good! It'll make people learn that pitfall quickly.
I still don't see a technical conflict with implicit save, hopefully @gronki can explain. As far as I can tell they're compatible both in terms of syntax and semantics.
> > Being able to declare the variable right where it's given a value is going to make people want to do it all on one line... and then disaster will strike.
> Good! It'll make people learn that pitfall quickly.
I find it hard to characterize this as a good thing, myself.
@klausler Do you think it's a bad thing, or neutral, and why?
My reasoning for why it's good: Implicit save is an obscure feature. If a novice Fortran programmer is not aware of it, the sooner they encounter it the better, because that will be an opportunity to learn about it and (hopefully) never use it again.
In contrast, as the novice Fortran programmer builds knowledge, experience, and code-base, while still not becoming aware of implicit save, it becomes more likely that this will appear as a bug in code that shipped, whether it's an open source library, or a student class project.
The existence of the implicit `SAVE` pitfall is the bad thing.
That I agree with, and has little to do with what I wrote. :)
@milancurcic, they might seem compatible, but they are not. Note that in the following code:
```fortran
subroutine do_xyz
  integer :: counter = 0
  integer :: i
  read (*, *) i
  counter = counter + i
end subroutine do_xyz
```
you can kind of forgive what the designers had in mind: since variable declarations sit in what can be considered a "header" of the code block, the value zero is in some way pre-assigned before the first execution. Whereas if you declare it mid-code:
```fortran
x = y + z
real :: w = 0
w = w + x
```
it is quite obvious that in the third line you expect that w = 0. Which, in this case, will be true only in the first call. Thus leading to quietly producing incorrect results.
I think static/save variables should remain declared in the beginning of the block only. Otherwise it's making the implicit save problem only deeper and more burning.
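A minimal sketch of the pitfall under discussion, written in standard Fortran with the declaration in its usual place. The names `implicit_save_demo` and `accumulate` are made up for illustration. The only point is that the initializer on `w` implies `SAVE`, so `w` is zero on the first call and keeps its value afterwards.

```fortran
program implicit_save_demo
  implicit none
  integer :: first, second

  first = accumulate(1)    ! w starts at 0, so the result is 1
  second = accumulate(1)   ! w kept its value from the first call, so the result is 2

  print *, 'first call :', first
  print *, 'second call:', second

  ! Self-check: stop with an error if the saved value did not persist.
  if (first /= 1 .or. second /= 2) error stop 'unexpected result'

contains

  integer function accumulate(x) result(res)
    integer, intent(in) :: x
    integer :: w = 0       ! the initializer gives w the SAVE attribute
    w = w + x
    res = w
  end function accumulate

end program implicit_save_demo
```

Moving the initialization into a separate assignment (`w = 0` as the first executable statement) would make every call start from zero.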
And I agree with @cmacmackin that allowing variables to be declared in more places (at the beginning of if blocks, within loops, in the do loop header similarly to forall) would probably resolve most of the problems. As long as you can see the declaration on the same screen as the assignment, it should be good.
Last but not least, the implicit save feature is so toxic that I think the committee should consider making an exception to the "backwards compatibility" argument here. Like forall, this feature never really caught on, but it has led to lots of confusion and is misleading. Compiler vendors will probably maintain some compatibility mode for the rare customers that use it.
> it is quite obvious that in the third line you expect that w = 0. Which, in this case, will be true only in the first call.
This is the argument against implicit save. It holds regardless of where you put the declaration statement.
> Thus leading to quietly producing incorrect results.
No -- this is the correct behavior according to implicit save, whether we like it or not.
As I understand it, you're talking about what's intuitive and what we'd expect some code to do. I'm talking about whether they are technically compatible. They still seem very much so!
Yes, I admit there is an issue if a positive feature (declare anywhere) makes a negative feature (implicit save) even more negative. I explained in the earlier message why I think this is a sum-positive.
If I could help in any way, I'd love to see implicit save deprecated and eventually deleted, and declare anywhere enabled, concurrently. But I think two separate issues are conflated here. They should be differentiated.
I don't think an existing negative feature should be a roadblock to implementing other, positive features to the language.
I don't think these can be differentiated. Intuitiveness is one of the primary factors behind design. It's also one of the main reasons why we all consider implicit save a bad design. I cannot see how making a bad design catastrophic will be a net plus, no matter the benefits. In this weighing you can either care more about what can be added to the language (your approach) or what should be forbidden in the language to avoid mistakes (my approach). I think we have to agree to disagree here. ;)
OK, thanks for clarifying -- it's a matter of design rather than a technical issue. I didn't understand that from your original message and I agree with that part.
Making a negative feature appear more negative helps your persuasion when you argue for removing that feature. Things that are more obviously bad are more difficult to ignore. This proposal will boost the argument against implicit save.
I would object to allowing declarations anywhere, as variables tend to be used throughout large sections of code and the reader would have to scan the entire procedure to find the declaration.
If you're truly using a variable in a limited context, use `BLOCK` - that's what it's for.
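For readers who have not used it, here is a small sketch of what the `BLOCK` construct already allows in Fortran 2008 and later. The program name and the variables `values` and `total` are invented for the illustration and do not come from the thread.

```fortran
program block_scope_demo
  implicit none
  integer :: i
  real :: values(5)

  values = [(real(i), i = 1, size(values))]   ! 1.0, 2.0, ..., 5.0

  block
    ! total is declared next to its only use and exists only inside this BLOCK
    real :: total
    total = sum(values)
    print *, 'total =', total
    if (abs(total - 15.0) > 1.0e-6) error stop 'unexpected total'
  end block

  ! total is not accessible here; referencing it would be a compile-time error
  ! under implicit none.
end program block_scope_demo
```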
Thanks Steve. I understand the argument to keep this restriction because the reader then always knows that the declarations are on top, without exception.
Removing this restriction would affect only newly written code. I don't know, but I think that most uses of this would be to declare the variable immediately before its first use. In my opinion, there's value to this, especially in the era of modern text editors with easy pattern search through the file. If nothing else, it allows programmers to adopt the preferred development practice for their project.
Yes, technically `block` could be used for this, but limiting the local scope for variables is not the intent of this proposal.
@sblionel @klausler is there any reason from the compiler implementation/performance side for not allowing variable declarations at the beginning of constructs like `if` and `do`? From what I know, in C++ such variables are usually easily optimized out, but in code they contribute to readability.
I have been using `block` for a while and I think it's a genius and versatile construct. I structure my codes a lot using it. But perhaps the ability to declare variables could be extended to a few more constructs. For example, `block` behaves somewhat similarly to `do i = 1,1` (executed once). Very often variables are used within a loop and are only relevant for the local context. I see having to declare a variable `total` at the top of the procedure just to use it within one nested loop as very bad for cleanliness and readability. (Or even worse, using it in a few different loops with different meanings.)
I still don't know if declarations everywhere are the best idea, but clearly there must be a way to make it easier to restrict the scope of variables.
I like this a lot as well. If there is to be a proposal for reducing the restriction, it should bring to the table and discuss both (and any other) approaches.
The grammar of the Fortran language pretty clearly distinguishes the specification part of a subprogram from its execution part. While a few statements (e.g., `FORMAT` and `DATA`) can appear in each (at least today), the two parts are almost like distinct programming languages. Older compilers that perform semantic analysis as they parse would complete the definitions in the symbol table when the parse detected the transition from the specification part to the executable part.
Today, the only concern that I might raise about intermixing some (which?) specification statements into the executable part is that it would slow down the parser on all codes -- it'll have more possibilities to consider in the execution part. But I wouldn't want to raise such a concern without first prototyping the feature and measuring the performance difference.
I doubt that it would be prohibitive, and I hope that it's not, because -- at least in the case of type-declaration-stmt with a "no implicit SAVE" option -- such inline usage greatly improves readability of code in other languages. I consider it a best practice in C++ to always initialize each local variable as I declare it with the value of an expression that usually involves references to other declared and initialized locals, similar to a `let` or `where` block in Haskell (apart from ordering).
So the potential value of this feature depends on solving the "implicit `SAVE` due to initialization" problem, since it seems less useful absent initializers on the intermixed declarations. I suggest that a combined proposal be drafted.
I'm not sure it helps much to allow other specification statements to leave the specification-part.
> Agreed, this would be very helpful. If it is too controversial, perhaps we could allow variables to be declared with local scope at the start of block constructs (if/then statements, loops, etc.). Essentially we'd be saying that each of them is implicitly a block construct.
I see how this could be done for the internal scope of if/then statements and do loops, but I cannot see it done in a nice way for the iterator of a loop. Would the declaration statement be directly above the do loop, or on the same line like `do integer i = 1, n`? Perhaps fit it in the newly suggested modern do, i.e. the `do (integer :: i, j: i=1:n, j=1:m)` construct (see https://github.com/j3-fortran/fortran_proposals/issues/85)?
While I agree this is one of the features I like most from C++, like @sblionel suggested, in Fortran it can already be done with the `block` construct (I found some nice examples here). Of course the solution matches the general verbosity of Fortran. A good editor with snippet completions can help to reduce the typing effort.
Perhaps we just need a preprocessor to convert curly brackets into block constructs? :wink:
```
if (condition) then {
  ! do something
} end if
```
I don't see how this issue interferes with "implied save" any more than intermixing statements and declarations interferes with the "static" attribute in C or C++. Compilers for those languages handle intermixing both auto and static declarations with statements quite well; so it shouldn't be a problem for Fortran compilers to use similar methods to handle intermixed declarations in conjunction with the "save" attribute.
Also, I don't see why "implicit save" should be removed, either. If you need a way to prevent "save" being implicitly attached to a local, then provide either a compiler switch for this (e.g. "-auto") or appropriate a keyword for this; e.g. "implicit no save" when implied save is enabled and "implicit save" when implied save is disabled, even "implicit save(A-Z)", or "implicit no save(A-Z)" (or maybe also "implicit none (save)"). A standard can then, over time, change the default from "implicit save until otherwise disabled" to "implicit no save until otherwise disabled". Recompiling an older module would then require only a change in compiler switches, with no intrusion into the module itself and minimal disruption of existing programs.
This was requested also on Discourse: https://fortran-lang.discourse.group/t/declare-variables-anywhere/2179.
> Agreed, this would be very helpful. If it is too controversial, perhaps we could allow variables to be declared with local scope at the start of block constructs (if/then statements, loops, etc.). Essentially we'd be saying that each of them is implicitly a block construct.
I strongly agree with this approach, i.e. any construct (`do`, `if`, `where`...) would also implicitly be a `block` construct. OK, it would be just syntactic sugar to replace
```fortran
do i=1, n
  block
    <declarations>
    <instructions>
  end block
end do
```
nonetheless it would be more readable. And it would not be a big deal for the compilers...
I actually find this solution more structured than allowing declarations anywhere.
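The long form that the suggested sugar would replace can already be written today. The following is only a sketch of that existing pattern; `loop_block_demo`, `squared` and `grand_total` are names chosen for the example, not taken from the thread.

```fortran
program loop_block_demo
  implicit none
  integer :: i
  integer :: grand_total

  grand_total = 0
  do i = 1, 3
    block
      ! squared is local to the BLOCK nested in the loop body
      integer :: squared
      squared = i * i
      grand_total = grand_total + squared
    end block
  end do

  print *, 'grand_total =', grand_total          ! 1 + 4 + 9
  if (grand_total /= 14) error stop 'unexpected sum'
end program loop_block_demo
```

Under the "every construct is an implicit block" idea, the `block` / `end block` pair would be dropped and `squared` declared directly after the `do` statement. That form is not valid in any published standard.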
Until the implicit `SAVE` problem is fixed, having local declarations (with explicit `BLOCK` or not) won't cover the motivating use case of declaring and dynamically initializing a variable with a limited scope of use. So long as one has to declare and then dynamically initialize a variable as two distinct statements, you might as well declare it once at the top of the subprogram.
@klausler Your answer also invalidates the principle of declaring the variable in the `block` construct. Still, it's there and many people use it (me included), primarily because it is useful. And I almost never use `save`, implied or not. I can accept the opinion that declaring all variables before the instructions is a better practice, but refusing a (very) small evolution of the language because some people are confused by a largely unrelated feature (one that has little chance of being fixed in the near future) is something I don't really understand.
The original proposed solution already works in LFortran:
```fortran
program a_long_program
  integer :: a
  a = 42
  ! many lines of code follow
  integer :: b
  b = 2 * a
  ! many lines of code follow
  integer :: c
  c = b**2
  print *, 'The result is ', c
end program a_long_program
```
It gives:
```
$ lfortran a.f90
The result is 7056
```
Regarding the implicit save, that is also "fixed" by LFortran as follows. The following code:
```fortran
program implicit_save
  call f()
contains
  subroutine f()
    integer :: c = 5
    c = c**2
    print *, 'The result is ', c
  end subroutine
end program
```
gives:
```
$ lfortran a.f90
warning: Assuming implicit save attribute for variable declaration
 --> a.f90:8:14
  |
8 | integer :: c = 5
  |            ^^^^^ help: add explicit save attribute or initialize in a separate statement
Note: Please report unclear or confusing messages as bugs at
https://github.com/lfortran/lfortran/issues.
The result is 25
$ lfortran a.f90 --no-warnings
The result is 25
```
If you find any bugs related to these two extensions, please let us know.
@certik In your LFortran implementation, I assume that the scope of the variables that are declared inside control structures like if/then/else blocks follows the same rules as in C?
@jacobwilliams
> Being able to declare the variable right where it's given a value is going to make people want to do it all on one line... and then disaster will strike.
Actually I don't think so, because very quickly people will try to write something like `real :: z = x + y`, and it won't compile (with `x` and `y` being ordinary variables, the initializer is not a constant expression). Then they will (normally) try to understand why it doesn't compile.
> I assume that the scope of the variables that are declared inside control structures like if/then/else blocks follows the same rules as in C?
We don't allow it inside control structures yet. What should the scope be? It seems the C rule is the most "natural". So it would be equivalent to a Fortran "block" inside the control structure.
> So it would be equivalent to a Fortran "block" inside the control structure.
I like the idea that every control structure would also be an implicit `block`; it looks very natural to me, and more structured and "fortranic" than the "declare anywhere" version. It's quite common to need a variable just inside a loop, just inside an if/endif block, etc., and I'm sure it covers 99% of the needs. The variables would still be declared at the beginning of these implicit blocks, which means a very minor change in the standard (just: "A control structure is also an implicit block").
I think "control structure is an implicit block" is a separate (but related) proposal. I don't think it would solve the use case in the description of this issue, but both proposals should be designed together.
If existing blocks (in the sense of the standard) were to be naively interpreted as if they were `BLOCK` constructs, some existing code would become invalid and/or silently change behavior.
Can you give some examples?
I haven't dug into the standard, but p. 169 of "Modern Fortran Explained (F2018 edition)" says:
> Adding a block construct to existing code has no effect on the semantics of the existing code.
That claim turns out to not be the case in at least three distinct ways with the standard and/or existing compilers.
Maybe you're right, but what are the cases? Regarding the citation of "Modern Fortran Explained", probably it should be:
> Adding a block construct to existing execution part of a code has no effect on the semantics of the existing code.
```fortran
if (.true.) then
  1 format('hello')
  print 1
end if
```
violates F'2023 C1109 if the consequent block is wrapped in a new `BLOCK` construct.
```fortran
if (.true.) then
  data j/1/
  j = j + 1
end if
print *, j
```
violates F'2023 C1109 if the consequent block is wrapped in a new `BLOCK` construct. Worse, for the majority of Fortran compilers that interpret `DATA` in a `BLOCK` construct as a declaration of a symbol, it would no longer initialize the symbol `J` in the containing scope.
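To make the second fragment concrete, here is a sketch of it as a complete program as it behaves today, without any `BLOCK` wrapped around the `if` body. It assumes a compiler that accepts `DATA` among executable statements (an obsolescent placement, so a warning is possible). The program name and the explicit declaration of `j` are additions for the example.

```fortran
program data_in_if_demo
  implicit none
  integer :: j

  if (.true.) then
    data j /1/     ! initializes j in the scope of the program, before execution starts
    j = j + 1
  end if

  print *, j
  ! Self-check: j was initialized to 1 and incremented once.
  if (j /= 2) error stop 'unexpected value of j'
end program data_in_if_demo
```

If every `if` body were silently treated as a `BLOCK` construct, this same source would either be rejected or, depending on how the compiler reads `DATA` inside a block, stop initializing the outer `j`.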
OK, thanks. In the second example, if a block is overlaid on the `if` construct, ifx complains that `data` shall not initialize a variable that is host-associated. That indeed looks like the right interpretation. The problem here is that `data` can appear in the execution part, and that is indeed a no-go for the implicit blocks.
For the first example, what is the reason why a block shall not begin with a `format` statement (actually the same question holds for `data`)?
Regarding compilers that interpret `data` in a block as a declaration that is local to the block, it seems that they are wrong.
> Regarding compilers that interpret data in a block as a declaration that is local to the block, it seems that they are wrong.
Why?
I promise you, I research these things pretty thoroughly when they come up during the implementation of a Fortran compiler. The standard contains nothing that determines the issue one way or another, which happens more often than it should. And in such cases, the most portable interpretation is usually best for users.
I'm looking at this draft document. It's true that the text of the standard is quite unclear on this point; however, there are two notes that disambiguate it IMO (they were also present in the F2008 standard):
p128, Note about implicit typing and blocks; it is clearly stated that in case of implicit typing, the scope of a variable is not limited to the block even if it doesn't appear outside of the block:
> Implicit typing is not affected by BLOCK constructs. For example, in
> ```fortran
> SUBROUTINE S(N)
>   . . .
>   IF (N>0) THEN
>     BLOCK
>       NSQP = CEILING (SQRT (DBLE (N)))
>     END BLOCK
>   END IF
>   . . .
>   IF (N>0) THEN
>     BLOCK
>       PRINT *,NSQP
>     END BLOCK
>   END IF
> END SUBROUTINE
> ```
> even if the only two appearances of NSQP are within the BLOCK constructs, the scope of NSQP is the whole subroutine S.
p191-192; although `Z` appears first within a `block`, its scope is considered to be the whole subroutine:
```fortran
SUBROUTINE S
  . . .
  SAVE
  . . .
  BLOCK
    REAL X              ! Not saved.
    REAL,SAVE :: Y(100) ! SAVE attribute is allowed.
    Z = 3               ! Implicitly declared in S, thus saved.
    . . .
  END BLOCK
  . . .
END SUBROUTINE
```
Although these two notes do not mention the `data` case, they describe very similar cases and it looks inconsistent to me to use a different rule for `data`.
Note that ifx does comply with these 2 notes, while gfortran doesn't.
Those examples aren't relevant to `DATA`, where compilers differ in their interpretation of `DATA` as a declaration or not in `BLOCK`.
From the standard, I would say that `data` is not a declaration at all. What these notes essentially say is that implicitly typed variables are not local to a block.
> From the standard, I would say that `data` is not a declaration at all.
R507 declaration-construct IS ... data-stmt
> What these notes essentially say is that implicitly typed variables are not local to a block.
```fortran
BLOCK
  PARAMETER(J = 123) ! implicitly typed
  ALLOCATABLE A(:)   ! implicitly typed
  DIMENSION B(1)     ! implicitly typed
  EXTERNAL FUNC      ! implicitly typed
  POINTER P(:)       ! implicitly typed
  TARGET T           ! implicitly typed
END BLOCK
```
Sorry, I didn't use the right words, I wanted actually to say "What these notes essentially say is that undeclared variables are not local to a block"...
But OK, it appears that `data` is a declaration, so you're right.
Problem
Currently the Fortran standard requires that all declaration statements are in the declaration section of the program unit, for example:
and similar for other executable units.
Programs or procedures that are too long to fit on the programmer's screen can be more difficult to read and understand if variable references and their declarations can't be seen on the same page.
Proposal
Don't limit declaration statements to the declarative section of the code. However, a variable must not be referenced before it's declared (assuming `implicit none`).
The key premise is: Declare variables where you use them. You don't have to, but if you want, you can.
The outcome is improved code readability.
Example
Consider a long program. Normally, you put all declarations at the top:
This proposed feature would make the following program valid:
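The two listings referred to in this Example section did not survive in this copy. As a sketch only, assuming they matched the `a_long_program` example from the LFortran comment earlier in the thread, the conventional form with all declarations on top would look roughly like this. The `implicit none` line and the final self-check are additions for the illustration.

```fortran
program a_long_program
  implicit none
  ! All declarations sit in the specification part, before any executable statement.
  integer :: a
  integer :: b
  integer :: c

  a = 42
  ! many lines of code follow
  b = 2 * a
  ! many lines of code follow
  c = b**2
  print *, 'The result is ', c

  ! Self-check: (2 * 42)**2
  if (c /= 7056) error stop 'unexpected result'
end program a_long_program
```

The proposed form is the one shown in that LFortran comment, with each `integer ::` declaration moved next to the first assignment to its variable. It is not accepted by a standard-conforming compiler today.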