j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Namelist object designators may have blanks following commas #135

Open marshallward opened 4 years ago

marshallward commented 4 years ago

Currently, the following namelist is invalid:

&idx_nml
   v(1, 1) = 5
   v(2, 1) = 5
/

The reason is that the object designator (e.g. v(1, 1)) of a namelist may not contain any blanks (13.11.2p2, last sentence):

Each designator may be preceded and followed by one or more optional blanks but shall not contain embedded blanks.
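
As a minimal sketch of what that sentence forbids, the Python check below treats everything before the first = as the designator and reports whether a blank remains once leading and trailing blanks are removed. The function name is hypothetical and the check is illustrative only; it is not how any compiler implements the rule.

```python
def designator_has_embedded_blank(line: str) -> bool:
    """Return True if the text before '=' contains an embedded blank.

    Hypothetical helper: leading and trailing blanks around the
    designator are allowed, so they are stripped before the check.
    """
    designator = line.split("=", 1)[0].strip()
    return " " in designator


print(designator_has_embedded_blank("   v(1, 1) = 5"))  # True
print(designator_has_embedded_blank("   v(1,1) = 5"))   # False
```

Under the current wording, only the second form is conforming namelist input.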

There is no such restriction AFAIK in Fortran itself.

Despite this restriction, such spaces are permitted in every compiler that I have tested (gfortran, ifort, Cray, PGI). The exception is NAG Fortran, which is correctly adhering to the standard.

Some discussion here, for the interested.

I would suggest that the final sentence be removed from this paragraph of the standard, if only because it is inconsistent with Fortran source code, and because hardly anyone appears to be following this rule anyway without any apparent side-effects.

certik commented 4 years ago

@marshallward the link in your post is invalid.

marshallward commented 4 years ago

Weird, issues became issue. Anyway it is fixed now, thanks.

FortranFan commented 4 years ago

Another one for #106, a no-brainer in terms of a change which should get absorbed into the Fortran standard at fairly short notice rather than the 8+ years it's going to take in the current scheme of things.

sblionel commented 4 years ago

@FortranFan , your continued harping on past implementation delays is not helpful. It is also not accurate for today's compiler landscape, where the leading compilers are quickly catching up to the current standard, and standard development is moving faster.

I suspect that this restriction is in place to disallow something like a%b %c, but I agree that embedded blanks that would be allowed in free-form source should also be allowed here.

FortranFan commented 4 years ago

@sblionel wrote:

@FortranFan , your continued harping on past implementation delays is not helpful. It is also not accurate for today's compiler landscape, where the leading compilers are quickly catching up to the current standard, and standard development is moving faster. ..

@sblionel, what is "not helpful" and is entirely uncalled for here is a personal, singling out attempt at a censure of my impersonal sentence upthread with mischaracterization such as "harping".

Instead, what will be helpful here is to note the non-Fortrannic phrase in the standard for NAMELIST formatting, i.e., Section 13.11.2 Name-value subsequences, paragraph 2, lines 35 and 36: "Each designator may be preceded and followed by one or more optional blanks but shall not contain embedded blanks".

Perhaps this phrase is an oversight, yet no speedy review of it, let alone any redress, appears feasible. Fortran 202X is "closed", for instance.

My first thought when I read the original post was that the phrase "but shall not contain embedded blanks" appears entirely questionable for a language that not only allows blanks in many contexts (e.g., x % n = 42) but also requires all processors, including new ones, to continue to support fixed-form source with its "semantics" toward "embedded blanks". That the standard will permit something like the following:

      P R    O   G   R    A   M   B          L          A     N     K  S
      T   Y      P        E     F        O                             O
      I N T   E  G   E                                   R   X   Y     Z
      E      N          D         T             Y           P          E
      T           Y           P          E     B     A                 R
      T       Y        P         E    (    F      O        O )  A  B   C
      E          N        D          T       Y       P                 E
      T    Y        P    E    (   B      A  R     )                    A 
      A            %   A       B       C        %     X   Y   Z   = 4  2
      PRINT *, "A%ABC%XYZ = ", A%ABC%XYZ, "; Expected is 42"
      S           T    O                                               P
      END PROGRAM BLANKS

but yet strive to maintain as non-conforming a simple NAMELIST input of &dat a % b % c = 42/ is inconsistent, and it does a disservice to Fortran practitioners.

C:\Temp>type p.for

      P R    O   G   R    A   M   B          L          A     N     K  S
      T   Y      P        E     F        O                             O
      I N T   E  G   E                                   R   X   Y     Z
      E      N          D         T             Y           P          E
      T           Y           P          E     B     A                 R
      T       Y        P         E    (    F      O        O )  A  B   C
      E          N        D          T       Y       P                 E
      T    Y        P    E    (   B      A  R     )                    A
      A            %   A       B       C        %     X   Y   Z   = 4  2
      PRINT *, "A%ABC%XYZ = ", A%ABC%XYZ, "; Expected is 42"
      S           T    O                                               P
      END PROGRAM BLANKS

C:\Temp>ifort p.for
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.5.281 Build 20190815
Copyright (C) 1985-2019 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.24.28314.0
Copyright (C) Microsoft Corporation. All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\Temp>p.exe
A%ABC%XYZ = 42 ; Expected is 42

C:\Temp>gfortran p.for -o p-gnu.exe

C:\Temp>p-gnu.exe
A%ABC%XYZ = 42 ; Expected is 42

C:\Temp>

certik commented 4 years ago

@FortranFan please do not bring up the speed of adoption in technical issues. Your first comment:

Another one for #106, a no-brainer in terms of a change which should get absorbed into the Fortran standard at fairly short notice rather than the 8+ years it's going to take in the current scheme of things.

Could have been instead written as:

Another one for #106, a no-brainer in terms of a change which should get absorbed into the Fortran standard at fairly short notice.

If you have comments regarding Fortran standard adoption, please discuss this in separate issues, such as #106, #36 or create a new one, and just link them here. (Btw, I agree with you and I am as unhappy as you are regarding #106, #36, in fact I created those issues --- but we have to focus and not pollute every issue like this #135 with our unhappiness, because that will only get people annoyed, and not get us what we want.)

sblionel commented 4 years ago

I observe that this restriction has been there since Fortran 90.

marshallward commented 4 years ago

The example alluded to by @sblionel and expanded on by @FortranFan does fail on gfortran, e.g.:

&test
    x % a = 1
    x % b = 2
/

In this case, it aborts because it expects an = after the space-delimited variable, x. But I believe that x % a = 1 would be a valid Fortran statement. Other spacings like x %a = 1 fail because it rejects %a as a valid variable, again due to space-delimiters.

My guess is that v(1, 1) = 5 only works because the (, ) are acting as delimiters of a subsection of the designator. Something like x % a has no delimiters, and I can see how it may be a challenge (maybe impossible?) to parse correctly.
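
To illustrate that guess, here is a small Python model of a reader that ends the object name at the first blank or = found outside parentheses. It is a hypothetical sketch written to reproduce the observed behaviour; it is not taken from gfortran or any other runtime, and the function name is made up.

```python
def naive_split_name(text: str):
    """Hypothetical blank-delimited reader for one 'name = value' item.

    The name ends at the first blank or '=' that is not inside
    parentheses; the next non-blank character must then be '='.
    """
    text = text.lstrip()
    end = 0
    depth = 0
    while end < len(text):
        ch = text[end]
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif depth == 0 and ch in " =":
            break
        end += 1
    name = text[:end]
    rest = text[end:].lstrip()
    if not rest.startswith("="):
        raise ValueError(f"expected '=' after {name!r}, found {rest[:1]!r}")
    return name, rest[1:].strip()


print(naive_split_name("v(1, 1) = 5"))  # ('v(1, 1)', '5')
try:
    naive_split_name("x % a = 1")
except ValueError as err:
    print(err)  # the name stops at 'x', and '%' is not '='
```

In this model the blank inside v(1, 1) is harmless because the parentheses bracket it, while the blank after x ends the name early.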

I now see some purpose in the current wording, and I no longer think it's just a simple matter of relaxing the zero-blanks requirement of the designators.

FortranFan commented 4 years ago

@marshallward wrote:

.. I now see some purpose in the current wording, and I no longer think it's just a simple matter of relaxing the zero-blanks requirement of the designators.

Why is the percent sign not a good delimiter when equals sign is in NAMELIST formatting and so is exclamation mark for comments and beginning and ending ampersands for complex objects?

      p  r  o  g   r  a   m  b   l  a  n   k   s   g  a    l  o  r     e
      c h a r a c t e r(len=1), parameter :: N L = n e w _ l i n e( "" )
      c   o    m    p    l     e            x                          z
      c h a r a c t e r ( l e n = : ), a l l o c a t a b l e : : i n p
      n    a      m     e    l     i   s     t  /   d   a     t   /    z
      i   n   p    =   "  &dat   " // NL // "!       blanks galore " / /
     &  N  L / / "  z   =  (    1.0   ,    0.0      )  " // NL // " /  "
      p    r       i      n      t *  ,      i             n           p
      r  e   a   d   (  i   n    p  ,   n  m   l   =    d    a  t      )
      print *, "z = ", z, "; expected is (1.0,0.0)"
      e   n    d p  r  o  g   r  a   m  b  l a n  k  s  g  a  l  o  r  e

C:\Temp>type p.for

      p  r  o  g   r  a   m  b   l  a  n   k   s   g  a    l  o  r     e
      c h a r a c t e r(len=1), parameter :: N L = n e w _ l i n e( "" )
      c   o    m    p    l     e            x                          z
      c h a r a c t e r ( l e n = : ), a l l o c a t a b l e : : i n p
      n    a      m     e    l     i   s     t  /   d   a     t   /    z
      i   n   p    =   "  &dat   " // NL // "!       blanks galore " / /
     &  N  L / / "  z   =  (    1.0   ,    0.0      )  " // NL // " /  "
      p    r       i      n      t *  ,      i             n           p
      r  e   a   d   (  i   n    p  ,   n  m   l   =    d    a  t      )
      print *, "z = ", z, "; expected is (1.0,0.0)"
      e   n    d p  r  o  g   r  a   m  b  l a n  k  s  g  a  l  o  r  e

C:\Temp>gfortran -std=f2018 p.for -o p-gnu.exe

C:\Temp>p-gnu.exe
&dat
! blanks galore
z = ( 1.0 , 0.0 )
/
z = (1.00000000,0.00000000) ; expected is (1.0,0.0)

C:\Temp>

marshallward commented 4 years ago

@FortranFan You may be right, I also suspect that this ought to be unambiguously parseable. I'm only suggesting that the problem is more complicated than my original example suggests and probably needs more discussion.

There is also the problem that this form is rejected by more compilers, so I can't simply say "nearly everyone supports this already". My x % a = 1 example above fails on GCC, for example.

Fortran source has the luxury of having \n as a delimiter between statements and can usually treat spaces as null characters, whereas namelists must treat both as potential delimiters. So at the least one cannot expect all of the Fortran rules to carry over by default.

For example, these are valid namelists which would not be valid Fortran statements:

&test1
    x
    =
    1
/
&test2 x=1 y=2 /

sblionel commented 4 years ago

As someone who has actually implemented NAMELIST in a compiler's (VAX FORTRAN) I/O library, I can say that the newlines aren't a real problem. Unlike source, list-directed and NAMELIST reads are perfectly capable of spanning record boundaries, as long as the input item is requesting more data.

The standard says, for free-form source, that a "token" cannot have embedded blanks. In the case of x%a, each of x, % and a are tokens, and the input processor is perfectly capable of handling this in an unambiguous fashion, if so implemented, even split across record boundaries. Once an identifier is seen (x here), it has to keep looking until it sees a =.

As I noted earlier, I think it would be adequate to say that tokens in the designator may not have embedded blanks, adopting the language the standard has for free-form source:

In free source form blank characters shall not appear within lexical tokens other than in a character context or in a format specification. Blanks may be inserted freely between tokens to improve readability; for example, blanks may occur between the tokens that form a complex literal constant. A sequence of blank characters outside of a character context is equivalent to a single blank character.
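
The token-based reading described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the VAX FORTRAN implementation or any real runtime: the token set (names, unsigned integers, and the characters % ( ) , : =) and the function name are hypothetical, and signed subscripts or substring ranges with expressions are not handled.

```python
import re

# One token, optionally preceded by blanks or record boundaries (newlines).
_TOKEN = re.compile(r"\s*([A-Za-z][A-Za-z0-9_]*|[0-9]+|[%(),:=])")


def read_designator(text: str):
    """Collect tokens until '=' and return (designator, remaining text).

    Blanks between tokens are skipped; two adjacent name/number tokens
    are rejected, since that would mean a blank inside a token.
    """
    pos, tokens = 0, []
    while True:
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize input at offset {pos}")
        token, pos = match.group(1), match.end()
        if token == "=":
            return "".join(tokens), text[pos:].strip()
        if tokens and tokens[-1][0].isalnum() and token[0].isalnum():
            raise ValueError(f"adjacent name or number tokens near {token!r}")
        tokens.append(token)


print(read_designator("x % a = 1"))    # ('x%a', '1')
print(read_designator("v(1, 1) = 5"))  # ('v(1,1)', '5')
print(read_designator("x\n=\n1"))      # ('x', '1')
```

Because the reader keeps consuming tokens until it sees =, the designator may be split by blanks or record boundaries, while something like a b = 1 is still rejected.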

marshallward commented 4 years ago

For what it's worth, my own parser also seems to have no trouble with interior whitespace:

>>> import f90nml
>>> print(f90nml.reads("&test x % a = 1 x % b = 2 /"))
&test
    x%a = 1
    x%b = 2
/

so I am also inclined to agree that whitespace between any tokens should be permitted.

klausler commented 4 years ago

Why not ignore whitespace entirely outside character literals? Whitespace doesn't seem necessary for disambiguating tokens.
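
A minimal Python sketch of that idea, applied to designator text only: it drops every whitespace character that is not inside a quoted character literal. The function name is hypothetical. The sketch deliberately leaves value lists out of scope, because namelist input also uses blanks as value separators, so stripping them there would merge adjacent values.

```python
def strip_blanks_outside_literals(text: str) -> str:
    """Remove whitespace that is not inside a '...' or "..." literal."""
    out = []
    quote = None  # the quote character of the literal we are inside, if any
    for ch in text:
        if quote is None:
            if ch in "'\"":
                quote = ch
                out.append(ch)
            elif not ch.isspace():
                out.append(ch)
        else:
            out.append(ch)
            if ch == quote:
                quote = None
    return "".join(out)


print(strip_blanks_outside_literals("x % a ( 1 , 2 )"))  # x%a(1,2)
print(strip_blanks_outside_literals("name = 'a b'"))     # name='a b'
```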

FortranFan commented 4 years ago

Why not ignore whitespace entirely outside character literals? Whitespace doesn't seem necessary for disambiguating tokens.

It'll be very useful, along the lines of the need expressed in the original post here, if the Fortran standard can extend better support for blanks in NAMELIST formatting, at least as per how the standard treats them in free-form source, cf. 6.3.2.2 Blank characters in free form, paragraph 1, lines 17 and 18, "In free source form blank characters shall not appear within lexical tokens other than in a character context or in a format specification. Blanks may be inserted freely between tokens to improve readability." I have a really hard time believing that such an improved facility would be at all difficult for a present-day parser or processor.

Taking it to the level of "whitespace", a term currently unknown to the Fortran standard and which some may understand as implying more than blanks (perhaps tab characters, newline, carriage return, form feed, etc.), may be "nice to have". I don't know yet, so I'll defer on that.

But I'll be happy if last sentence in paragraph 2 of 13.11.2 Name-value subsequences under 13.11 Namelist formatting were to read: