j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Implicit line continuation #130

Open marshallward opened 4 years ago

marshallward commented 4 years ago

I don't know if this is possible within the context of the compiler, or if it could potentially break huge amounts of existing code, but it would be nice to introduce some kind of implicit line continuation.

For example, the function declaration below has a &, but the parser ought to be able to infer from the open ( to keep looking for ) on the next line.

subroutine parse_segment_data_str(segment_str, var, value, filenam, &
                                  fieldnam, fields, num_fields, debug)

Arrays could also be implicit

x = [
    1.1,
    2.2,
    3.3
]

Has this been proposed in the past? Are there any inherent reasons why a Fortran parser cannot detect these cases?
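As a rough illustration of the idea in this comment, here is a minimal Python sketch of a pre-pass that keeps reading physical lines while a `(` or `[` is still open. The function name `join_bracketed_lines` is hypothetical, and the sketch deliberately ignores character literals, comments, and every other Fortran rule, so it shows the inference only, not a workable parser.

```python
def join_bracketed_lines(source: str) -> list[str]:
    """Merge physical lines while a '(' or '[' remains unclosed.

    Assumption: brackets never occur inside strings or comments.
    """
    statements = []
    pending = []  # physical lines of the statement being assembled
    depth = 0     # number of currently unclosed '(' or '['
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        pending.append(stripped)
        for ch in stripped:
            if ch in "([":
                depth += 1
            elif ch in ")]":
                depth = max(depth - 1, 0)
        if depth == 0:
            # All brackets are balanced, so end-of-line ends the statement.
            statements.append(" ".join(pending))
            pending = []
    if pending:
        # Input ended with a bracket still open; emit what was collected.
        statements.append(" ".join(pending))
    return statements


example = """x = [
    1.1,
    2.2,
    3.3
]
y = 2
"""
print(join_bracketed_lines(example))
```

On the array example above this yields the two logical statements `x = [ 1.1, 2.2, 3.3 ]` and `y = 2`.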

certik commented 4 years ago

From a user perspective, I would like this a lot. It's quite annoying always having to put & at the end. I like the way Python does it --- almost always the parser can infer this automatically, only when it can't, you have to use \ as the line continuation character.

klausler commented 4 years ago

BEGIN =
  END

urbanjost commented 4 years ago

I would add a line ending in a comma to the list of implied continuation indicators (in addition to array constructors and unmatched parentheses). In a sample of 140 000 lines of free-format Fortran code containing 6 868 continuation lines, 88% of the continuation lines were eliminated. As an alternative, lines beginning with an open parenthesis would be continued until a matching close parenthesis.
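A survey like the one described could be approximated with a short script. The following is only a sketch under stated assumptions, not the tool that produced the figures above: `count_continuations` is a hypothetical name, comments are stripped naively at the first `!`, and brackets inside character literals are counted as if they were real.

```python
def count_continuations(source: str) -> tuple[int, int]:
    """Return (total, removable) counts of '&'-continued lines.

    A continued line is counted as removable when, with its '&' removed,
    it ends in a comma or leaves a '(' or '[' unclosed.
    """
    total = removable = 0
    depth = 0  # unclosed '(' or '[' within the current statement
    for line in source.splitlines():
        code = line.split("!")[0].strip()  # naive: ignores '!' inside strings
        continued = code.endswith("&")
        code = code.strip("&").strip()     # drop leading and trailing '&'
        depth += code.count("(") + code.count("[")
        depth -= code.count(")") + code.count("]")
        if continued:
            total += 1
            if code.endswith(",") or depth > 0:
                removable += 1
        else:
            depth = 0  # the statement ended on this line
    return total, removable


sample = """write(*,*) left, &
   right, &
   top, &
   down
call sub(one+two+ &
   three)
x = a + &
   b
"""
print(count_continuations(sample))
```

In this sample, five lines are continued and four of them fall under the comma or open-bracket rule; the `x = a + &` line does not.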

The code looks cleaner and is easier to reformat with an editor. That is, instead of

write(*,*) left, &
   right, &
   top, &
   down

array=[ 10, 20, 30, &
        40, 50, 60, &
        70, 80, 90 ]

call sub(one+two+ &
   three)

write(*,*)'VALUES ARE NOW ', &
   &'x        ',x,&
   &'y        ',y,&
   &'z        ',z,&
   &'point    ',point,&
   &'title    ',title,&
   &'help     ',help,'h ',h,&
   &'version  ',version,'v ',v,&
   &'l        ',l,&
   &'l_       ',l_

DATA (C(I),I=1,19)/              -0.73804295108687506715D-01, &
   & 0.11366785079620443739D+02, -0.65838973034256501712D+02, &
   & 0.14119145750221817396D+03, -0.15929975325701922684D+03, &
   & 0.11122328958866232246D+03, -0.52866443153661476803D+02, &
   & 0.18223597971689250243D+02, -0.47661469297599122637D+01, &
   & 0.97840283604837466112D+00, -0.16191400580768858112D+00, &
   & 0.2212712874183229440D-01,  -0.2606907391286968320D-02, &
   & 0.316831265267384320D-03,   -0.6102072906743808D-04, &
   & 0.1658373309202432D-04,     -0.3439710458347520D-05, &
   & 0.338099825541120D-06,      -0.343597383680D-09/

One could write

write(*,*) left,
   right,
   top,
   down

array=[ 10, 20, 30,
        40, 50, 60,
        70, 80, 90 ]

call sub(one+two+
   three)

   write(*,*)'VALUES ARE NOW ', 
    'x        ',x,              
    'y        ',y,              
    'z        ',z,              
    'point    ',point,          
    'title    ',title,          
    'help     ',help,'h ',h,    
    'version  ',version,'v ',v, 
    'l        ',l,              
    'l_       ',l_

DATA (C(I),I=1,19)/              -0.73804295108687506715D-01,         
     0.11366785079620443739D+02, -0.65838973034256501712D+02,    
     0.14119145750221817396D+03, -0.15929975325701922684D+03,   
     0.11122328958866232246D+03, -0.52866443153661476803D+02,  
     0.18223597971689250243D+02, -0.47661469297599122637D+01, 
     0.97840283604837466112D+00, -0.16191400580768858112D+00,
     0.2212712874183229440D-01,  -0.2606907391286968320D-02,       
     0.316831265267384320D-03,   -0.6102072906743808D-04,         
     0.1658373309202432D-04,     -0.3439710458347520D-05,        
     0.338099825541120D-06,      -0.343597383680D-09/

marshallward commented 4 years ago

I would guess that cases like this would be difficult to parse

write(*,*) left,
right,
top,
down

since end-of-line is still an important token in Fortran parsing. The comma may not be sufficient to infer line continuation, I am not sure. (That would not have worked in Python, due to how it handles tuples, but Fortran has no such burden.)

Also, I think I finally understand @klausler's comment (thanks @septcolor for clarifying). I agree that many such statements would be unparseable without line continuation tokens.

To clarify, I am only thinking of cases where there is an explicit subtoken like (...) or [...] providing a nested statement.

certik commented 4 years ago

I can see how this might be hard (impossible?) to do with the current semi-reserved keywords in Fortran. I opened issue #167 to discuss that. Unfortunately for the above proposal, it's probably best if the standard itself keeps the semi-reserved keywords, so the above proposal probably cannot be standardized.

klausler commented 4 years ago

Fortran doesn't have reserved keywords, semi- or otherwise. But it does have special rules for things that look like END statements.

Actual implementations of Fortran need to be able to have some separation of concerns for all of the tasks that have to occur during and before parsing. Some compilers (esp. ones built with parser generators) need to have a tokenization module acquiring tokens from the source and passing them on to the parser proper, and the state of the parse needs to affect the tokenization rules so that things like MODULEPROCEDUREFOO get dealt with correctly. Other compilers (recursive descent with backtracking) normalize the source, dealing with line continuation and INCLUDE and preprocessing along the way, without knowledge of the partial parse.

These proposals that mess with line continuation rules would be hard to implement in both of these kinds of compilers -- those with tokenization steps that handle line continuation will need more state from the parser to handle the implicit continuation, and those that normalize the source stream before beginning any parsing will have to track things like parenthesis/bracket nesting and dangling = signs while making line continuation decisions. All of this would make error recovery harder, too, and that's the last thing you want with Fortran.
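To make the bookkeeping concrete, here is a minimal Python sketch of the nesting state a source-normalizing pass would need to carry from line to line. The function `open_bracket_depth` is hypothetical and not taken from any compiler. It skips brackets inside character literals and after a `!` comment marker, and it assumes a literal never spans a line break, which real free-form Fortran does allow.

```python
def open_bracket_depth(line: str, depth: int = 0) -> int:
    """Return the bracket depth after scanning one physical line.

    `depth` is the depth carried over from the previous line. Brackets
    inside '...' or "..." literals and after '!' are not counted.
    """
    quote = None  # the active quote character while inside a literal
    for ch in line:
        if quote:
            # A doubled quote ('' inside '...') closes and immediately
            # reopens the literal, which gives the right result here.
            if ch == quote:
                quote = None
            continue
        if ch in "'\"":
            quote = ch
        elif ch == "!":
            break  # rest of the line is a comment
        elif ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
    return depth


depth = open_bracket_depth("call sub(a, 'it''s (', ")
print(depth)                             # one '(' still open
print(open_bracket_depth("b)", depth))   # closed on the next line
```

Even this small scanner has to know about literals and comments before it can decide whether a line continues, which is the kind of extra state described above.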

marshallward commented 4 years ago

Thanks @klausler, I figured there would be restrictions on what might be possible. I can appreciate that it may be extremely problematic to implement. Do you see any particular challenges with, say, implicit line continuation for arrays?

Also I'd be keen to know if there are any other examples where line continuation would be either impossible to resolve, or even just too problematic or computationally expensive.

BTW I do not plan or ever expect to see this proposed, I am more interested in whether it's possible or what could be done here. If you are busy, then don't let this take up any more time!

certik commented 2 years ago

Also proposed at https://fortran-lang.discourse.group/t/make-line-continuation-operator-optional/2176.

8bitmachine commented 2 years ago

I would certainly like to have implicit line continuation. The & & thing is quite clunky.

waynelapierre commented 8 months ago

any updates?

certik commented 8 months ago

It's waiting for somebody to create a compiler prototype and then write up a proposal with the exact rules.

feenberg commented 7 months ago

Back in the day, Ratfor (an excellent Fortran preprocessor by Brian Kernighan) continued any line ending with an operator or a comma (except that a slash didn't continue a data statement). It is an attractive and simple-to-parse method of indicating continuation. Ratfor did not count parens for this purpose; as was pointed out, this could be difficult for some compilers.
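The rule described here needs no nesting state at all, only the last character of each line. Below is a minimal Python sketch of that style of rule. The set of continuing characters is an illustrative assumption rather than Ratfor's actual list, and `join_on_trailing_operator` is a hypothetical name. The slash is left out of the set, in line with the note above about data statements.

```python
# Illustrative set only; not claimed to match Ratfor's real rule.
CONTINUATION_ENDINGS = (",", "+", "-", "*", "=")


def join_on_trailing_operator(source: str) -> list[str]:
    """Join a line to the next one when it ends with a continuing character."""
    statements = []
    pending = []  # physical lines of the statement being assembled
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        pending.append(stripped)
        if not stripped.endswith(CONTINUATION_ENDINGS):
            statements.append(" ".join(pending))
            pending = []
    if pending:
        # Input ended on a continuing character; emit what was collected.
        statements.append(" ".join(pending))
    return statements


example = """write(*,*) left,
   right,
   top,
   down
y = a +
   b
"""
print(join_on_trailing_operator(example))
```

One visible cost of a rule this simple is false positives: a statement that legitimately ends in one of these characters, such as a bare `print *`, would be joined to the following line.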

smoothdeveloper commented 5 months ago

I see Python mentioned as having this. As an F# user who also does a bit of Python, I can say F# (IMO) does it better: it never requires a continuation character, while Python still requires one at times.

F# may be finicky if you don't indent your code properly (a block is detected if it is indented more than 1 character, and there are some relaxation cases), like Python.

If indentation awareness would help resolve ambiguous cases, I'd recommend looking at how F# does it, which is well explained in section 15.1 of the spec.

oxcrow commented 1 month ago

@certik I slightly disagree with this proposal, and think there is a better solution.

I wrote about it in Fortran's discourse group: https://fortran-lang.discourse.group/t/line-continuation-without-implicit-assumptions/8450