camfort / fortran-src

Fortran parsing and static analysis infrastructure
https://hackage.haskell.org/package/fortran-src
Other
48 stars 20 forks source link

Copying prior to pass-by-reference triggered by parentheses, which we ignore. #203

Open dorchard opened 2 years ago

dorchard commented 2 years ago

This Tweet alerted me to something in Fortran that I was not aware of: https://twitter.com/fortrantip/status/1479071485859962880

If an argument is passed to a subroutine/function that is a variable wrapped in parentheses, then a copy of the value is taken before applying pass-by-reference:

        module m
          implicit none
          contains
            subroutine double(i, i2)
              integer :: i
              integer :: i2
              intent(in) :: i
              intent(out) :: i2
              print *, "incoming double, i = ", i
              i2 = 2*i
              print *, "outgoing double, i = ", i
            end subroutine double
        end module m

        program copy
          use m
          implicit none
          integer :: j, z
          j=3
          call double(j, j)
          print *, "(1b) in main, j = ", j

          call double((j), j)
          print *, "(2) in main, j = ", j

        end program copy

We have a problem with this as our parser strips away extra parentheses: https://github.com/camfort/fortran-src/blob/master/src/Language/Fortran/Parser/Fortran90.y#L1027 However, this appears to be valid Fortran 90 code. I haven't been able to pin down when this was introduced into the language, but we should certainly do something about this!

raehik commented 2 years ago

Interesting. I certainly have some misconceptions about binders to variables, I'm fuzzy on rules around temporaries and dummys (are they the same?).

I don't like it, but I can't think of anything other than an ExpBracket Expression constructor for parsing. A post-parse pass would remove them, and annotate some constructors (or change them to a different one) where required. More concretely, we could do:

{-# LANGUAGE DataKinds #-}
data Expression = ExpValue (p :: Polarity) | ExpBracket Expression | ... -- only ExpValue has polarity
data Polarity = DirectValue | IndirectExpression

-- parser always parses @ExpValue 'DirectValue@, and parses brackets explicitly as @ExpBracket e@

collapseBrackets :: AST -> AST
-- * rewrite @ExpBracket (e :: ExpValue 'DirectValue)@ to @ExpValue 'IndirectExpression@
-- * remove all other @ExpBracket@s -- so none in AST after pass

(I believe using a lifted data type means we don't have to rewrite all pattern matches and so forth, like if we were to add a new constructor or a new field to an existing one.) This method is ready to handle other similar misses cleanly, but in exchange it's clunky/expensive. Maybe an alternative would be to annotate ExpValues with polarity (crap name but yknow, binary opposites) as above, but hope to handle it all during parsing:

EXPRESSION :: { Expression A0 }
...
| '(' VALUE ')' { setSpan (getTransSpan $1 $3) (ExpValue ($2 :: Value 'IndirectExpression)) }
| '(' EXPRESSION ')' { setSpan (getTransSpan $1 $3) $2 }
...

VALUE :: { Value p A0 }
...

I'm not sure if we can do that in the parser.

Maybe this is easier to consider from the opposite perspective of pretty printing. Since similarly, after parsing call double(j, j) or call double((j), j), we need to make sure it goes to a value we can inspect to produce the original accurately. I think the polarity idea works fine there. I suppose we'll chat about it on Monday!

raehik commented 2 years ago

We discussed on Monday and decided we should handle the function call usage only. So the changes are scoped to function calls (mainly StCall). See #204 for more commentary.

raehik commented 2 years ago

204 (merged) has the AST change. #206 reflects it in the flow/basic block analysis.