Open jwallwork23 opened 1 year ago
Hi @arporter. I've been trying to figure out how to implement this but I'm a bit lost to be honest.
I'm still happy to work on this but please could you provide some pointers? Many thanks!
Hi @jwallwork23. The first step is to ensure the parser keeps all comments and that PSyclone copes with the resulting parse tree. I made a start on this ages ago: https://github.com/stfc/PSyclone/compare/master...1247_allow_comments I'm just in the process now of investigating what state that branch is in...
...145 tests fail but I think this is largely because the PSyIR tree in many cases is no longer what we expect (has extra CodeBlocks) and we've been a bit lazy in some of our tests (hardcoding the location of the expected node rather than using walk
to search for it). A first step would therefore be to fix the test failures and also to implement support for some switch to allow the user to turn on/off the keeping of comments. (By default we'd keep it switched off I think.) This could be a command-line flag along the lines of --keep-comments
. The PSyIR itself does have some functionality for associating comments with nodes but that link up is not something to worry about for now.
Once we have the ability to handle Comments (as CodeBlocks), the next step would be to implement an fparser2 handler for comments that then checks whether they are in fact directives. If they are, it should construct the correct type of Directive
node, if not they could be thrown away or kept as CodeBlocks.
I'll bring in @hiker, @sergisiso and @rupertford as they may have thoughts on this too - it's potentially quite a big change.
Oh okay great, thanks. I'm happy to work on fixing tests if you don't mind me pushing to your branch?
That's fine and would be much appreciated :-)
That's fine and would be much appreciated :-)
No problem. One down!
My only suggestion would be that --keep-directives
would probably be better than --keep-comments
as that is what we're trying to do here. Whether we also keep comments or not could be (is?) a separate thing I think.
My only suggestion would be that
--keep-directives
would probably be better than--keep-comments
as that is what we're trying to do here. Whether we also keep comments or not could be (is?) a separate thing I think.
Oh yes, I hadn't thought about that but I agree: they are separate things.
I also agree with the plan @arporter
I agree with the distinction between comments and directives: directives could be added easily to the Fortran grammar (as statements), since they cannot be put e.g. in the middle of another statement. I would actually suggest to only support directives (maybe we could specify rules for which kind of directives are parsed and converted, thinking of e.g. compiler directives to indicate can-be-vectorised, which we might not want??)
One quirk of this branch is that it adds an additional blank line when formatting. For example, psyir/transformations/reference2arrayrange_trans_test.py::test_range
converts
program test
real, dimension(10) :: a
real :: b
a(2:4) = b
end program test
into
program test
real, dimension(10) :: a
real :: b
a(2:4) = b
end program test
rather than
program test
real, dimension(10) :: a
real :: b
a(2:4) = b
end program test
Is there a quick fix for this? I think such a fix would make a lot of the test failures go away.
That's probably an fparser issue. I can take a look.
Well, it's not really fparser - fparser is just interpreting a blank line as a Comment (presumably to preserve formatting). The issue is that some of the tests accidentally add in an additional newline to the source code to be processed.
It might be easiest to actually start adding a Comment handler in psyir/frontend/fparser2.py
and have that just throw-away empty comments but put anything else in a CodeBlock (which is achieved by having the handler raise a NotImplementedError
).
To do that, you need to add a handler to the map:
https://github.com/stfc/PSyclone/blob/53312078fbf7f1f07a8a4b501991c9a1a87dc784/src/psyclone/psyir/frontend/fparser2.py#L1063-L1067
and then implement the handler itself. As you might expect _return_handler
is a very simple example:
https://github.com/stfc/PSyclone/blob/53312078fbf7f1f07a8a4b501991c9a1a87dc784/src/psyclone/psyir/frontend/fparser2.py#L3803-L3818
Thanks @arporter. My first attempt is in 50342ae. However, it fails the basic test I added in 6e39754. Did I misunderstand something?
Thanks for the assistance @arporter. My test_ignore_blank_line
passes, but test_basic_comment
fails. It seems the comment ! hi
doesn't get picked up and turned into a CodeBlock
. Is it because _comment_handler
is only getting called at the top level?
I'm not sure. I've tried breaking out into the debugger in the comment handler (import pdb; pdb.set_trace()
) and that reveals that the handler is never being called! I haven't figured out why not yet...
I think the issue is in _subroutine_handler
:
https://github.com/stfc/PSyclone/blob/d87b76feb33a48369ea06be319493d8f64f1adb7/src/psyclone/psyir/frontend/fparser2.py#L4472-L4486
It's currently setup to skip over Comments. That needs removing. Also, the comment at 4472 should actually be at 4480.
Actually, it may not be that simple. Possibly we should search for Dummy_Arg_List
explicitly (using a _first_type_match
perhaps)?
If you change your test to use a Program instead of a Subroutine then it should work.
Thanks, used _first_type_match
in 303c6589e. My test is currently using a program, but still doesn't pass with this change I'm afraid.
Nothing's ever easy is it? I've taken a look and I see:
> /home/me/Projects/PSyclone/src/psyclone/psyir/frontend/fparser2.py(4706)_program_handler()
-> self.process_nodes(file_container, node.children)
(Pdb) node.children
[Main_Program(Program_Stmt('PROGRAM', Name('test_comment')), Specification_Part(Implicit_Part(Comment('! hi'))),
End_Program_Stmt('PROGRAM', Name('test_comment')))]
So, in this particular case, the Comment node happens to be inside the Implicit_Part
of the parse tree. I think this means that we'll have to allow for Comments in process_declarations
. However, for now, if you make sure your comment comes after an executable statement then hopefully it will work. e.g.:
'program test_comment\n'
'integer :: step\n'
'step = 10\n'
'! hi\n'
'end program test_comment'
works for me.
Ah nice, thanks so much @arporter! That works for me too and now I can get back on with fixing the other tests :)
@arporter @sergisiso This would be useful for NG-ARCH work for UKCA at least - I'll try to read through this issue at some point and see where we are
https://github.com/stfc/PSyclone/tree/1247_allow_comments - Current branch we can look at that Joe worked on.
We were talking about it this morning with @hbrunie which is also planning to look at his. I think this should be split it in 2 different steps:
First to enable "fparser with comments" as the input parse_tree that the psyclone frontend reads. This will keep the comment information in the PSyIR and allow later to start interpreting them in case they are directive.
The second step would be interpreting them (@hbrunie is interested in POSIDON, and this particular issue in ACC/OMP directives)
The linked branch has only commits behind master (not ahead). @arporter @jwallwork23 Is there another branch with this work? I see some commit reference above that point to a fork?
From a quick look at this issue, it seems like the hard bit of the first step is where fparser puts the comments (relative to where they might belong in the tree if they result in Statement
nodes)? (and maybe what happens if we just swap to ignore_comments=False
).
This ties in to making the second part hard as interpreting them and putting them in the correct part of the PSyIR tree is important and seemingly nontrivial.
Yes, this is the problem we were discussing this morning. Many parts of the fortran_reader look for hardcoded child positions, adding comments can easily break these assumptions. Maybe the first step can also be subdivided into:
I'm not sure if it would make sense or not (if it breaks more things) but could we modify fparser to always have a comments child (if appropriate) which could be empty? I don't really know how fparser fits together so this may be a bad modification.
I'm not sure if it would make sense or not (if it breaks more things) but could we modify fparser to always have a comments child (if appropriate) which could be empty? I don't really know how fparser fits together so this may be a bad modification.
I'm not sure this would help us very much (but I may not be understanding). I think the thing to do is to grab the branch and see what's broken. I could take a look if you like in order to get the ball rolling at least?
I was wondering as well - if the first step is essentially a full rewrite of fparser reader to work with comments (and generally more resilient) are we best to make that many PRs broken down by section? It feels like fparser2.py is moving quite regularly at the moment and it might be best to do piecewise to avoid a massive PR and constant merging issues?
Though maybe once routine symbol is finished fparser2.py might be stable for a while?
The linked branch has only commits behind master (not ahead). @arporter @jwallwork23 Is there another branch with this work? I see some commit reference above that point to a fork?
Yes, I think @jwallwork23 had taken a fork. I've checked my local versions of the repo and don't seem to have a record of it. Is it available somewhere Joe?
Unfortunately, I now recall that my attempts only raised more errors and warnings so I ended up resetting the branch with an intention to start again which I never got chance to do. Apologies about this.
There are several branchless commits referenced above that capture some of what I was trying to achieve (e.g., d80ee25272dee5c3edc3f020e7caa848803c4d5a) but PSyclone has likely moved on sufficiently far that it would be simpler to start from scratch than try to use these patches.
In fact, a first step would be simply to put such directives into CodeBlocks.
Originally posted by @arporter in https://github.com/stfc/PSyclone/issues/2275#issuecomment-1680694842