BerkeleyLab / flang-testing-project

The Berkeley Lab Flang team develops tests for the LLVM-Project Flang Fortran compiler. Because of the paramount importance of parallelism in high-performance computing, we are focusing on Fortran’s parallel features, commonly denoted "Coarray Fortran."
https://go.lbl.gov/flang-testing

Ensure copy-in copy-out strategy for coarray arguments #85

Open everythingfunctional opened 1 year ago

everythingfunctional commented 1 year ago

I believe it would be a valid strategy to implement coindexed procedure actual arguments as copy-in/copy-out. This is because it would not be valid for a program to do anything that could observe that the data hasn't been put where it belongs yet. We need to carefully validate that this is the case.
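To make the proposal concrete, here is a minimal sketch of what copy-in/copy-out would mean for a coindexed actual argument. The names `add_one`, `a`, `b`, and `tmp` are made up for illustration, and the program assumes a coarray-capable compiler (for example gfortran with `-fcoarray=single`). It is not how any particular compiler lowers the call, only the hand-written equivalent.

```fortran
program copy_in_copy_out_sketch
  implicit none
  integer :: a[*], b[*]   ! scalar coarrays: one copy per image
  integer :: tmp, n

  a = 10
  b = 10
  sync all                ! every image has initialized a and b

  if (this_image() == 1) then
    n = num_images()      ! target image (assumption: any valid image index would do)

    ! Form 1: pass the coindexed object directly as the actual argument.
    call add_one(a[n])

    ! Form 2: the copy-in/copy-out a processor might generate instead.
    tmp = b[n]            ! copy-in: fetch the remote value
    call add_one(tmp)     ! the procedure only ever sees local data
    b[n] = tmp            ! copy-out: store the result back on image n

    if (a[n] /= b[n]) error stop "the two forms disagree"
    print *, "a[n] =", a[n], " b[n] =", b[n]
  end if
  sync all                ! image n may touch a and b again only after this

contains

  subroutine add_one(x)
    integer, intent(inout) :: x
    x = x + 1
  end subroutine add_one

end program copy_in_copy_out_sketch
```

If the strategy is valid, both forms should leave the same value (11) on image `n`, and no conforming program should be able to tell them apart.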

everythingfunctional commented 1 year ago

I went through the 2023 draft standard and found all the places that discuss argument association. Here are my notes from the relevant sections:

As far as I can tell, there is nothing that would prevent us from requiring copy-in/copy-out semantics for coindexed actual arguments.

everythingfunctional commented 1 year ago

In response to our discussions about the interfaces for get/put for coarray access, specifically as it relates to coarrays with allocatable components, consider something like the following, which might be a valid reference.

type :: t1
  integer, allocatable :: vals(:)
end type
type :: t2
  type(t1), allocatable :: vals(:)
end type
type(t2), allocatable :: stuff(:)[:]

...
stuff(1:5:2)[n]%vals(1:7:3)%vals(3:1:-1)

I need to read more closely to see whether this is actually allowed, but if it is, it could pose a problem.

everythingfunctional commented 1 year ago

Looking into it a bit more, it seems the complications are at least limited to a degree:

There shall not be more than one part-ref with nonzero rank. A part-name to the right of a part-ref with nonzero rank shall not have the ALLOCATABLE or POINTER attribute.

So the most complicated it could get is something like

type :: t1
  integer :: vals(10)
end type
type :: t2
  type(t1) :: vals(15)
end type
type(t2), allocatable :: stuff(:)[:]

...
stuff(1:5:2)[n]%vals(2)%vals(3)

But I think that's still going to be well defined in terms of memory layout so the descriptor and coarray "handle" should still be sufficient.
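As a rough check of the "well defined memory layout" claim, the sketch below computes the byte offset and stride that a strided get for `stuff(1:5:2)[n]%vals(2)%vals(3)` would need. It uses a local variable `elem` of the same type purely to measure sizes, and it assumes that `storage_size` of an element equals the spacing between consecutive array elements. The variable names are hypothetical.

```fortran
program constant_stride_sketch
  implicit none
  type :: t1
    integer :: vals(10)
  end type
  type :: t2
    type(t1) :: vals(15)
  end type
  type(t2) :: elem        ! local stand-in, used only to measure sizes
  integer :: int_bytes, t1_bytes, t2_bytes
  integer :: offset, stride

  ! storage_size returns the element size in bits.
  int_bytes = storage_size(elem%vals(1)%vals(1)) / 8
  t1_bytes  = storage_size(elem%vals(1)) / 8
  t2_bytes  = storage_size(elem) / 8

  ! Byte offset of %vals(2)%vals(3) from the start of one t2 element.
  offset = (2 - 1) * t1_bytes + (3 - 1) * int_bytes

  ! stuff(1:5:2) steps over two t2 elements at a time.
  stride = 2 * t2_bytes

  print *, "offset within element (bytes):", offset
  print *, "stride between selected elements (bytes):", stride
end program constant_stride_sketch
```

Both numbers are known on the calling image without asking image `n` for anything, which is why a descriptor plus the coarray handle should be enough here. With 4-byte default integers and no padding, they would come out as 48 and 1200.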

everythingfunctional commented 1 year ago

I think this one still requires an extra operation to resolve the data location on the remote image though.

type :: t1
  integer :: vals(10)
end type
type :: t2
  type(t1), allocatable :: vals(:)
end type
type(t2), allocatable :: stuff[:]

...
stuff[n]%vals(2:10:2)%vals(3)
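The sketch below illustrates why the extra step seems necessary: each image allocates the component with a different size, so the calling image cannot know where the data lives on image `n` until it consults image `n`. It assumes a compiler that supports coindexed access through allocatable components; the variable `got` and the fill pattern are made up for illustration.

```fortran
program remote_allocatable_sketch
  implicit none
  type :: t1
    integer :: vals(10)
  end type
  type :: t2
    type(t1), allocatable :: vals(:)
  end type
  type(t2), allocatable :: stuff[:]
  integer :: got(5)
  integer :: i, n

  allocate(stuff[*])                       ! collective: every image allocates the coarray
  allocate(stuff%vals(10 + this_image()))  ! local: size and address may differ per image

  ! Fill each element with a value that encodes the image and the index.
  do i = 1, size(stuff%vals)
    stuff%vals(i)%vals = 100 * this_image() + i
  end do
  sync all                                 ! all components are allocated and defined

  n = num_images()
  ! The base address of %vals is stored on image n, so the runtime must
  ! resolve it there before it can fetch the strided data.
  got = stuff[n]%vals(2:10:2)%vals(3)

  if (any(got /= 100 * n + [2, 4, 6, 8, 10])) error stop "unexpected remote values"
  print *, "image", this_image(), "read", got
  sync all                                 ! keep stuff alive until every image has read
end program remote_allocatable_sketch
```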

everythingfunctional commented 1 year ago

Just noticed this Note in the standard:

If the actual argument is a coindexed object, a processor that uses distributed memory might create a copy on the executing image of the actual argument, including copies of any allocated allocatable subobjects, and associate the dummy argument with that copy. If necessary, on return from the procedure, the value of the copy would be copied back to the actual argument.

everythingfunctional commented 1 year ago

Found another relevant statement in Section 11.6.2 Segments:

if a procedure invocation on image P is in execution in segments P_i, P_{i+1}, ..., P_k and defines a noncoarray dummy argument, the effective argument shall not be referenced, defined, or become undefined on another image Q in a segment Q_j unless Q_j precedes P_i or succeeds P_k.
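A small sketch of what this rule requires of user code: the image that owns the effective argument may only touch it in segments ordered before or after the invocation, which here is arranged with `sync all`. The names `counter` and `increment` are hypothetical, and the program assumes a coarray-capable compiler.

```fortran
program segment_ordering_sketch
  implicit none
  integer :: counter[*]
  integer :: n

  counter = 0
  sync all      ! ends the segment in which each image defined its own counter

  n = num_images()
  if (this_image() == 1) then
    ! While this invocation is executing, no other image may reference or
    ! define counter on image n; the surrounding sync all statements ensure that.
    call increment(counter[n])
  end if

  sync all      ! starts a segment that succeeds the invocation on image 1

  if (this_image() == n) then
    ! Safe: this reference is ordered after the call, so the copied-back
    ! value must already be in place.
    if (counter /= 1) error stop "update not visible after sync"
    print *, "counter on image", n, "is", counter
  end if

contains

  subroutine increment(x)
    integer, intent(inout) :: x
    x = x + 1
  end subroutine increment

end program segment_ordering_sketch
```

Because image `n` cannot legally look at `counter` between the two `sync all` statements, it has no way to observe whether the update arrived during the call or only at copy-out.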