stfc / PSyclone

Domain-specific compiler and code transformation system for Finite Difference/Volume/Element Earth-system models in Fortran
BSD 3-Clause "New" or "Revised" License
106 stars 28 forks source link

Add support for querying the data type of a PSyIR expression. #1799

Open arporter opened 2 years ago

arporter commented 2 years ago

From #1696:

You maybe already hinted it here:

It may be that we really want the routine (rank_of_subsection) to give us back the shape of any subsection

But this recursion method to find the slice rank is somewhat similar to what @rupertford was planning to implement in order to 'bubble up' the datatypes too all PSyIR expression (datanodes) nodes. I can't find the specific issue but we had a chat during Easter week and Rupert was about to started implementing it. (starting with operations but accessors must also be part of it for a complete solution)

Given that this .datatype property for all expression nodes will be more generic (and would avoid having the slightly confusing "rank_of_subsection" in nodes that have nothing to do with arrays like member and structurereference) I think the methods you add in this PR could at some point be superseded by .datatype and simply be:

rank_slice = len(top_node.datatype.shape)

Note that for a reference a(3,:)

ref.symbol.datatype is ArrayType([3, Extent.ATTRIBUTE])

but

ref.datatype is ArrayType([Extent.ATTRIBUTE])

which is the information you need.

arporter commented 2 years ago

In #1696 I was having to add TODOs that would be fixed by this so I thought I'd take a look at doing it this way. I have datatype implemented for all of the Reference subclasses now and it's pretty straightforward. Feels much better than what I was doing in #1696.

arporter commented 2 years ago

In #1782 I'd like to compare the datatypes of two symbols for equality but that doesn't currently work. I can fix this as part of this issue.

sergisiso commented 2 years ago

My comments regarding the initial "towards" PR of this issue:

arporter commented 1 year ago

Currently, datatype for an ArrayMixin ignores the lower bound of each dimension (implicitly assuming it is unity). However, in #1973 I've realised that the lower bound is actually important (since Fortran allows it to be specified and that for a dummy argument can differ from the actual argument). I'm therefore wondering whether the shape returned by datatype should actually include bounds information rather than just extent? What do @sergisiso, @rupertford and @hiker think?

rupertford commented 1 year ago

I think that shape also supports a tuple of values that allows both lower and upper bounds to be set.

arporter commented 1 year ago

I could set the lower bound but the hard part is what to do with non-contiguous array slices, e.g. what is the datatype of a(3:9:3)? Currently datatype says that this is a 1D array of extent (9-3)/3 + 1 which implicitly starts at index 1. The alternative is to have it defined by the original Range instead but that doesn't seem right for a type: essentially, the Range expression gets evaluated and the result is a 1D array constructed by gathering the necessary values. Perhaps my lower-bound problems should be handled in a separate method?

sergisiso commented 1 year ago

The resulting shape of a(3:9:3) node.datatype.shape is indeed "a 1D array of extent (9-3)/3 + 1 which implicitly starts at index 1." You can get the lower bound of "a" with node.symbol.datatype.shape.lower, but this is not transferred to where the result of the accessor in bond after the accessor. That lower bound will depend on the declaration of the store location.

Can you add an example of the caller-callee argument that you are having problem with? I think when inlining with non unity lower bound in some cases we will need to update with all the accessor ranges (or do some trick with a temporary pointer array?)

sergisiso commented 1 year ago

or forbid inlining such routines for now.

arporter commented 1 year ago

I don't really have a problem (in the sense that I have it all working now), e.g.:

 real, dimension(2:8) :: actual
call my_sub(actual)

subroutine my_sub(x)
  real, dimension(3:) :: x
  x(3) = 5.0
end subroutine my_sub

Inlining the call should give actual(2) = 5.0 and it does :-) My concern was really where to put the extra functionality I've added to look-up the lower bound (non-trivial for an ArrayMember) but I now think it's OK to have a new, separate method for that.

sergisiso commented 1 year ago

Regarding the last comments on this issue, in #1973 @arporter has introduced querying the array lower bound in ArrayMixin that choses between node.symbol.datatype.shape.lower or an LBOUND expression depending on the type information that we have.

To close this issue we probably should make a symmetric one for upper bound and use them (e.g. in array2range transformations we have worse ad-hoc implementations of the same functionality)

arporter commented 1 year ago

We need this functionality in order to get useful coverage of the use of ASSOCIATE in NEMOVAR. We should maybe have a go at prioritising Issues once Sergi is back.

arporter commented 1 year ago

Currently, Reference has a datatype. We need to extend this to:

arporter commented 1 year ago

I'm thinking about Operations first. Everything seems fairly straightforward unless the operands are of the same intrinsic type (e.g. real) but of different kinds. The standard says: image which is a bit circular (from my point of view) because it talks about the type and kind of the result. I suppose the key point is that an expression never exists in isolation - its result is always used in some way (e.g. assigned to a variable or passed as an argument to a routine) and it's that use that defines the type. I suspect that for our purposes we won't often face operands that differ only in kind so we could raise an exception in such cases.

sergisiso commented 10 months ago

A datatype property on all DataNodes and support for operations was added with #2387. This means that we can start to use/depend on this functionality. But we still need better support for Call/IntrinsicCall to close this issue.