j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee
175 stars 14 forks source link

Finalization of class arrays; interpretation of standards #231

Open abensonca opened 2 years ago

abensonca commented 2 years ago

I'm working (with much help from Paul Thomas) on adding finalization on intrinsic assignment to gfortran. There are a few instances where I'm unclear on precisely what is required by the standard, so would be very grateful for any insight any one here can offer.

This example considers finalization of a class array:

module testmode
  implicit none

  character(4) :: scope = "MAIN"

  logical, parameter :: instrument = .false.

  type :: simple
    character(4) :: scope
    integer :: ind
  contains
    final :: destructor1, destructor2
  end type simple

  type, extends(simple) :: complicated
    real :: rind
  contains
    final :: destructor3, destructor4
  end type complicated

  integer :: check_scalar
  integer :: check_array(4)
  real :: check_real
  real :: check_rarray(4)
  integer :: final_count = 0

contains

  subroutine destructor1(self)
    type(simple), intent(inout) :: self
    print *, "destructor1(", self%scope, ") ", self%ind
  end subroutine destructor1

  subroutine destructor2(self)
    type(simple), intent(inout) :: self(:)
    print *, "destructor2(", self(1)%scope, ") ", self%ind
  end subroutine destructor2

  subroutine destructor3(self)
    type(complicated), intent(inout) :: self
    print *, "destructor3(", self%scope, ") ", self%rind
  end subroutine destructor3

  subroutine destructor4(self)
    type(complicated), intent(inout) :: self(:)
    if (size(self, 1) .gt. 0) then
      print *, "destructor4(", self(1)%scope, ") ", size(self%rind), self%rind
    else
      print *, "destructor4"
    end if
  end subroutine destructor4

  function constructor1(ind) result(res)
    type(simple), allocatable :: res
    integer, intent(in) :: ind
    scope = "CTR1"
    allocate (res, source = simple ("SOUR", ind))
    res%scope = scope
  end function constructor1

  function constructor2(ind, rind) result(res)
    class(simple), allocatable :: res(:)
    integer, intent(in) :: ind(:)
    real, intent(in), optional :: rind(:)
    type(complicated), allocatable :: src(:)
    integer :: sz
    integer :: i
    scope = "CTR2"
    if (present (rind)) then
      sz = min (size (ind, 1), size (rind, 1))
      src  = [(complicated ("SOUR", ind(i), rind(i)), i = 1, sz)]
      allocate (res, source = src)
      src%scope = "SRC "
      res%scope=scope
    else
      sz = size (ind, 1)
      allocate (res, source = [(simple (scope, ind(i)), i = 1, sz)])
    end if
  end function constructor2
end module testmode

program test_final
  use testmode
  implicit none

  class(simple), allocatable :: MyClassArray(:)

! *****************
! Class assignments
! *****************

  allocate (MyClassArray, source = [complicated(scope, 1, 2.0),complicated(scope, 3, 4.0)])
  print *, "[3] ...until here. Both call the rank-1 finalizer for the extended &
            type but ifort calls the rank-0 finalizer for the parent type, while &
            gfortran uses the rank-1 finalizer."
  deallocate (MyClassArray)
end program test_final

With gfortran (including the patches for finalization on intrinsic assignment that I'm working on), this results in:

 [3] ...until here. Both call the rank-1 finalizer for the extended type but ifort calls the rank-0 finalizer for the parent type, while gfortran uses the rank-1 finalizer.
 destructor4(MAIN)            2   2.00000000       4.00000000    
 destructor2(MAIN)            1           3

which shows that, when deallocating 'MyClassArray', the rank-1 finalizer for the extended type 'complicated' is called, and then the rank-1 finalizer for the parent type 'simple' is called.

But, under ifort I get:

 [3] ...until here. Both call the rank-1 finalizer for the extended type but ifo
 rt calls the rank-0 finalizer for the parent type, while gfortran uses the rank
 -1 finalizer.
 destructor4(MAIN)            2   2.000000       4.000000    
 destructor1(MAIN)            1
 destructor1(MAIN)            3

showing that the rank-1 finalizer is called for the extended type, but then the scalar finalizer of the parent type is called twice, once for each element in the array.

ifort's behavior seems incorrect here (based on my reading of the F2018 standards), but I'd be interested to hear anyone's opinion on this.

Thanks, Andrew

certik commented 2 years ago

Hi @abensonca thanks for your work on gfortran and for asking here. Is your question the same as #146, or different?

abensonca commented 2 years ago

Hi @certik - I read through #146 and I don't think my question here is the same issue. In this case I'm not concerned about the order of the finalization. Instead my concern is whether it is ever correct to call a non elemental scalar finalizer subroutine on elements of a rank-1 derived-type array? The Intel compiler is doing this - it calls a rank-1 finalizer on the extended type of the array, but then calls a (non-elemental) scalar finalizer on the parent type of each element of the array. This seems wrong to me.

nncarlson commented 2 years ago

I think Intel's behavior is incorrect as well. The process described in paragraph 7.5.6.2 (3) essentially recurses to (1), and I'm not aware of any other place in the standard that speaks to this.

I compiled your example with the NAG 7.0 compiler and it behaves like you've described for gfortran.

klausler commented 2 years ago

@abensonca (repeating what I wrote to you on Discourse) - the answer is that no, a non-ELEMENTAL final subroutine with a scalar argument should not be called elementally, and so ifort is broken. (And an ELEMENTAL final subroutine should be called only when there's no other final subroutine with a matching rank.)

This applies at each level of typing, so if a parent type has (say) a rank-1 final subroutine, and an extension of that derived type has only an ELEMENTAL final subroutine, then a vector of the extended type will be finalized by calling the extended type's final subroutine for each element, and then the parent's final procedure will be called (once) for the parent component of the vector.

abensonca commented 2 years ago

Thanks @nncarlson and @klausler for your help on this.

FortranFan commented 2 years ago

@abensonca wrote Sep 27, 2021 4:33 PM EDT:

ifort's behavior seems incorrect here (based on my reading of the F2018 standards), but I'd be interested to hear anyone's opinion on this.

For whatever it's worth and I'm open to being proved wrong on this: I think Intel Fortran's finalization process toward the code in the original post vis-a-vis the standard is entirely acceptable.

Note the standard is (purposefully I think) not prescriptive when it comes to the finalization process, there is a desired end state of a finalized entity in the standard even as it is not clearly spelled out. Nonetheless the standard effectively permits the processors multiple pathways to arrive at that end state. Comparison of those pathways, which is what the original post attempts, is not particularly meaningful in this context, at least as the standard is currently written.

Section 7.5.6.2 The finalization process in 18-007r1 document toward the 2018 standard by and large gives a lot of leeway to the processor and there are no numbered rules or constraints in this section enforcing the program or processor behavior.

Taking into consideration the first paragraph in section 7.5.6.2 and the numbered bullets therein in conjunction with section 7.5.7 on type extension, a fair argument can be made the finalization process in the following code is analogous to the one in the original post:

module m
   type :: a_t
   contains
      final :: f_a_r0
   end type
   type :: b_t
      type(a_t) :: a
   contains
      final :: f_b_r0, f_b_r1
   end type 
contains
   subroutine f_a_r0( a )
      type(a_t), intent(inout) :: a
      print *, "finalizer rank-0 a_t"
   end subroutine 
   subroutine f_b_r0( b )
      type(b_t), intent(inout) :: b
      print *, "finalizer rank-0 b_t"
   end subroutine 
   subroutine f_b_r1( b )
      type(b_t), intent(inout) :: b(:)
      print *, "finalizer rank-1 b_t"
   end subroutine 
end module
   use m
   block
      type(b_t) :: foo(3)
   end block
end 

And for this, a standard-conforming processor can be expected to yield the following program behavior:

C:\Temp>ifort /standard-semantics f.f90 Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.3.0 Build 20210609_000000 Copyright (C) 1985-2021 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.29.30038.1 Copyright (C) Microsoft Corporation. All rights reserved.

-out:f.exe -subsystem:console f.obj

C:\Temp>f.exe finalizer rank-1 b_t finalizer rank-0 a_t finalizer rank-0 a_t finalizer rank-0 a_t

as does Intel Fortran and NAG Fortran (though not gfortran**).

Ostensibly the 2 processors in this case simply follow the steps in section 7.5.6.2 and there is no argument to be made about any processor nonconformance with the simple code here. The key sentences in the standard are "If the entity being finalized is an array, each finalizable component of each element of that entity is finalized separately" followed by "If the entity is of extended type and the parent type is finalizable, the parent component is finalized"

And what Intel Fortran does with the code in the original post is consistent with this and I think it is conformant with the standard.

** gfortran has gaps when it comes to finalization of nonallocatable but finalizable objects, the code here is an example of this. @abensonca, you may be interested in including this case in your work.

abensonca commented 2 years ago

@FortranFan That does seem like a reasonable interpretation of the standard. And thanks for pointing out the limitation of gfortran with finalizing non-allocatable objects - I'll add this to my test cases and see if it's possible to fix this behavior.

abensonca commented 2 years ago

A follow-on to the above case:

module testmode
  implicit none

  character(4) :: scope = "MAIN"

  logical, parameter :: instrument = .false.

  type :: simple
    character(4) :: scope
    integer :: ind
  contains
    final :: destructor1, destructor2
  end type simple

  type, extends(simple) :: complicated
    real :: rind
  contains
    final :: destructor3, destructor4
  end type complicated

  integer :: check_scalar
  integer :: check_array(4)
  real :: check_real
  real :: check_rarray(4)
  integer :: final_count = 0

contains

  subroutine destructor1(self)
    type(simple), intent(inout) :: self
    print *, "destructor1(", self%scope, ") ", self%ind
  end subroutine destructor1

  subroutine destructor2(self)
    type(simple), intent(inout) :: self(:)
    print *, "destructor2(", self(1)%scope, ") ", self%ind
  end subroutine destructor2

  subroutine destructor3(self)
    type(complicated), intent(inout) :: self
    print *, "destructor3(", self%scope, ") ", self%rind
  end subroutine destructor3

  subroutine destructor4(self)
    type(complicated), intent(inout) :: self(:)
    if (size(self, 1) .gt. 0) then
      print *, "destructor4(", self(1)%scope, ") ", size(self%rind), self%rind
    else
      print *, "destructor4"
    end if
  end subroutine destructor4

  function constructor1(ind) result(res)
    type(simple), allocatable :: res
    integer, intent(in) :: ind
    scope = "CTR1"
    allocate (res, source = simple ("SOUR", ind))
    res%scope = scope
  end function constructor1

  function constructor2(ind, rind) result(res)
    class(simple), allocatable :: res(:)
    integer, intent(in) :: ind(:)
    real, intent(in), optional :: rind(:)
    type(complicated), allocatable :: src(:)
    integer :: sz
    integer :: i
    scope = "CTR2"
    if (present (rind)) then
      sz = min (size (ind, 1), size (rind, 1))
      src  = [(complicated ("SOUR", ind(i), rind(i)), i = 1, sz)]
      allocate (res, source = src)
      src%scope = "SRC "
      res%scope=scope
    else
      sz = size (ind, 1)
      allocate (res, source = [(simple (scope, ind(i)), i = 1, sz)])
    end if
  end function constructor2
end module testmode

program test_final
  use testmode
  implicit none

  type(simple), allocatable :: MyType, MyType2
  type(simple), allocatable :: MyTypeArray(:)
  type(simple) :: ThyType = simple("MAIN", 21), ThyType2 = simple("MAIN", 22)
  class(simple), allocatable :: MyClass
  class(simple), allocatable :: MyClassArray(:)

! *****************
! Class assignments
! *****************

  allocate (MyClassArray, source = [complicated(scope, 1, 2.0),complicated(scope, 3, 4.0)])
  print *, "[3] ...esuntil here. Both call the rank-1 finalizer for the extended &
            type but ifort calls the rank-0 finalizer for the parent type, while &
            gfortran uses the rank-1 finalizer."
  deallocate (MyClassArray)
  print *, "After deallocation of MyClassArray."
  print *, "Now MyClassArray = constructor2 ([10,20], [10.0,20.0])"
  print *, "ifort continues to use the rank-1 finalizer for the parent type."
  print *, "[4] Both call extended + parent finalizers twice but, the second time &
       ifort shows the component 'rind' as having size 0"
  MyClassArray = constructor2 ([10,20], [10.0,20.0])

end program test_final

The modified gfortran I'm working on outputs:

 [3] ...esuntil here. Both call the rank-1 finalizer for the extended type but ifort calls the rank-0 finalizer for the parent type, while gfortran uses the rank-1 finalizer.
 destructor4(MAIN)            2   2.00000000       4.00000000    
 destructor2(MAIN)            1           3
 After deallocation of MyClassArray.
 Now MyClassArray = constructor2 ([10,20], [10.0,20.0])
 ifort continues to use the rank-1 finalizer for the parent type.
 [4] Both call extended + parent finalizers twice but, the second time ifort shows the component 'rind' as having size 0
 destructor4(SRC )            2   10.0000000       20.0000000    
 destructor2(SRC )           10          20
 destructor4(CTR2)            2   10.0000000       20.0000000    
 destructor2(CTR2)           10          20

while ifort outputs:

 [3] ...esuntil here. Both call the rank-1 finalizer for the extended type but i
 fort calls the rank-0 finalizer for the parent type, while gfortran uses the ra
 nk-1 finalizer.
 destructor4(MAIN)            2   2.000000       4.000000    
 destructor1(MAIN)            1
 destructor1(MAIN)            3
 After deallocation of MyClassArray.
 Now MyClassArray = constructor2 ([10,20], [10.0,20.0])
 ifort continues to use the rank-1 finalizer for the parent type.
 [4] Both call extended + parent finalizers twice but, the second time ifort sho
 ws the component 'rind' as having size 0
 destructor4(SRC )            2   10.00000       20.00000    
 destructor1(SRC )           10
 destructor1(SRC )           20
 destructor4
 destructor1(CTR2)           10
 destructor1(CTR2)           20

The last statement in this example assigns to MyClassArray. As stated in the output, both gfortran and ifort call the extended & parent type finalizers twice (once for the src object used in constructor2(), and once for the object returned by constructor2(). But, for ifort, for the second call to the finalizer ifort shows the component rind as having zero size (and valgrind shows some "invalid read" errors here).

ifort's behavior seems broken here, but I'd be interested and grateful if anyone can show results for this code with NAG or other compilers for comparison.