Constructing Objects With Polymorphic Components Not Possible in Pure Context?

everythingfunctional commented 3 years ago

Given the following constraints from the standard:

C1585 The function result of a pure function shall not be both polymorphic and allocatable, or have a polymorphic allocatable ultimate component.

C1588 An INTENT (OUT) dummy argument of a pure procedure shall not be polymorphic or have a polymorphic allocatable ultimate component.

it seems it would not be possible to write (let alone call in a pure context) a pure constructor for an object with a polymorphic component.

To my mind this eliminates an entire class of data structures that one would find useful (and expect to be useable) in a pure context. For example, a simple binary tree of integers might be expected to be implemented like the following.

module tree_m

implicit none
private
public :: tree_t, node, leaf

type, abstract :: tree_t
contains
  procedure(sum_i), deferred :: sum
end type

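abstract interface
  pure function sum_i(self) result(sum)
    import :: tree_t  ! interface bodies do not see host entities without import
    class(tree_t), intent(in) :: self
    integer :: sum
  end function
end interface

! Generic names that make the constructors callable as node(...) and leaf(...)
interface node
  module procedure node_constructor
end interface

interface leaf
  module procedure leaf_constructor
end interface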

type, extends(tree_t) :: node_t
  private
  class(tree_t), allocatable :: left, right
contains
  procedure :: sum => node_sum
end type

type, extends(tree_t) :: leaf_t
  private
  integer :: val
contains
  procedure :: sum => leaf_sum
end type

contains

pure function node_constructor(left, right) result(node) ! pure not allowed
  class(tree_t), intent(in) :: left, right
  type(node_t) :: node

  allocate(node%left, source = left)
  allocate(node%right, source = right)
end function

pure function leaf_constructor(val) result(leaf)
  integer, intent(in) :: val
  type(leaf_t) :: leaf

  leaf%val = val
end function

pure function node_sum(self) result(sum)
  class(node_t), intent(in) :: self
  integer :: sum

  sum = self%left%sum() + self%right%sum()
end function

pure function leaf_sum(self) result(sum)
  class(leaf_t), intent(in) :: self
  integer :: sum

  sum = self%val
end function

end module

and then be able to use it like the following in a pure context.

class(tree_t), allocatable :: tree
integer :: total

allocate(tree, source = node(node(leaf(1), leaf(2)), leaf(3)))
total = tree%sum()

But, even though the node constructor

- does not modify its inputs
- does not reference (let alone modify) any other entities
- does not perform any IO

it still cannot be made pure due to the above constraints.

Does anyone know why (or if) these constraints are necessary? It seems most compilers are not enforcing them.

FortranFan commented 3 years ago

@everythingfunctional ,

Perhaps someone who was present during meetings 207 and 212 can provide you detailed feedback, but I think the following couple of papers can give you a bit of insight into the thought process of the J3 committee that led to tightening up the constraints in the Fortran 2018 standard revision compared to Fortran 2003, which was more permissive in this regard:

https://j3-fortran.org/doc/year/17/17-130r1.txt
https://j3-fortran.org/doc/year/15/15-211.txt

You will notice that the notion of a PURE procedure and the avoidance of side effects, as viewed in the context of Fortran, extends to requiring that a polymorphic entity not be deallocated, which excludes the possibility that an impure finalizer may need to be invoked in such a context. The two constraints you mention are then natural consequences of such requirements.

everythingfunctional commented 3 years ago

@FortranFan , thanks for the additional reading.

My understanding so far is that, because it's possible that some type that extends from the polymorphic entity might implement an impure finalizer, it's impossible to know whether the use of a polymorphic entity will definitely be pure.

Would it be worth a proposal for a new annotation for types that states something to the effect of:

This type (and any types which derive from it) do not have impure finalizers

This, I think, would allow such types to safely be used as polymorphic entities in a pure context.

Also, since function results and intent(out) arguments are by definition unallocated on entry to the procedure, it's possible (likely even) that no deallocation or finalization would actually occur. One would need to explicitly assign to the entity twice, or explicitly deallocate for finalization to happen, which the compiler could in theory detect. However, calling a function with polymorphic result would require the result be deallocated and possibly finalized after the function call, as the result is just a temporary entity.

This brings to mind another question. Is there something in the standard that requires a temporary, an assignment and a deallocation for a statement like the following, or would the compiler be allowed to optimize that away?

polymorphic_variable = polymorphic_function(with, args)

But that question probably deserves its own thread.

klausler commented 3 years ago

You are correct; the prohibitions are meant to make it impossible for an impure final procedure to be called from a pure procedure.

There are a lot of ways in which final procedures are badly designed in Fortran.

everythingfunctional commented 3 years ago

I had another thought as well. What are the chances the following invokes an impure defined assignment?

polymorphic_intent_out = polymorphic_intent_in

In which case I would need to extend my above annotation to read

This type (and any types which derive from it) do not have impure finalizers or impure defined assignment

FortranFan commented 3 years ago

@everythingfunctional wrote Dec. 22, 2020 10:44 AM EST:

.. What are the chances the following invokes an impure defined assignment? ..

Would you have access to NAG Fortran compiler? It'll be interesting to try out such an assignment in a test reproducer and see the processor response. I say this because the standard appears to disallow this but 2 compilers I tried fail to issue any diagnostic.

FortranFan commented 3 years ago

@everythingfunctional wrote in the original post on Dec. 21, 2020 5:53 PM EST:

.. Does anyone know why (or if) these constraints are necessary? It seems most compilers are not enforcing them.

Re: "It seems most compilers are not enforcing them," it's possible I'm wrong about this, but it seems to me in the case of Intel Fortran and gfortran, the compiler implementation of the relevant semantics (and syntax) mostly occurred during the fairly long period between Fortran 2003 and 2018 standard revisions and the processors haven't fully caught up yet to the changes in Fortran 2018 relative to 2003 when it comes to PURE procedures.

FortranFan commented 3 years ago

@everythingfunctional wrote in the original post on Dec. 21, 2020 5:53 PM EST:

.. To my mind this eliminates an entire class of data structures that one would find useful (and expect to be useable) in a pure context. ..

Please note though it will take a significant effort to illustrate whether the current standard really eliminates such data structures or whether the lack of adequate generics facility in the language forces one to resort to polymorphism as a poorer substitute for parameterization and generalization of data structures (and associated algorithms) resulting in bottlenecks due to the constraints mentioned in the original post.

Note there will be considerable pushback on any further changes to the standard on account of the possibility of alternative solutions, albeit specialized approaches; see a modified and simple (not optimized) example using integer data below. Some (or many) folks may view the possible alternatives using standard features as adequate for the language until the facility for generics is established.

module node_m
! Node "class" toward a binary tree data structure
   type :: node_t
      private
      integer :: m_dat = 0  !<== Note type of "data" in a node can be parameterized
      type(node_t), allocatable :: m_left
      type(node_t), allocatable :: m_right
   contains
      procedure, pass(node) :: reduce => reduce_node
      procedure, pass(node) :: insert => insert_dat
   end type
   interface node_t
      module procedure newnode
   end interface
contains
   elemental function newnode( dat ) result( node )
      ! Argument list
      integer, intent(in) :: dat
      type(node_t) :: node
      node%m_dat = dat
      return
   end function
   elemental subroutine insert_dat( node, dat )
      ! Argument list
      class(node_t), intent(inout) :: node
      integer, intent(in) :: dat
      if ( dat <= node%m_dat ) then
         if ( .not. allocated(node%m_left) ) then
            node%m_left = newnode( dat )
         else
            call insert_dat( node%m_left, dat )
         end if
      else
         if ( .not. allocated(node%m_right) ) then
            node%m_right = newnode( dat )
         else
            call insert_dat( node%m_right, dat )
         end if
      end if
   end subroutine
   elemental function reduce_node( node ) result( r )
      ! Argument list
      class(node_t), intent(in) :: node
      integer :: r
      r = node%m_dat
      if ( allocated(node%m_left) ) r = r + reduce_node(node%m_left)
      if ( allocated(node%m_right) ) r = r + reduce_node(node%m_right)
      return
   end function
end module

module tree_m
! Tree "class" to encapsulate the nodes in it
   use node_m, only : node_t
   type :: tree_t
      private
      type(node_t), allocatable :: m_root
   contains
      procedure, pass(this) :: insert => insert_node
      procedure, pass(this) :: reduce => reduce_tree
   end type
contains
   elemental subroutine insert_node( this, dat )
      ! Argument list
      class(tree_t), intent(inout) :: this
      integer, intent(in) :: dat
      if (.not. allocated(this%m_root) ) then
         this%m_root = node_t( dat )
         return
      end if
      call this%m_root%insert( dat )
   end subroutine
   elemental function reduce_tree( this ) result( r )
      ! Argument list
      class(tree_t), intent(in) :: this
      integer :: r
      r = 0
      if ( allocated(this%m_root) ) r = this%m_root%reduce()
      return
   end function
end module

! Calling program
   use tree_m, only : tree_t
   type(tree_t) :: tree
   integer, allocatable :: dat(:)
   integer :: i
   dat = [ 14, 4, 13, 8, 3, 10, 6, 17, 1 ]
   do i = 1, size(dat)
      call tree%insert( dat(i) )
   end do
   print *, "reduce(tree): ", tree%reduce(), "; expected is ", sum(dat)
end

Upon execution using Intel Fortran,

C:\Temp>ifort /standard-semantics /warn:all /stand:f18 p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.1 Build 20201112_000000
Copyright (C) 1985-2020 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.26.28806.0
Copyright (C) Microsoft Corporation. All rights reserved.

-out:p.exe
-subsystem:console
p.obj

C:\Temp>p.exe
reduce(tree): 76 ; expected is 76

C:\Temp>

everythingfunctional commented 3 years ago

> @everythingfunctional wrote Dec. 22, 2020 10:44 AM EST:
>
> > .. What are the chances the following invokes an impure defined assignment? ..
>
> Would you have access to NAG Fortran compiler? It'll be interesting to try out such an assignment in a test reproducer and see the processor response. I say this because the standard appears to disallow this but 2 compilers I tried fail to issue any diagnostic.

This was actually how I found out I was already doing this (i.e. violating the standard). Neither gfortran nor Intel complained. However, neither the standard nor NAG mentions the reason for the constraint. There are a few other constraints near those that mention finalization, but it's not clear that they are directly related, and there is no mention of defined assignment anywhere.

My conclusion is that the semantics of the language (i.e. user defined assignment and finalization procedures) allow the possibility of sneaking impure code into places a naive reading would not expect it to be possible. Thus, even for generics, we will need to take care to close these loopholes to be able to mark procedures pure.

For example if you were to replace the integer in your tree with a type parameter, unless the constraints for that type parameter are defined as not allowing impure defined assignment or finalization, you could still slip some impure code in there.

FortranFan commented 3 years ago

@everythingfunctional wrote Dec. 23, 2020 10:14 AM EST:

.. My conclusion is that the semantics of the language (i.e. user defined assignment and finalization procedures) allow the possibility of sneaking impure code into places a naive reading would not expect it to be possible. ..

For example if you were to replace the integer in your tree with a type parameter, unless the constraints for that type parameter are defined as not allowing impure defined assignment or finalization, you could still slip some impure code in there.

It seems to me in the context of user-defined data structures and algorithms operating with them, constraint C1595 technically shuts such doors to impurity:

"Any procedure referenced in a pure subprogram, including one referenced via a defined operation, defined assignment, defined input/output, or finalization, shall be pure."

The issue is that processor support for this and other such constraints is still lacking. I just filed several support requests with Intel toward improved support in the IFORT compiler.
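As a minimal sketch of what C1595 permits (all names here, `pure_final_demo_m`, `buffer_t`, `release`, and `sum_to`, are hypothetical and not taken from the thread): a local variable of a nonpolymorphic type may be finalized inside a pure function, provided its final subroutine is itself pure. Because the declared type is also the dynamic type, the compiler can check this at compile time.

```fortran
module pure_final_demo_m
  implicit none
  private
  public :: buffer_t, sum_to

  ! Hypothetical type with a final subroutine
  type :: buffer_t
    integer, allocatable :: items(:)
  contains
    final :: release
  end type

contains

  ! The final subroutine is pure, so finalization may happen in a pure context
  pure subroutine release(self)
    type(buffer_t), intent(inout) :: self
    if (allocated(self%items)) deallocate(self%items)
  end subroutine

  pure function sum_to(n) result(total)
    integer, intent(in) :: n
    integer :: total
    type(buffer_t) :: scratch  ! nonpolymorphic local, finalized on return
    integer :: i

    scratch%items = [(i, i = 1, n)]
    total = sum(scratch%items)
  end function
end module
```

If `release` were not declared pure, a conforming processor would be expected to reject `sum_to`. The difficulty discussed in this thread arises only once the entity is polymorphic, since an extension unknown at compile time could bring its own finalizer.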

klausler commented 3 years ago

Whether defined assignment occurs for a given assignment-stmt is known at compilation time -- the test is in terms of the declared types of the variable and expression, not their dynamic types -- although the identity of the subroutine that implements the defined assignment can be overridden in the type-bound case. The defined assignment subroutine must be pure if the defined assignment takes place in a pure context; and any override of a pure subroutine must also be pure. So there's no hole here that I can see as an implementor.
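A small sketch of that situation, with made-up names (`assign_demo_m`, `base_t`, `child_t`, `copy_pure`) that do not come from any real code base: the generic `assignment(=)` is selected from the declared type `base_t`, the specific binding that runs depends on the dynamic type of the left-hand side, and the override has to be pure because the binding it overrides is pure.

```fortran
module assign_demo_m
  implicit none
  private
  public :: base_t, child_t, copy_pure

  type :: base_t
    integer :: n = 0
  contains
    procedure :: assign => base_assign
    generic :: assignment(=) => assign
  end type

  type, extends(base_t) :: child_t
    integer :: copies = 0
  contains
    procedure :: assign => child_assign  ! override of a pure binding
  end type

contains

  pure subroutine base_assign(lhs, rhs)
    class(base_t), intent(inout) :: lhs
    class(base_t), intent(in) :: rhs
    lhs%n = rhs%n
  end subroutine

  ! Must be pure as well, since it overrides a pure type-bound procedure
  pure subroutine child_assign(lhs, rhs)
    class(child_t), intent(inout) :: lhs
    class(base_t), intent(in) :: rhs
    lhs%n = rhs%n
    lhs%copies = lhs%copies + 1
  end subroutine

  pure subroutine copy_pure(dst, src)
    class(base_t), intent(inout) :: dst
    class(base_t), intent(in) :: src
    ! Defined assignment is chosen from the declared types (base_t);
    ! the binding actually invoked follows the dynamic type of dst.
    dst = src
  end subroutine
end module
```

Calling `copy_pure` with a `child_t` actual argument for `dst` runs `child_assign`, yet the compiler never needs to know about `child_t` to verify that `copy_pure` is pure.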

If you're looking for weirdness with assignments in pure subprograms, try this one: it's not valid to store through a pointer dummy argument to a pure subprogram, even when the pointer has been allocated in the subprogram.

pure subroutine bad(p)
  real, pointer :: p
  allocate(p)
  p = 3.14159
end subroutine

is not conforming, but is legal if rewritten to be

pure subroutine good(p)
  real, pointer :: p
  real, pointer :: q
  allocate(q)
  q = 3.14159
  p => q
end subroutine

This pattern was caught by f18 in Whizard sources, and it's a pain, but those are the rules in Fortran 2018. (IMO the rule about pointer dummy arguments should not apply when they have INTENT(OUT).)

everythingfunctional commented 3 years ago

the test is in terms of the declared types of the variable and expression

I see. I was under the impression that it might do run-time look up based on the actual type, but I suppose that might fail if the LHS has not yet been allocated. Thanks for the clarification.

klausler commented 3 years ago

> > the test is in terms of the declared types of the variable and expression
>
> I see. I was under the impression that it might do run-time look up based on the actual type, but I suppose that might fail if the LHS has not yet been allocated. Thanks for the clarification.

We do a look-up, yes, but only to look for overrides in the case where it's clear that a type-bound assignment must be called and the variable and expression are not both monomorphic.

plevold commented 1 year ago

(This was originally posted as #287. Re-posting here as there's no need to have two open issues about the same thing.)

The standard currently says the following about pure subroutines:

This means that the code below is not valid Fortran:

pure subroutine sub(x)
    class(base_t), intent(out) :: x
end subroutine

The intent of this restriction is to avoid impure code being executed as a part of the pure subroutine. However, as this example demonstrates, it is not a sufficient restriction to avoid impure code execution inside a pure subroutine:

module a_mod
    implicit none

    private
    public base_t
    public a_t

    type, abstract :: base_t
    contains
    end type

    type, extends(base_t) :: a_t
    contains
        final :: finalizer
    end type

contains

    subroutine finalizer(this)
        type(a_t), intent(in) :: this
        write(*,*) 'Impure finalizer invoked'
    end subroutine
end module

program main
    use a_mod, only: base_t, a_t

    class(base_t), allocatable :: x

    x = a_t()
    write(*,*) 'Before pure sub'
    call pure_sub(x)
    write(*,*) 'After pure sub'

contains

    pure subroutine pure_sub(x)
        class(base_t), allocatable, intent(inout) :: x

        if (allocated(x)) deallocate(x)
    end subroutine
end program

As this restriction does not fulfill its purpose, I think it should be removed. Polymorphic arguments with intent(out) are also very useful, and it would be good to be able to use them in pure subroutines.

A possible solution:

Add new syntax to abstract types for requiring that finalizers of any extending types must be pure. Could for example be something like this:
```
type, abstract, final(pure) :: base_t
end type
```
Require that any polymorphic types used in such a way that a finalizer might be invoked inside a pure procedure must have the above mentioned attribute. This includes
- When used as local variables
- When used as allocatable dummy arguments with intent(inout)
- When used as dummy arguments with intent(out)
- Possibly some more situations?

klausler commented 1 year ago

Refining this a bit, I think it would make more sense to not restrict the new syntax to abstract types, and to also allow for explicitly declaring that a type's extensions (1) don't exist, or (2) have no final subroutines, or (3) only pure final subroutines, or (4) at most a single pure elemental final subroutine.

(That last option would be useful in avoiding a common error with final subroutines in which one writes a default non-elemental final subroutine for a type and then of course it doesn't get called for arrays.)

And then of course a derived type that has no impure final subroutines and whose extensions either can't exist or can't have an impure final subroutine would become exempt from most constraints and restrictions currently imposed on objects with limited polymorphic types.
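The common error mentioned above can be sketched as follows. This is only an illustration with hypothetical names (`final_rank_demo_m`, `scalar_only_t`, `elemental_t`, `run_demo`), and the final subroutines are deliberately impure so that they can count their own invocations in module variables; the option described in the comment concerns a pure elemental final subroutine.

```fortran
module final_rank_demo_m
  implicit none
  private
  public :: scalar_only_t, elemental_t, scalar_calls, elemental_calls, run_demo

  ! Counters used only to make finalization observable in this sketch
  integer :: scalar_calls = 0
  integer :: elemental_calls = 0

  type :: scalar_only_t
    integer :: id = 0
  contains
    final :: finalize_scalar     ! matches rank-0 entities only
  end type

  type :: elemental_t
    integer :: id = 0
  contains
    final :: finalize_elemental  ! elemental: applies to every rank
  end type

contains

  subroutine finalize_scalar(self)
    type(scalar_only_t), intent(inout) :: self
    scalar_calls = scalar_calls + 1
  end subroutine

  impure elemental subroutine finalize_elemental(self)
    type(elemental_t), intent(inout) :: self
    elemental_calls = elemental_calls + 1
  end subroutine

  subroutine run_demo()
    type(scalar_only_t), allocatable :: a(:)
    type(elemental_t), allocatable :: b(:)

    allocate(a(3), b(3))
    deallocate(a)  ! rank 1: no matching final subroutine, so none is called
    deallocate(b)  ! elemental final subroutine runs once per element
  end subroutine
end module
```

Under the standard's finalization rules, after `call run_demo()` one would expect `scalar_calls` to still be 0 and `elemental_calls` to be 3, which is the asymmetry a "single elemental final subroutine" declaration would help avoid.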

j3-fortran / fortran_proposals

Constructing Objects With Polymorphic Components Not Possible in Pure Context? #189