Generics as type-level metaprogamming

j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

175 stars 14 forks source link

Generics as type-level metaprogamming #293

Open tkoenig1 opened 1 year ago

tkoenig1 commented 1 year ago

Most proposals for introducing templates to Fortran have closely followed C++, either directly in syntax or at least in spirit. For Fortran, though, a more Fortran-esque approach seems in order. The following proposal (though, given the complexity of the topic, not yet fully formulated) tries to do that.

The idea is to understand templates not as types but as type-level functions: They can take types as arguments, and return the wanted type. This suggests defining a special kind of functions (using a syntax very similiar to the normal Fortran function syntax) that is executed at compile time and can take types as arguments and return them. For this purpose, let us give the existing keyword generic an additional meaning: it used in the place where function or subroutine is normally specified. Furthermore, the keyword type without parantheses is used as the new kind of variable or dummy argument, which allows passing types to this new function type.

Anything between the generic statement and the contains statement would be executed at compile-time, and only compile-time constants are permitted there.

An example:

program main
  implicit none
  type(pair(integer, real)):: p
  p%a = 1
  p%b = 3.141
contains
  generic pair (t, s)
    type:: pair
    type:: t, s
    type res
      type(t):: a
      type(s):: b
    end type

    pair = res
  end pair
end program

which should be fairly readable to anybody familiar with Fortran syntax. The assignment statement pair = res works like it would in a function, where the result (in this case) the type is returned. So, the code would be translated to an equivalent of

  program main
  implicit none
  type pair_integer_real
    integer :: a
    real :: b
  end type pair_integer_real
  p%a = 1
  p%b = 3.141
end program main

As control flow is allowed inside these generics, all the features traditionally required from templates become trivial. Specialisation is just an if-statement, for example.

The mechanism allows for much more powerful usage but without the complexity and unreadability of template metaprogramming code, so infamous among C++-developers.

For this, two new special types classes need to be introduced: a function type class and a subroutine type class. In these, each function (or subroutine) is associated with its own specific type. For example, a function my_function() would be associated with a type my_function_type (though, in reality, it would usually not be given an explicit name), which can be implicitely instantiated and called to call the function my_function(). If a call to my_function_type is made, the compiler can always deduce the function called at compile time (in contrast to run-time function pointer, no type erasure happens here).

Using this formalism, for example, a generic swap method can be written (this example also contains a bit more complex control flow in the generic):

program main
  implicit none
  integer :: i,j
  real :: a,b
  read (*,*) a, b, i, j
  call swap (integer)(i,j)
  call swap (real)(a,b)
  print *,a,b,c,d
contains
  generic swap(t)
    subroutine_type:: swap
    type:: t

    if (t == integer) then 
      swap = integer_swap
    else
      swap = general_swap
    end if

    contains
      subroutine general_swap(a, b)
        type(t), intent(inout):: a, b
        type(t) :: tmp
        tmp = a
        a = b
        b = tmp
       end subroutine general_swap

      subroutine integer_swap(a, b)
        integer, intent(inout):: a, b
        a = ieor(a,b)
        b = ieor(b,a)
        a = ieor(a,b)
      end subroutine integer_swap
  end generic
end program main

where there are two different functions to chose from: a general version and an integer version (which uses xor swapping) which are then invoked in the main program. What would then be compiled would be something close to

program main
  implicit none
  integer :: i,j
  real :: a,b
  read (*,*) a, b, i, j
  call integer_swap (i,j)
  call general_swap_real(a,b)
  print *,a,b,i,j
contains
  subroutine general_swap_real (a, b)
    real, intent(inout):: a, b
    real :: tmp
    tmp = a
    a = b
    b = tmp
 end subroutine general_swap_real

  subroutine integer_swap(a, b)
    integer, intent(inout):: a, b
    a = ieor(a,b)
    b = ieor(b,a)
    a = ieor(a,b)
  end subroutine integer_swap
end program main

Another more complex example: Generating nested DO loops which apply a subroutine to elements, by calling subroutines for each level. This could look like

program main
  call manydoos (print_arr, 3)
contains
  generic manydoos(f, i)
    subroutine_type:: f
    integer:: i
    manydoos = res
  contains
    subroutine res()
      call manydoos_impl(f, i, i)([])
    end subroutine
  end generic

  generic manydoos_impl (f, i, total)
    subroutine_type:: manydoos_impl
    subroutine_type:: f
    integer:: i
    integer:: total

    if (i == 0) then
      manydoos_impl = body
    else
      manydoos_impl = loop
    end if
  contains
    subroutine body(indices)
      integer, dimension(total), intent(in):: indices
      call f (indices)
    end subroutine

    subroutine loop(indices)
      integer, dimension(total - i), intent(in):: indices
      integer:: i
      do i=1,10
        manydoos_impl (f, i-1, total)([indices, i]) 
      end do
    end subroutine
  end generic manydoos_impl

  subroutine print_arr(a)
    integer, intent(in), dimension(:):: a
    print *, a
  end subroutine

end program main

which would then generate code like

  subroutine manydoos_3_res
    call manydoos_impl_3_loop ([integer::])
  end subroutine manydoos_3_res

  subroutine manydoos_impl_3_loop (indices)
    integer, dimension(0), intent(in) :: indices
    integer :: i
    do i=1,10
       call manydoos_impl_2_loop ([indices, i])
    end do
  end subroutine manydoos_impl_3_loop

  subroutine manydoos_impl_2_loop (indices)
    integer, dimension(1), intent(in) :: indices
    integer :: i
    do i=1,10
       call manydoos_impl_1_loop ([indices, i])
    end do
  end subroutine manydoos_impl_2_loop

  subroutine manydoos_impl_1_loop (indices)
    integer, dimension(2), intent(in) :: indices
    integer :: i
    do i=0,10
       call manydoos_impl_0_loop ([indices, i])
    end do
  end subroutine manydoos_impl_1_loop

  subroutine manydoos_impl_0_loop (indices)
    integer, dimension(3), intent(in) :: indices
    call print_arr(indices)
  end subroutine manydoos_impl_0_loop

  subroutine print_arr(a)
    integer, intent(in), dimension(:):: a
    print *, a
  end subroutine print_arr
end program main

It is expected that compilers would inline this kind of procedure efficiently.

Another example: Calculating a single Newton step, where the function and its derivative are passed into the function.

program main
  real :: x0, x1
  read (*,*) x0
  x1 = newton_step (real, f, df, real)(x0, -2)
contains
  generic newton_step(t, f, df, param_type)
    function_type:: newton_step
    function_type:: f, df
    type:: param_type, t

    newton_step = r
  contains
    function r(x, p)
      type(t) :: r
      type(t), intent(in):: x
      type(param_type), intent(in):: p
      r = x - f(x, p)/df(x, p) 
    end function
  end generic

  function f(x, a)
    real:: f
    real, intent(in):: x, a
    f = x*x +a
  end function f
  function df(x, a)
    real:: df
    real, intent(in):: x, a
    df = 2*x
  end function df
end program main

This would then be translated to something like

program main
  real :: x0, x1
  read (*,*) x0
  x1 = newton_step_r (x0, -2)
  print *,x1
contains
  function newton_step_r (x, p)
    real :: step_r
    real, intent(in) :: x
    real, intent(in) :: p
    r = x - f(x, p)/df(x,p)
  end function newton_step_r
  function f(x, a)
    real:: f
    real, intent(in):: x, a
    f = x*x +a
  end function f
  function df(x, a)
    real:: df
    real, intent(in):: x, a
    df = 2*x
  end function df

end program main

To allow usage of the powerful methods devoloped and used in functional programming, it could also be possible pass generics as arguments to other generic functions. An simpler example of its use could be a scalar product like

generic sp (acc,t)
  function_type :: dp
  generic_type :: acc
  type :: t
  sp = f
contains
  function f(a,b) result(res)
    type(t), intent(in), dimension(:) :: a,b
    type(acc(t)) :: res
    integer :: i
    do i=1, size(a)
      res = res + a(i) * b(i)
    end do
  end function f
end generic

where the user could specify a function which returned a suitable accumulator for the type t.

certik commented 1 year ago

It seems that semantically, this is quite similar to the current proposed generics for F202Y, the main difference is that your proposal allows to write functions that operate on types and can compute and return types at compile time. It seems all your examples could be implemented using the existing generics design in LFortran, except the functions computing the type (we would need to work on this). If you want, we can meet to discuss all the details.

tkoenig1 commented 1 year ago

This proposal does indeed cover the most of the same ground that the generics proposal does, but the syntax is more Fortran-like. Things like requitements, instead of needing new syntax, can just be formulated as IF statements, for example. What would still be missing would then be a COMPILE_TIME_ERROR statement or similar with an error message.

If it is possible, with the current proposal, to do a chain of subroutine calls, I have not found it; it may be possible, but it is at least not intuitive from the proposals.

The main advantage is that the need for template metaprogramming is mostly eliminiated, because it is possible to just program :-)

Regarding a telco: sure. The main part of the proposal was actually written by someone else (also a gfortran contributor) who would then also attend.

My mail address is tkoenig@netcologne.de; you can drop me an e-mail to arrange the telco.

certik commented 1 year ago

In the example above:

program main
  real :: x0, x1
  read (*,*) x0
  x1 = newton_step (real, f, df, real)(x0, -2)
contains
  generic newton_step(t, f, df, param_type)
    function_type:: newton_step
    function_type:: f, df
    type:: param_type, t

    newton_step = r
  contains
    function r(x, p)
      type(t) :: r
      type(t), intent(in):: x
      type(param_type), intent(in):: p
      r = x - f(x, p)/df(x, p) 
    end function
  end generic

  function f(x, a)
    real:: f
    real, intent(in):: x, a
    f = x*x +a
  end function f
  function df(x, a)
    real:: df
    real, intent(in):: x, a
    df = 2*x
  end function df
end program main

The main drawback against our current proposal is that this code does not impose any restrictions on the generic type parameters param_type, f and df. That corresponds to "weak concepts" in C++. As a consequence, when this potentially large generic library code gets used in the user program, at the line:

  x1 = newton_step (real, f, df, real)(x0, -2)

together with the definitions of the functions f and df, the compiler has to go into the library code and ensure that every operation can in fact be done and if there is any kind of an error, you will get a potentially complicated error message coming from somewhere deep in the library code, that the user doesn't want to know about. As a library developer, the compiler thus cannot check that your code does not have type errors.

The advantage of using restrictions is that the compiler does not need to do that, it will give an error message immediately at the call site if anything is wrong. So the programmer gets the same compiler support for writing templates (generic code) as for non-generic code. If the templated code compiles, it is guaranteed to not have any type mismatch errors.

FortranFan commented 1 year ago

If you want, we can meet to discuss all the details.

@certik, @tkoenig1

That is a great suggestion by @certik. What may be worthwhile for @tkoenig1 and the other gfortran contributor to have a meeting with J3 Generics subgroup where @tclune and @everythingfunctional can give a quick presentation on all the work thus far and how that J3 group has arrived at their current design. Instead of some groups working independently coming up with certain designs that might end up overlooking some fundamentally basic tenets that otherwise quite a few now seem to accept as important - e.g., requirements, it will be better to get on the same page.

I have a strong concern and worry with the proposal in the original post - I dislike it - for being so weak with the requirements aspect, additionally I do not see it as a Fortran-like either, rather to me it comes across as more like procedural programming a la FORTRAN via its IF syntax. To me, it is grossly inadequate for the compact, abstract approaches I prefer for codes using the current Fortran standard including 2023.

tclune commented 1 year ago

I fully grant that the proposed syntax is not compact, and we do expect to introduce additional syntax for common, simple cases in subsequent versions of the standard.

I too (originally!) wanted to be able to have a generic/template for just types rather than our current proposal which is more of a generic module. The problem with that original notion is that it does not play well with Fortran type-bound procedures, and that is something that subgroup wanted to permit. And as it is, we still had to limit some important ways in which TBP's cannot be used.

One of our most important audiences for the proposed features is that of library developers, as opposed to end users. I.e., we are trying to make certain that the hard cases can be handled and in a very robust manner.
To that end, we are heavily leveraging the expertise of an outside expert that has explained to us the pitfalls of approaches that have been used in other languages. In particular, "strong requirements" are seen as ensuring that robust templates can be developed with confidence.

I expect that everyone will be disappointed to varying degrees with the initial release. I myself, am frustrated that I cannot produce containers of polymorphic entities in this first go around without manually wrappers. At the same time, I am quite confident that the new features will enable the vast majority of the use cases that were identified by subgroup in the early stages.

I do strongly disagree with the statement that the proposed syntax comes across as "more like procedural programming". I just don't see that.

FortranFan commented 1 year ago

I do strongly disagree with the statement that the proposed syntax comes across as "more like procedural programming".

Whatever is decided, one can only hope for Fortran the generic SWAP will not require IF statements like so, or any decision flow semantics using any switches (select type, etc.) whatever:

  generic swap(t)
    subroutine_type:: swap
    type:: t

    if (t == integer) then 
      swap = integer_swap
    else
      swap = general_swap
    end if

    contains
      subroutine general_swap(a, b)
        type(t), intent(inout):: a, b
        type(t) :: tmp
        tmp = a
        a = b
        b = tmp
       end subroutine general_swap

      subroutine integer_swap(a, b)
        integer, intent(inout):: a, b
        a = ieor(a,b)
        b = ieor(b,a)
        a = ieor(a,b)
      end subroutine integer_swap
  end generic

This will be rather poor design.

Additionally the original proposal does not address the canonical challenge when it comes to generic SWAP in Fortran - what about the case when the received arguments a, b have ALLOCATABLE / POINTER attributes?

certik commented 1 year ago

There are two orthogonal aspects:

generics
metaprogramming

The F202Y generics proposal is strictly generics, no metaprogramming. The proposal in this issue is mostly metaprogramming, similar to the Zig language: https://ziglang.org/documentation/master/#Generic-Data-Structures. You can use metaprogramming to write generic code. But you don't get the compiler checks for library code that the F202Y proposal will deliver, as @tclune mentioned above. We can consider metaprogramming for Fortran, but I think we should do it after delivering strongly typed generics ("strong concepts"), and building it on top.

tkoenig1 commented 1 year ago

I do strongly disagree with the statement that the proposed syntax comes across as "more like procedural programming".

Whatever is decided, one can only hope for Fortran the generic SWAP will not require IF statements like so, or any decision flow semantics using any switches (select type, etc.) whatever:

Good thing that this is not required in the proposal, then.

The code was meant as an example of what could be done, not what should be done. (And xor swap is a bit silly, anyway, as you may know :-)

Additionally the original proposal does not address the canonical challenge when it comes to generic SWAP in Fortran - what about the case when the received arguments a, b have ALLOCATABLE / POINTER attributes?

Standard Fortran syntax would apply.

tkoenig1 commented 1 year ago

One of our most important audiences for the proposed features is that of library developers, as opposed to end users. I.e., we are trying to make certain that the hard cases can be handled and in a very robust manner.

That sounds like a very reasonable goal (although I would not like to lose sight of the end user).

To that end, we are heavily leveraging the expertise of an outside expert that has explained to us the pitfalls of approaches that have been used in other languages. In particular, "strong requirements" are seen as ensuring that robust templates can be developed with confidence.

Are there resources available about this external consultant found, and what pitfalls to avoid? I would dearly like to read that (and it is hard to argue about something one does not have access to :-)

everythingfunctional commented 1 year ago

Are there resources available about this external consultant found,

https://j3-fortran.org/doc/year/19/19-188.pdf

tclune commented 1 year ago

For reference, here is what a SWAP template would like like with the currently planned generic syntax for a specified rank. Separate work may allow us to use assumed rank, but for now assume that is not usable in this context.

Note 1: A higher level template that uses the one below to overload for all ranks 0-15 for a type might be sensible in some contexts.

Note 2: This version does not handle POINTER/ALLOCATABLE objects in an efficient manner - requires too much copying. Better would be variants that swap pointers in the POINTER case and use MOVE_ALLOC in the ALLOCATABLE case. Fortran disambiguation unfortunately does not let us use the same procedure name for all 3 cases within the same scope.


MODULE M
   TEMPLATE  SWAP_T(T, N)
      TYPE, DEFERRED :: T
      INTEGER, CONSTANT :: N
      ! Generic name to support overloading
      INTERFACE swap
           procedure :: swap_
      END INTERFACE
   CONTAINS

      SUBROUTINE swap_(a, b)
            TYPE(T), RANK(N), INTENT(INOUT), :: a, b
            TYPE(T) :: tmp

            tmp = a
            a = b
            b = tmp
      END SUBROUTINE swap_
   END TEMPLATE
END MODULE

PROGRAM MAIN
   USE MODULE m

   INSTANTIATE swap_T(integer, 3) 
   INSTANTIATE swap_T(real,0)
   INSTANTIATE swap_T(myType, 0)

   integer :: i1(5,7,8), i2(5,7,8)
   real :: x, y
   type :: myType
        ...
   end type
   type(myType) :: t1, t2

   ...
   call swap(i1,i2)
   call swap(x,y)
   call swap(t1, t2)

END PROGRAM

tkoenig1 commented 1 year ago

OK, I see the point about restrictions on types.

I have a few ideas of how this can be quite naturally added to the proposal. Give me a day or two.

tkoenig1 commented 1 year ago

Here is an example of the kind of syntax for type restrictions that my co-worker and myself would envisage. I have interspersed explanations with example code.

The main concept is that of a typeclass. Typeclasses impose requirements on both on templates and on users of a template. If the template does not satisfy a requirement, the template is rejected, hopefully with a good error message. If, when instantiating, a requirement is not met, there is also a helpful error message, like "required operator(+) not found in typeclass foo ..."

Because typeclasses just offer restrictions, it is possible to combine them in a type.

program main
  implicit none

  typeclass Equal(T)
    interface function operator (==) (a, b) result(equal)
      type(T), intent(in):: a, b
      logical:: equal 
    end interface
  end typeclass

This defines an typeclass with an interface function, which has to have the == operator defined, returning a LOGICAL. Instead of an operator, this could also be a function or a subroutine.

  typeclass HasIntrinsicFirst(T)
    requirement
      type(T):: tt
      component(tt%first)
      requires (is_intrinsic(tt%first))
    end requirement
  end typeclass

A typeclass which prescribes that each instance of T has a component tt%first. Also, the function is_intrinsic needs to evaluate to .true. in any instantiation of T. This is a also a generic function, which can be either user-specified or, in this case, perferably be an intrinsic function.

  type MyType
  instance Equal
    procedure :: equal => equal_mytype
  end instance    

  intance(HasIntrinsicFirst)
  integer :: first, second
  end type

A type that fulfills the restriction of both MyType and HasIntrinsicFirst. The procedure for equality comparision is specified below.

contains
  function equal_mytype(a, b) result(r)
    type(MyType), intent(in):: a, b
    logical:: r
    r = a%first == b%first .and. a%second == b%second
  end function

The function to compare for equality for MyType.

  ! Implement the unix uniq-command for an array - remove adjacent duplicates
  generic uniq(T)
    typeclass(Equal):: T ! in the generic, we can only rely on the functions
                         ! provided by the typeclass Equal
    function_type:: uniq
    uniq = uniq_impl

A generic which works on Equal; it's an easy one with only one version.

  contains
    subroutine uniq_impl(a)
      type(T), dimension(:), intent(inout):: a
      integer:: i, j
      j = 1
      do i=1,size(a)
        if (.not. a(i) == a(j)) then ! This can be called here, as T has to be of 
                               ! typeclass Equal
          j = j+1
          a(j) = a(i)
        end if
      end do
    end subroutine
  end generic
end program

It is also possible to combine typeclasses in another typeclasses (just adding more restrictions).

The language should provide intrinsic typeclasses which cover the intrinsic types. Examples could be strictly_ordered (for characters or integers, where .not. (a < b) always implies a >=b), weakly_ordered for reals; both would require a full set of comparison operators.

Intrinsic functions should also include something like is_pointer and is_allocatable.

everythingfunctional commented 1 year ago

@tkoenig1, some of the concepts in your proposal are good, if not the exact syntax and design, and we may be able to incorporate them into future updates to our current plan. You can see where we're at currently by reading through the papers at the J3 site. You can find the early drafts to know what to look for at the subgroups Github site here.

tclune commented 1 year ago

@tkoenig1 Some aspects of your proposal align fairly naturally with our existing REQUIREMENT and REQUIRES features. These can handle cases where the type must satisfy certain interfaces (e.g., T==T returns logical).

One important item on our list which is similar to some of the capabilities you describe would be to declare type T to be extensible. Currently our templates cannot do CLASS(T) because if T turns out to be an intrinsic, the template would not work. Our outside expert advocates that rather than decorating the template type parameter with some attribute that would restrict it to the non-intrinsic case, to instead decorate the type on the INSTANTIATE side. This would require the compiler to construct the appropriate wrapper that would work within the template and the appropriate accessors to hide this from the user. It get's messy, and I doubt we would ever get it done that way. I see adding a new attribute as a more practical path even if it does step away from full generality.

When you start getting into requirements on specific components and type-bound procedures of a type, you really end up restricting the generality of a template. My favorite in this category is that you want to design a solver that works for any type that has an eval() method that returns a real. Great. Now, your colleague has a type that spells it as evaluate() and cant use your template. And another user hascompute()`. This is even worse with data components, I think.

Maybe we'll allow something like that in subsequent revisions. We'll certainly look into it. But for now, we just expect users to write a wrapper procedure for their TBPs, and our existing template mechanism should handle things. Any such wrapper would suffice for any other template that also needs access to that TBP, so there would not be the problem of writing another set of wrappers each time one encountered another template.

One can readily make the argument that our existing design does not allow a lot of existing code to be readily incorporated into templates. OTOH, one could argue that we are forcing coders to make some important design decisions and write templates that can inherently be used in a wider context. Albeit with a bit more effort on the part of the client code.

certik commented 1 year ago

@tkoenig1 with the restrictions in as you just proposed this is now close enough to the existing proposals that I recommend you meet with @tclune and @everythingfunctional directly by joining their biweekly calls. The differences are now a lot of little details, @tclune mentioned some, and there is more. The big one is metaprogramming that your proposal has, but the existing ones do not, but as I wrote above, I recommend to build metaprogramming on top of generics.

Overall, this is great progress, thanks @tkoenig1 for all these ideas. Furthermore, LFortran master has a preliminary implementation of generics, so you can use it to play with them, see the template_*.f90 series of tests here: https://github.com/lfortran/lfortran/blob/01b73167d83002c7851b511ceaa54104870bc5d4/integration_tests/template_add.f90, they all fully compile and run.

tclune commented 1 year ago

We always welcome more voices in the meetings, but ... be aware that at this point we are dealing with nitty gritty syntax. And this is quite boring.

Also, any attempt to introduce substantial new features or redesign the overall approach will risk that this feature will not make it into the 202Y standard. Granted that we are moving faster now, but it took a long time to get the existing specification papers passed. Subgroup is generally focused on marshaling our existing approach through the process with an eye towards extensions/relaxations in the subsequent standard. (203Z?)

certik commented 1 year ago

@tclune Yes, definitely do not get derailed! The ball will be in @tkoenig1's court to help and see how his ideas can be incorporated and he can meet with @everythingfunctional and others separately to make progress. For sure the main focus of your meetings should be to make progress on your existing path.

tkoenig1 commented 1 year ago

When you start getting into requirements on specific components and type-bound procedures of a type, you really end up restricting the generality of a template. My favorite in this category is that you want to design a solver that works for any type that has an eval() method that returns a real. Great. Now, your colleague has a type that spells it as evaluate() and cant use your template. And another user hascompute()`. This is even worse with data components, I think.

For that, I have a proposal: Allow renaming on instantiation. There is precedence in the language for this in the USE statements, and in template instantiation itself. Syntax is illustrative and in the spirit of the current papers.

   REQUIREMENT R(T)
      TYPE, DEFERRED :: T
      REAL :: T%X
   END REQUIREMENT

  TEMPLATE

could be fulfilled by something like (template itself elided)

  TYPE FOO
    REAL A
    REAL B
 END TYPE FOO

  INSTANTIATE (TEMPL(FOO), FOO%A => FOO%X) => TEMPL_FOO

plevold commented 1 year ago

I like this proposal. That shouldn't be a surprise as I've previously given similar but by no means as detailed feedback on the current proposal.

Personally I like the convention seen in other programming languages of using a separate pair of brackets for generic arguments better than using regular parentheses and special keywords as it clearly highlights what is known at compile time and what must be handled at runtime. Take C++ variants for example:

if (std::holds_alternative<Foo>(v)) {
    // ...
}

Using <Foo> makes it easy to spot the template instantiation which is performed at compiler time and not runtime. < and > is of course most common and adopting this in Fortran as well would provide some familiarity for experienced programmers that is new to Fortran. Go use [ and ] which also could be an option.

The problem with that original notion is that it does not play well with Fortran type-bound procedures, and that is something that subgroup wanted to permit. And as it is, we still had to limit some important ways in which TBP's cannot be used.

@tclune can you please elaborate? Because C++ literally handles this with no problems at all.

When you start getting into requirements on specific components and type-bound procedures of a type, you really end up restricting the generality of a template. My favorite in this category is that you want to design a solver that works for any type that has an eval() method that returns a real. Great. Now, your colleague has a type that spells it as evaluate() and cant use your template. And another user hascompute()`. This is even worse with data components, I think.

This should be the left to be handled by a interface or trait system. To support an interface (or trait) with a procedure eval() for a type which names it evaluate() one would simply implement an eval() procedure which forwards to evaluate(). This can be trivially inlined by the compiler afterwards.

We always welcome more voices in the meetings, but ... be aware that at this point we are dealing with nitty gritty syntax. And this is quite boring.

Also, any attempt to introduce substantial new features or redesign the overall approach will risk that this feature will not make it into the 202Y standard. Granted that we are moving faster now, but it took a long time to get the existing specification papers passed. Subgroup is generally focused on marshaling our existing approach through the process with an eye towards extensions/relaxations in the subsequent standard. (203Z?)

I interpret this as that the working group don't really want to make changes to their design. If so, why bother asking for feedback from the community at all?

If you really want to listen to listen to the community before implementing new language features you need to be prepared to make changes. At least I don't see any way to make the working group proposal compatible with alternatives like this.

certik commented 1 year ago

I interpret this as that the working group don't really want to make changes to their design. If so, why bother asking for feedback from the community at all?

The working group has to make changes if the design is not good enough. I will personally facilitate that. However, it is also important to not block the effort and risk of not delivering anything (unless the current design is so bad that it should not be done at all). So the best way forward is for you @plevold to keep pushing on this better design (in your opinion). Keep organizing the discussion and please invite me too. I can't help if I am not invited. :) This proposal by @tkoenig1 shows what it takes to make progress, one has to make detailed well thought out proposal, and then one has to meet and discuss with others. It's a lot of work. Everybody is doing what they can, given their little time. Conclusion: we want criticism. Ideally such criticism that can be turned into a better proposal than what is currently there.

tclune commented 1 year ago

First - my apologies for coming across as defensive. Lots of pressure coming at me from different directions in my day job, and it is often difficult to respond to suggestions about generics without spending significant time. There are many valid approaches and the existing papers do not go as far as I would prefer.

Earlier feedback is easier to incorporate. Concrete template examples implemented with "suggested" syntax have appeared on our GitHub site as well as in earlier J3 papers. We did receive some feedback on those examples, and revised the planned syntax accordingly. With the current syntax paper, we are struggling to even be sure that it says what we meant, and making nontrivial changes is quite difficult at least for the moment. But of course making such changes will happen if we collectively decide that the current path is inadequate. And even if the current syntax paper passes, it is not set in stone.

If subgroup can be convinced that a change is worth incorporating, we can go back with a revised syntax paper. I lack wisdom, but I would think we have 2-3 meeting in which we can let the proverbial dust settle. At some point we must start on the edits papers and if the syntax is not established and well accepted, such papers will be a waste of time. And ... if we can't keep on that schedule, then the feature will not make it into F202Y.

With regard to using variant brackets instead of parens: I'm not strongly opposed, but I'm certain that there would be fierce resistance by the committee. First and foremost, with our suggested syntax there is no potential confusion with existing uses of parens, so there is not any ambiguity that would generally drive us in the direction of using angle brackets or similar. This is as opposed to coarrays where introducing square brackets directly addresses ambiguity in the case of scalars and whole arrays and helps with clarity even in the case of slices and array element references on other images.

In a future revision, we intend to go back and identify ways to simplify the use of generics in simple/common cases. It could be that some sort of "inline" analog of INSTANTIATE would arise in that analysis, and at that point the use of angle brackets (ala STL) would have a much stronger argument.

tclune commented 1 year ago

@tclune can you please elaborate? Because C++ literally handles this with no problems at all.

Not entirely sure what I was intending there. Possibly just the subsequent paragraph about lack of generality. But maybe I was thinking about this issue: Basically a C++ class includes the procedures that are its methods. In Fortran a derived type has a reference to procedures but the procedures themselves are not part of the derived type. This means that if we put parameters on a derived type, getting those parameters to be "seen" in the implementation of the procedure is problematic. This is already a major weakness of existing Fortran parameterized types.

But to be fair, subgroup is mostly focused on the lack-of-generality issue.

This should be the left to be handled by a interface or trait system. To support an interface (or trait) with a procedure eval() for a type which names it evaluate() one would simply implement an eval() procedure which forwards to evaluate(). This can be trivially inlined by the compiler afterwards.

The first problem with this approach is that one may have to write multiple such wrappers and in the case of a name conflict may even end up with an interface that can't be disambiguated from an existing use case. But the larger problem is that one often does not have the ability to modify the type which is intended to be passed to a template. (Either permissions issues or even because the type is provided as part of a library.) Such issues prevent introducing new TBPs to fit the required template interface, but do not at all prevent writing simple procedure wrappers.

Again we intend to look at relaxing the rules for TPBs and components in subsequent revisions. But we are fairly committed to not allowing it in the first release.

tclune commented 1 year ago

I realized after I submitted the last comment, that type-extension could probably handle most where the source code for a type is not available for modification.

I think there are some remaining, much rarer, cases that will limit generality of templates that require types to have a particular TBP, and if I have time later I'll try to add that to this thread.

But the short answer remains that subgroup wants to first see how much of a problem this will be with the currently planned syntax. It is much easier to relax the constraints later than to forbid something that was once permitted.

certik commented 1 year ago

Thank you @tclune for leading the generics, you've been doing a great job in making steady progress. Thanks for your comments above, I agree with them. @plevold if you want to meet, I am happy to get you up to speed on the various issues that I understand, and we can make progress about how to incorporate your ideas as well. The ball is in your court to organize it.

plevold commented 1 year ago

Not entirely sure what I was intending there. Possibly just the subsequent paragraph about lack of generality. But maybe I was thinking about this issue: Basically a C++ class includes the procedures that are its methods. In Fortran a derived type has a reference to procedures but the procedures themselves are not part of the derived type. This means that if we put parameters on a derived type, getting those parameters to be "seen" in the implementation of the procedure is problematic. This is already a major weakness of existing Fortran parameterized types.

I don't understand your argument. In C++ one would define a method inside the class/struct scope, but in Fortran it is in the accompanying module scope. For example, in C++ you'd do

template<typename T>
class Foo {
    T val;

    void bar() {
        // Implementation here
    }
};

but in Fortran you'd do

! Some syntax for declaring a generic type with argument T here
type foo_t
    type(T) :: val
contains
    procedure :: bar
end type

contains

subroutine bar(this)
    class(foo_t), intent(inout) :: this

    ! Implementation here
end subroutine

And the problem is then how do you tell bar that foo_t has a generic parameter T? Isn't this just a matter of defining the proper rules and/or syntax? The most simplistic - though maybe confusing for a reader - would be to implicitly pass them to the TBP. Another way is to specify T for the this argument which opens up the possibility of adding further requirements to a generic parameter for a specific TBP. This is similar to what Rust does where generic parameters are repeated on each impl block which is separated from the struct / enum declaration.

I spoke virtually to @everythingfunctional 3 months ago. Back then my feedback were dismissed based on the claim that generic TBPs would not work in Fortran because TBPs receive the type as a runtime polymorphic argument (class(foo_t)). Due to inheritance it would not be possible to know how to access a type's fields or other TBPs. As I demonstrate here however, C++ is able to dispatch at runtime to the correct virtual method when invoked inside a generic method. As far as I can tell, this should be directly comparable to the situation in Fortran.

I find it somewhat disappointing that feedback still is turned down based on the claim that generics and TBPs won't work, but with no convincing explanation.

Again we intend to look at relaxing the rules for TPBs and components in subsequent revisions. But we are fairly committed to not allowing it in the first release.

Not expanding the current scope for the first revision is absolutely fair, but I firmly believe it will be very difficult to incorporate proper generic types and generic TBPs in future revisions due to the use of the template scope and instantiate statements. I explored this in my previously referenced at fortran-lang and in an example here (sorry about the limited use of comments here, it was intended as an example for the virtual meeting).

This is the core of my criticism of the current proposal. If somebody could make a compelling argument that the current proposal could lead to a generics mechanism in Fortran which is as powerful as the one seen in e.g. Rust then I'd be happy to see it incorporated in the language.

You make an interesting point in comment https://github.com/j3-fortran/fortran_proposals/issues/293#issuecomment-1419675572 about that the current proposal is essentially a generic module. I very much agree with this analysis and I think pursuing it as exactly that would make much more sense than introducing the keywords template and instantiate. For instance, the example given here - https://github.com/lfortran/lfortran/blob/01b73167d83002c7851b511ceaa54104870bc5d4/integration_tests/template_add.f90 - could rather be something along these lines:

module add_m(T, F)         !<-- Generic arguments at the module scope
    implicit none
    private
    public :: add_t

    requirement R(T, F) 
        type :: T; end type
        function F(x, y) result(z)
            type(T), intent(in) :: x, y
            type(T) :: z
        end function
    end requirement

 contains

   function add_generic(x, y) result(z)
        type(T), intent(in) :: x, y
        type(T) :: z
        z = F(x, y)
    end function
end module

program generic_add
    use add_m(real, func_arg_real), only add_real => add_generic  !<-- Instantiation of generic module in use statement
    use add_m(integer, func_add_int), only: add_integer => add_generic

    real :: x, y
    integer :: a, b
    x = 5.1
    y = 7.2
    print*, "The result is ", add_real(x, y)
    if (abs(add_real(x, y) - 12.3) > 1e-5) error stop

    a = 5
    b = 9
    print*, "The result is ", add_integer(a, b)
    if (add_integer(a, b) /= 14) error stop
contains
    real function func_arg_real(x, y) result(z)
        real, intent(in) :: x, y
        z = x + y
    end function

    integer function func_arg_int(x, y) result(z)
        integer, intent(in) :: x, y
        z = x + y
    end function
end program

I think this would to a much larger extent keep the door open for future expansion to generic types, generic procedures and generic TBPs, for example like the proposal here made by @tkoenig1.

Note that I think the example syntax I used above with generic arguments in parentheses with potentially cause ambiguities if used for generic subroutines (they can have no runtime arguments) and for generic types (it would clash with parametric types).

@plevold if you want to meet, I am happy to get you up to speed on the various issues that I understand, and we can make progress about how to incorporate your ideas as well. The ball is in your court to organize it.

@certik honestly I'd be reluctant to set up yet another meeting. I'm starting to feel that I've already wasted enough time on this topic with no one in the working group being interested in listening.

tclune commented 1 year ago

Just back from travel, so don't have time for a thorough response. But to your first point:

! Some syntax for declaring a generic type with argument T here

type foo_t
    type(T) :: val
contains
    procedure :: bar
end type

contains

subroutine bar(this)
    class(foo_t), intent(inout) :: this

    ! Implementation here
end subroutine

You may (often!) want the procedure bar to also use type(T) and that is the issue: T is not available in the scope of bar().

Note, I'm not claiming that something could not be made to work in that direction. I'm just saying that it is nontrivial from a scoping perspective.

plevold commented 1 year ago

@tclune please take a look at how Rust does this (reference in my previous post). A Fortran equivalent would be to declare bar as follows (random generic syntax using <> assumed):

subroutine bar<T>(this)
    class(foo_t<T>), intent(inout) :: this
end subroutine

tclune commented 1 year ago

Yes - as I said something along those lines is possible, but there are several more steps. You need to complete the link between the TPB procedure in foo_t and this procedure. In particular, you would need to specify which parameter will be passed as the <T> in your latest declaration of bar().

plevold commented 1 year ago

My previous example was somewhat inaccurate. <T> after the subroutine name would indicate a generic function in Rust, not a regular function on a generic type. This would be more analogous:

subroutine bar(this)
    class(foo_t<T>), intent(inout) :: this
end subroutine

How exactly one choose to implement this however irrelevant. It is a problem solved long time ago by C++, Java, Go, Rust and others. My point is simply this: Please don't go ahead with template and instantiate so that we can have nice things like this in future revisions!

everythingfunctional commented 1 year ago

@plevold , if it seems subgroup has been ignoring you, I apologise. We don't intend to. We are all volunteers, some of us paying for the privilege, and doing our best.

As for your suggested implementation, @tclune has already demonstrated, to an extent, the difficulties associated with it; namely that the template parameters are not in scope elsewhere. But lets explore your examples.

template<typename T>
class Foo {
    T val;

    void bar() {
        // Implementation here
    }
};

This is actually feasible with our current design. I.e.

template foo_tmpl(T)
  type, deferred :: T

  type :: foo
    type(T) :: val
  contains
    procedure :: bar
  end type
contains
  subroutine bar(self)
     class(foo) :: self
     ! implementation here
  end subroutine
end template

I admit it takes more lines, but it is equivalent. A thing you've asked for previously would be more like:

template<typename T>
class Foo {
    T val;

    template<typename U>
    U bar() {
        // Implementation here
    }
};

...

Foo<int> a;
float b = a.bar<float>(); // I wonder if this will work, or how ugly the error message will be

There are two issues with this example. First, this is two templates, instantiated in two separate places, one after the other. This in and of itself is not a problem for our design. The other, which our design actually exposes, is that it is not specified what kind of relationship is required between T and U; that is dictated by the // implementation here and left unsaid. Let's look at the equivalent example in our design.

template foo_tmpl(T)
  type, deferred :: T

  type :: foo
    type(T) :: val
  end type

  template bar_tmpl(U, t_to_u)
    type, deferred :: U
    interface
      type(U) function t_to_u(x)
        type(T), intent(in) :: x
      end function
    end interface
  contains
    type(U) function bar(the_foo)
      class(foo) :: the_foo
      ! implementation here, but about the only thing it can do is bar = t_to_u(the_foo%val)
    end function
  end template
end template

...

instantiate foo_tmpl(integer), only: foo, bar_tmpl
instantiate bar_tmpl(real, real), only: bar ! Note the intrinsic function real can be used directly here

type(foo) :: a
real :: b

a = ...
b = bar(a)

Note the only shortcoming here is that you have to say bar(a) instead of a%bar(). I have suggested we could propose universal procedure call syntax to make the latter allowed syntactic sugar for the former. So it is again more verbose, but doable.

So far I have not seen a use case (realistic, "real world" problem) which cannot be solved by our design. If you can supply one we will happily explore ways of either making it work with our design or changing to a different design. "Language X can do it this way" is not sufficient. Sometimes different languages use different syntax or organisation for similar things.

tkoenig1 commented 1 year ago

There is one point I would like to raise - templates vs. metaprogramming.

C++ gets a lot of power from its Turing-complete metaprogramming features. They had templates, and they extended templates with metaprogramming, but the template feature had been designed without metaprogramming in mind, so they tacked it on.

The result is extremely powerful, but due to its add-on nature, full of quirks and gotchas, and code which exploits that feature to any large extent is beyond the average programmer to write, let alone to read or even debug. As language features go, this might reasonably be called a disaster.

Conceptually, metaprogramming and templates go hand in hand, so it is logical to have one solution for both. But C++ showed that it would be far better to have have an upgrade path from templates to template metaprogramming already even if templates only are introduced in a first revision.

And this is where we (my co-contributor and myself) are struggling a bit, to find a viable upgrade path from templates to template metaprogramming within what is specified in the current proposal - it would probably have to be added as separate syntax.

And we think that template metaprogramming is very much in use for Fortran already. People just use all sorts of non-portable preprocessors for it because the language does not offer anything in that regard.

So, questions:

Do people agree that that template metaprogramming would be a good thing to have in general?

Is the approach of using a Fortran-like syntax outlined in the original proposal something that people think is a good idea?

If so, is there interest in developing a syntax which would facilitate something along the lines of the original proposal, and which would offer the same functionality of what is currently in the scope for F202y?

(As a side remark, I have no fear that J3 will add incomprehensible features like C++ did, I just fear the result will be less good than it could have been).

tclune commented 1 year ago

For the moment our design is purposefully not permitting meta-programming. Ideally this means that if meta-programming is later added, it will be a separate/clean set of features. No guarantees of course.

Investigating use cases for meta-programming is probably a reasonable thing to add to the list of 203x features, though 203y is probably a bit more realistic, given that 203x will likely have a great deal of focus on relaxing some constraints on generics that will be in 202y. E.g., allowing something like CLASS(T) in a template, simpler syntax for simple use cases, providing a mechanism to put template subprograms elsewhere (analog to submodules), etc.

certik commented 1 year ago

@tkoenig1 this is an excellent comment and I agree with your line of thinking. I don't know the answer to your questions, though (I can see pros and cons). But at least I am happy that we arrived to this point, that we understand the landscape, that your proposal is templates + metaprogramming, while the design lead by @tclune is without metaprogramming.

FortranFan commented 1 year ago

@tkoenig1 asks :

Do people agree that that template metaprogramming would be a good thing to have in general?

In principle yes, but it appears a bridge too far in Fortran.

Note for me metaprogramming in Fortran will be of little to no interest unless it were to also offer the capability to move some computations from run-time to compile-time.

Consider a rather trivial case around unit-of-measure conversions that were used by Dos Reis and Stroutsrup to illustrate the use cases toward constexpr that eventually made into C++11 and where compile-time computation is performed:

#include <iostream>
using namespace std;

constexpr double best_func_ever(double x) {
   return x*1.8 + 32.;
}

int main() {
    constexpr const double foo = best_func_ever(-40.);
    constexpr const double bar = best_func_ever(100.);

    cout << foo << ", " << bar << endl;
    return 0;
}

C:\temp>gfortran c++.cpp -lstdc++ -o c++.exe

C:\temp>c++.exe
-40, 212

Note the authors write, "This paper presents a framework for generalizing the notion of constant expressions in modern system programming languages. It extends compile time evaluation to functions and variables of user-defined types, thereby including formerly ad hoc notions of Read Only Memory (ROM) objects into a general and type safe framework. It allows a programmer to specify that an operation must be evaluated at compile time. Furthermore, it provides more direct support for key meta programming and generative programming techniques."

Now, a detour: bear in mind constant expression functions are of great interest to the peers I work with in industry and constexpr facility in C++ is employed extensively in many aspects of scientific and technical computing performed by them. They have fed back to me many a times they would like to see CONSTEXPR FUNCTIONs in Fortran. When I had reported to them on the new feature of SIMPLE subprograms in Fortran 2023, the first thing I had heard back was why not CONSTEXPR. So why is this of interest: because in the actual practice of science and especially engineering, there are a lot of simple functions involving particular formulae that prove very efficient in calculations if they can be computed at compile-time rather than at run-time.

Since then I have been trying to see if the J3 committee will be open to CONSTEXPR functions as a worklist item for Fortran 202Y. The response was the subgroup "is opposed to this feature." Earlier on the same topic, the comments were, "JoR would like to see a compelling use case. Seems expensive to implement if initialization happens at compile time. Until JoR determines this is easier than we think to implement, this feature will remain in the Not Recommended catagory." What is clear to me is almost nothing is ever a compelling use case in Fortran, almost everything is too "expensive" and difficult.

Thus the question to ask is when something like a CONSTEXPR function is too expensive and difficult, how are the broader metaprogramming features going to be introduced in Fortran?

Mind you, Fortran has long allowed constant expressions so one can do:

..
   real, parameter :: foo = -40*1.8 + 32.0
   real, parameter :: bar = 100*1.8 + 32.0
..

but the language won't allow an avoidance of code duplication and will not permit:

   ..
   real, parameter :: foo = best_func_ever( -40. )
   real, parameter :: bar = best_func_ever( 100. )
   ..
contains
   ..
   constexpr function best_func_ever( x ) result(r)
      real, intent(in) :: x
      r = x*1.8 + 32.  !<-- compile-time computable
   end function

So again, if everything is expensive and difficult in Fortran, why bother with template metaprogramming when some of the key aspects that provide direct support for the same, as explained by Dos Reis and Stroutsrup, are beyond reach?