j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Unions #188

Open sblionel opened 3 years ago

sblionel commented 3 years ago

For many years, I have heard complaints from users about the lack of unions in Fortran. DEC added unions as part of its STRUCTURE/RECORD extension in 1985, but J3/WG5 never seriously considered adding unions as far as I know. Some considered unions, like EQUIVALENCE, a bad programming practice. In an ideal world, I would agree, but there are just too many places where non-Fortran APIs use unions (the Windows API for one), and the lack of a way to represent these in Fortran is considered a defect.

Given that a common use of unions is in C-friendly APIs, I want to propose unions in the context of C interoperability. Here is my idea:

Many compilers already support the DEC syntax, so this would be relatively easy to implement.

As a reminder, it would look something like this:

type, bind(C) :: union_type
  union
    map 
      integer :: I
    end map
    map
      real :: R
    end map
  end union
end type union_type
type(union_type) :: UR

One could then reference UR%I or UR%R which would share the same memory location. The DEC syntax didn't allow naming unions or maps - I am not sure there is a benefit in doing so.

certik commented 3 years ago

Thanks @sblionel for proposing this. That seems reasonable. Without this feature, what is the (current) way of interfacing a union in some C API?

sblionel commented 3 years ago

The most straightforward way is to use the extension, if your compiler supports it. Otherwise you have to create separate type definitions for the alternate layouts and use TRANSFER to reinterpret the storage - messy.
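
As a rough sketch of that TRANSFER workaround, assume a hypothetical C union of a 32-bit integer and a float, `union { int32_t i; float r; }`. The module, type, and function names below are made up for illustration, and the code assumes that `c_float` occupies the same 32 bits as `c_int32_t` on the processor in use:

```fortran
module union_transfer_sketch
  use, intrinsic :: iso_c_binding, only: c_int32_t, c_float
  implicit none

  ! One derived type per alternate layout of the hypothetical C union
  !   union { int32_t i; float r; };
  ! Assumption: both members occupy 32 bits on this processor.
  type, bind(C) :: as_int_t
    integer(c_int32_t) :: i
  end type as_int_t

  type, bind(C) :: as_real_t
    real(c_float) :: r
  end type as_real_t

contains

  ! Reinterpret the storage of the real layout as the integer layout.
  pure function int_view(u) result(v)
    type(as_real_t), intent(in) :: u
    type(as_int_t) :: v
    v = transfer(u, v)   ! v serves only as the MOLD here
  end function int_view

  ! Reinterpret the storage of the integer layout as the real layout.
  pure function real_view(u) result(v)
    type(as_int_t), intent(in) :: u
    type(as_real_t) :: v
    v = transfer(u, v)
  end function real_view

end module union_transfer_sketch
```

Every alternate layout needs its own type and its own pair of conversion functions, and each call copies the data rather than aliasing it, which is where the messiness comes from.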

The text about intrinsic assignment (10.2.1.3p14) will need some additional words on how to handle unions.

FortranFan commented 3 years ago

Feedback from peers, as well as requests by Fortranners on other online Fortran forums, does indeed indicate that adding a union-type facility to Fortran is a really good idea; some can argue it is long overdue.

But all such feedback also indicates that a UNION type is a first-class need in Fortran itself; it can be useful in simulations, etc. Therefore, the Fortran facility need not be in the context of interoperability with C only. Rather, the facility in Fortran should have additional characteristics when it comes to interoperability with C. One can see that with the CHARACTER type.

For example, UNION can perhaps be an attribute of a derived type in Fortran with certain restrictions: its components are all of intrinsic types and/or of other derived types with the UNION attribute, there are no ALLOCATABLE components, and no type-bound procedures or generic bindings are allowed, and so forth, similar to the restrictions on SEQUENCE types (but without any of the baggage of SEQUENCE types, for sure). The semantics of a type with a UNION attribute will be such that the STORAGE_SIZE of an object of said type is sufficient to contain the largest of the type components.

But the BIND(C, ..) attribute brings additional semantics into effect for interoperability with a companion C processor: each component in a union type needs to be interoperable, and the storage aspects then follow those of the companion processor, e.g., with respect to any padding considerations.

Regardless, the DEC Fortran extension that includes MAP sections looks not only unnecessary but also not modern for standard Fortran. It would be preferable if the Fortran standard excluded anything with MAP that is currently in DEC Fortran.

Also, it should be possible for a derived type to have a component that is of a derived type with a UNION attribute, regardless of whether the hosting derived type is a UNION type.

Thus a facility can look like so:

type, UNION :: some_union_type  !<-- see the attribute
   integer :: i
   real :: r
end type

type, UNION, BIND(C) :: some_interoperable_union_type
   integer(c_int) :: i
   real(c_float) :: r
end type

type, UNION, BIND(C) :: some_other_interoperable_type
   type(some_interoperable_union_type) :: foo
   character(kind=c_char,len=1) :: bar(N) !<-- where N is a named constant, or a constant expression, etc.
end type

sblionel commented 3 years ago

I'm sorry, @FortranFan , but I am having trouble understanding your suggestion. The DEC syntax provides much the same capability as C, allowing unions to be placed inside any type. It is very popular and well-understood. What is the use case of your version?

I suggested restricting it to interoperable types because that's where the most common use is, and it means you don't have to come up with additional rules for things such as allocatable/pointer/coarray components.

FortranFan commented 3 years ago

@sblionel wrote Nov. 9, 2020 7:43 PM EST:

I'm sorry, @FortranFan , but I am having trouble understanding your suggestion. The DEC syntax provides much the same capability as C, allowing unions to be placed inside any type. It is very popular and well-understood. What is the use case of your version? ..

The use case(s) I am referring to include all the situations in a pure Fortran context, with no involvement of a companion C processor, where Fortranners have a need for a union type.

The scenarios I am referring to can include something as simple as

   type, union :: foo_t
      double precision :: x
      character(kind=K, len=N) :: s
   end type

where such a type is consumed using a processor whose support for DOUBLE PRECISION conforms to the Fortran standard, but whose interoperability with a companion C processor is of no relevance to the program at hand. The same goes for the CHARACTER type, whose kind K has no relation to the 'C_CHAR' kind and where the program logic can depend on the length 'N' of the scalar component 's'.

As I wrote in my earlier post, there is a genuine need for a union type in Fortran which is not restricted in any way by the interoperability considerations with a companion C processor.

It's common in simulations (e.g., Physics simulations of particles) to work with "blobs" of data where

But I acknowledge also the importance of a union type in the context of interoperation with a companion C processor, a common scenario being one on Windows OS e.g., where a Fortran program needs to work with user input on a console using Microsoft Windows API ReadConsoleInput.

That's why I suggest the thought process and vision toward such a feature in Fortran be broader and that the development of any proposal strives to include both of the above needs in order to serve the best interests of Fortran practitioners.

sblionel commented 3 years ago

Given that EQUIVALENCE has been booted from the language, I'm doubtful that replacing it with a full-throated, distinct union type would get a warm reception. Nothing in my proposal precludes doing that later, but I'm focused on solving a particular problem in a way that I think would be acceptable to the committee - especially as many compilers already support it.

Type-casting is already possible in Fortran, albeit a bit clumsily, with C_LOC and C_F_POINTER. It seems to me that the use cases you mention can be handled by the proposal at hand.
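
A minimal sketch of that kind of type-casting, reading the storage of a `real(c_float)` through an `integer(c_int32_t)` pointer. The module and function names are hypothetical, and the code assumes both kinds occupy 32 bits. Note that the standard's description of C_F_POINTER expects the pointer's type to be interoperable with the type of the entity it is pointed at, so this reinterpretation leans on what processors do in practice rather than on guaranteed behaviour:

```fortran
module union_pointer_sketch
  use, intrinsic :: iso_c_binding, only: c_int32_t, c_float, c_loc, c_f_pointer
  implicit none

contains

  ! Store a real, then read the same storage through an integer pointer.
  ! Assumption: c_float and c_int32_t have the same size on this processor.
  function bits_of(x) result(bits)
    real(c_float), intent(in) :: x
    integer(c_int32_t) :: bits
    real(c_float), target :: storage
    integer(c_int32_t), pointer :: view

    storage = x
    ! Take the C address of the real and associate an integer pointer with it.
    call c_f_pointer(c_loc(storage), view)
    bits = view
  end function bits_of

end module union_pointer_sketch
```

Unlike TRANSFER, this aliases the storage instead of copying it, which is closer to what a union does, but it needs a TARGET variable and a pointer of each alternate type.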

RobertVanAmerongen commented 3 years ago

I read this proposal with pleasure. I am working with Vulkan, and the union feature is used in several places in its specs. Of course, we can work around the problem by using TRANSFER, but I find this clumsy.

I do not understand what is meant by "Fortran 202y Unsubmitted", but I will say: the sooner we have it, the better!

Robert