j3-fortran / fortran_proposals

Proposals for the Fortran Standard Committee

Create a testsuite for the standard #57

Open certik opened 4 years ago

certik commented 4 years ago

Currently each compiler must develop its own tests for every feature in the standard. One could imagine, down the road, that the committee maintains a "blessed" set of tests for each feature in the standard; if a compiler passes them, the feature can be considered "implemented".

One can then maintain an automatic matrix of features and compilers to see which compilers implement which features.

Such a testsuite for each feature can be a nice complement to the standard, showing for every corner case how a certain feature should behave.

This might seem like a lot of work to do from scratch, but we can start doing it for every new feature from now on. And eventually implement the tests for old features as time allows. Doing it for new features would not be as hard, since the committee spends a lot of time designing each feature carefully. So writing tests for it might even make the process easier.

FortranFan commented 4 years ago

@certik this will be tremendously helpful.

Do you think GitHub can be used for this?

And do you think it will be possible to get a "critical mass" of support from the Fortran committee to engage and contribute to such an effort? For example, to review and confirm the standard-conformance of each of the tests in a suite because such a "blessing" is what will validate the test suite.

certik commented 4 years ago

@FortranFan Yes, GitHub or GitLab can be used. Regarding convincing the committee: as we keep discussing and proposing ideas in this repository, I think it will become clear which ones are very popular, and then we should all work towards convincing the committee to do those.

klausler commented 4 years ago

A thorough suite of positive and negative tests for every feature, requirement, and constraint in the standard would be incredibly valuable to implementations, as well as to commercial customers requiring proof of conformance.

sblionel commented 4 years ago

For FORTRAN 77, there was an official ANSI test suite which vendors would pay the US government annually to come in, run, and certify the results. The suite was rather simplistic overall and never changed. ANSI/NIST discontinued it.

There have been several attempts to create a more comprehensive test suite for newer revisions. The Hendrickson/Spackman SHAPE95 suite was licensed by multiple vendors, and it was good, but work stopped on it. While I was at Intel I became aware of an Italian user, whose name escapes me, who appeared to be creating a useful test suite covering all the syntax, but maybe not the full semantics. I have no idea what happened to that as we stopped hearing from him (bug reports) after a while.

This is a far bigger task than you might imagine, and requires serious, ongoing resources. It is not something the Fortran committee, all volunteers taking time from their day jobs, can develop or even specify/manage. As a former vendor, I can tell you that vendors would be happy to pay for a good test suite with ongoing maintenance and support. But who is going to create it? These vendors (and I'll include freeware developers in this) all have their own test suites, largely composed of collected applications, unit and regression tests; but they are far from comprehensive, no matter how well-intentioned.

Writing a good test is much harder than you might think. Compiler developers are the wrong people to write tests, as they will test what they think the feature is supposed to do. For a while, Intel had a dedicated team, separate from the compiler group, writing tests based on the documentation; it worked very well and uncovered many bugs, but the team, funded by a different organization, was disbanded and we lost that resource.

The big problem as I see it is that the market for such a test suite is small compared to the cost of creating and maintaining it. The only way I could see this happening is if some government funded the effort, as they did in the F77 days, and I don't see that happening today, especially for Fortran.

So, yes, it would be fantastic if a good, modern Fortran test suite existed and was maintained. Who is going to do it?

certik commented 4 years ago

@sblionel Thanks a lot for your feedback, I really appreciate it. My approach is to first figure out ideas that the community would like to see. This seems to be one of the more popular ideas.

Once we have established that we want to get this done, we can shift the discussion into how to get it done. I can see several approaches, and there might be more ways to get it done:

So it might be possible to secure and facilitate the funding. The next question is who will do that. Again, there are several approaches here:

Somebody would need to assemble a team, lead this effort, and try to get it funded. This comment offers some ideas on how to do so.

FortranFan commented 4 years ago

@certik wrote:

.. My approach is to first figure out ideas that the community would like to see. ..

@certik, there might be some people like me who are restricted in terms of what they can do with GitHub (e.g., basic use via browser ok, but not much else) but who can contribute some unit tests for consideration toward inclusion in the "modern Fortran testsuite".

If it's possible for you or others on such a team on the "standard" testsuite to set up some kind of infrastructure for such a suite (perhaps on GitHub itself?) and some mechanism for community submissions which can then be reviewed for correctness/suitability, etc. particularly in terms of standard conformance (or lack thereof as intended failures) and approved for inclusion in the suite, that might be another resource this team can start drawing upon.

Over the years, I've assembled quite a few "informal" and small unit tests on a variety of standard features including OO facilities, parameterized derived types, UDDTIO, coarrays, interoperability with C, pure and elemental subprograms, submodules, block constructs, etc. which I can start to "clean up" and share with the team working on the "standard testsuite". And there might be other community members willing to contribute similarly.

To show an example, here's a case I've been grappling with for the last couple of weeks: I think the code is conforming per my reading of the current standard, however it fails with the 2 processors I tried, which makes me wonder whether the compilers have bugs in their implementations or whether I'm in the wrong. If it is the former, i.e., compiler bugs, then this might be a possible case for inclusion in the "modern Fortran testsuite":

#include <stdio.h>
#include <assert.h>
#include "ISO_Fortran_binding.h"

void Csub(const CFI_cdesc_t *, size_t);

void Csub(const CFI_cdesc_t * dv, size_t locd) {

   CFI_index_t lb[1];
   lb[0] = dv->dim[0].lower_bound;
   size_t ld = (size_t)CFI_address(dv, lb);

   printf("In C function: CFI_address of dv = %zx\n", ld);  /* %zx is the C99 conversion for size_t */
   assert( ld == locd );
   return;

}
! Unit Test #: Test-1.F2018-2.7.5
! Author     : FortranFan
! Reference  : The New Features of Fortran 2018, John Reid, August 2, 2018
!              ISO/IEC JTC1/SC22/WG5 N2161
! Description:
! Test item 2.7.5 Fortran subscripting
! void *CFI_address(const CFI_cdesc_t *dv, const CFI_index_t subscripts[]);
! that returns the C address of a scalar or of an element of an array using
! Fortran sub-scripting.
!

   use, intrinsic :: iso_c_binding, only: c_int, c_size_t, c_loc

   implicit none

   integer, parameter :: LB_A = -2
   integer, parameter :: UB_A = 1
   character(len=*), parameter :: fmtg = "(*(g0,1x))"
   character(len=*), parameter :: fmth = "(g0,1x,z0)"

   blk1: block

      interface
         subroutine Csub(a, loc_a_1) bind(C, name="Csub")
            import :: c_size_t
            type(*), intent(in) :: a(:)
            integer(c_size_t), intent(in), value :: loc_a_1
         end subroutine
      end interface

      integer(c_int), target :: a( LB_A:UB_A )
      integer(c_size_t) :: loc_a

      print fmtg, "Block 1"

      loc_a = transfer( c_loc(a(lbound(a,dim=1))), mold=loc_a )
      print fmth, "Address of a: ", loc_a

      call Csub(a, loc_a)

      print *

   end block blk1

   blk2: block

      interface
         subroutine Csub(a, loc_a_1) bind(C, name="Csub")
            import :: c_int, c_size_t
            integer(kind=c_int), allocatable, intent(in) :: a(:)
            integer(c_size_t), intent(in), value :: loc_a_1
         end subroutine
      end interface

      integer(c_int), allocatable, target :: a(:)
      integer(c_size_t) :: loc_a

      print fmtg, "Block 2"

      allocate( a( LB_A:UB_A ) )
      loc_a = transfer( c_loc(a(lbound(a,dim=1))), mold=loc_a )
      print fmth, "Address of a: ", loc_a

      call Csub(a, loc_a)

      print *

   end block blk2

end

Here are the compilation and linking steps with MinGW gfortran, along with the program execution that shows the failure:


C:\Temp>type c.c
#include <stdio.h>
#include <assert.h>
#include "ISO_Fortran_binding.h"

void Csub(const CFI_cdesc_t *, size_t);

void Csub(const CFI_cdesc_t * dv, size_t locd) {

   CFI_index_t lb[1];
   lb[0] = dv->dim[0].lower_bound;
   size_t ld = (size_t)CFI_address(dv, lb);

   printf("In C function: CFI_address of dv = %I64x\n", ld);
   assert( ld == locd );
   return;

}

C:\Temp>x86_64-w64-mingw32-gfortran.exe -c -Wall -Wextra c.c -o c.o

C:\Temp>type p.f90
! Unit Test #: Test-1.F2018-2.7.5
! Author     : FortranFan
! Reference  : The New Features of Fortran 2018, John Reid, August 2, 2018
!              ISO/IEC JTC1/SC22/WG5 N2161
! Description:
! Test item 2.7.5 Fortran subscripting
! void *CFI_address(const CFI_cdesc_t *dv, const CFI_index_t subscripts[]);
! that returns the C address of a scalar or of an element of an array using
! Fortran sub-scripting.
!

   use, intrinsic :: iso_c_binding, only: c_int, c_size_t, c_loc

   implicit none

   integer, parameter :: LB_A = -2
   integer, parameter :: UB_A = 1
   character(len=*), parameter :: fmtg = "(*(g0,1x))"
   character(len=*), parameter :: fmth = "(g0,1x,z0)"

   blk1: block

      interface
         subroutine Csub(a, loc_a_1) bind(C, name="Csub")
            import :: c_size_t
            type(*), intent(in) :: a(:)
            integer(c_size_t), intent(in), value :: loc_a_1
         end subroutine
      end interface

      integer(c_int), target :: a( LB_A:UB_A )
      integer(c_size_t) :: loc_a

      print fmtg, "Block 1"

      loc_a = transfer( c_loc(a(lbound(a,dim=1))), mold=loc_a )
      print fmth, "Address of a: ", loc_a

      call Csub(a, loc_a)

      print *

   end block blk1

   blk2: block

      interface
         subroutine Csub(a, loc_a_1) bind(C, name="Csub")
            import :: c_int, c_size_t
            integer(kind=c_int), allocatable, intent(in) :: a(:)
            integer(c_size_t), intent(in), value :: loc_a_1
         end subroutine
      end interface

      integer(c_int), allocatable, target :: a(:)
      integer(c_size_t) :: loc_a

      print fmtg, "Block 2"

      allocate( a( LB_A:UB_A ) )
      loc_a = transfer( c_loc(a(lbound(a,dim=1))), mold=loc_a )
      print fmth, "Address of a: ", loc_a

      call Csub(a, loc_a)

      print *

   end block blk2

end

C:\Temp>x86_64-w64-mingw32-gfortran.exe -c -std=f2018 -Wall -Wextra p.f90 -o p.o

C:\Temp>x86_64-w64-mingw32-gfortran.exe c.o p.o -o p.exe

C:\Temp>p.exe
Block 1
Address of a:  87FE10
In C function: CFI_address of dv = 87fe10

Block 2
Address of a:  A3930
In C function: CFI_address of dv = a3928
Assertion failed!

Program: C:\Temp\p.exe
File: c.c, Line 14

Expression: ld == locd

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

Program received signal SIGABRT: Process abort signal.

Backtrace for this error:
#0  0xffffffff
#1  0xffffffff
#2  0xffffffff
#3  0xffffffff
#4  0xffffffff

C:\Temp>

certik commented 4 years ago

@FortranFan I will set up the infrastructure very soon and I will update this issue when I do so.

Here is my plan so far:

The testsuite would be separate and independent of any compiler. It would contain metadata about what feature is being tested, which files, and any other useful info, and then it would also contain a (probably Python) library showing how to load all the tests and process them.
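To make the metadata idea a bit more concrete, here is a minimal sketch of what such a loader could look like. The JSON layout, the field names (`id`, `feature`, `standard`, `files`), and the names `ConformanceTest`, `load_manifest`, and `by_feature` are all assumptions made for illustration; nothing like this has been agreed on in this thread. The two test ids are the ones used in the examples posted above.

```python
import json
from dataclasses import dataclass

# Hypothetical manifest: one JSON record per test. The field names are
# assumptions for this sketch, not an agreed-upon format.
MANIFEST = """
[
  {"id": "Test-1.F2018-8.7", "feature": "IMPLICIT in BLOCK",
   "standard": "F2018", "files": ["p.f90"]},
  {"id": "Test-1.F2018-2.7.5", "feature": "CFI_address",
   "standard": "F2018", "files": ["p.f90", "c.c"]}
]
"""


@dataclass(frozen=True)
class ConformanceTest:
    id: str
    feature: str
    standard: str
    files: tuple


def load_manifest(manifest_text):
    """Parse the manifest text into ConformanceTest records."""
    return [
        ConformanceTest(r["id"], r["feature"], r["standard"], tuple(r["files"]))
        for r in json.loads(manifest_text)
    ]


def by_feature(cases):
    """Group test ids by the feature they exercise."""
    groups = {}
    for case in cases:
        groups.setdefault(case.feature, []).append(case.id)
    return groups


if __name__ == "__main__":
    for feature, ids in by_feature(load_manifest(MANIFEST)).items():
        print(feature, ids)
```

Keeping the metadata in a plain data file means the test sources themselves stay usable without any tooling; the loader is only a convenience for backends.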

There can be lots of backends, either as part of the test suite, or separate, for things like autogenerated CMake build system to test any compiler, or to generate a nice website with the coverage for each compiler, etc.

Then compilers, such as LFortran, can have scripts that take this and process the files and create an LFortran specific testsuite from it, for example I'd like to generate parts of this test file.

Other compilers could do something similar. For example for Flang the files in https://github.com/flang-compiler/f18/tree/fdb351ca2afb0d71028785da4687113343e11f54/test/semantics could be generated from such a testsuite. One downside is that the Flang testsuite has compiler-dependent checks for error messages, so it might not be possible to generate such tests from a compiler-independent test suite.

One worry that I have is that if the testsuite files change (for whatever reason), then when the Flang- or LFortran-specific testsuite gets regenerated, it will break (the test won't pass anymore). Or to formulate it in another way: to what extent can a compiler testsuite be "outsourced" to a compiler-independent testsuite? Can it even work? I don't have the answer.

If it cannot be done, then the answer is that each compiler still has to maintain its own testsuite, and the compiler independent "standard" testsuite is then used to automatically check what features are being implemented by each compiler, and also each compiler can at least take the standard testsuite as a starting point and then adapt it to its needs.

@klausler what are your thoughts on this?

klausler commented 4 years ago

I think that the tests themselves are what's important, not the test suite infrastructure around them, so I would avoid getting too tightly integrated with tools like Python that change incompatibly over the years. Test cases that can be run manually through a compiler without knowledge of infrastructure are the most useful.

In practical terms, I suggest making a distinction between three kinds of tests:

  1. Tests meant to compile and execute successfully if a particular requirement/constraint/feature is correctly and completely implemented.
  2. Tests meant to compile successfully but abort during execution as they violate some constraint or "shall" clause; execution to completion means that the implementation has failed the test.
  3. Tests meant to not compile successfully, with errors being indicated at/near some locations in their source code.

Tests in the first group should be as self-contained as possible and execute in such a way that it's easy to determine from the shell or other tools that they terminated happily; perhaps via STOP 'PASS'.

Tests in the second group should terminate with an expected error code at runtime; if they run to completion, they should indicate to the shell or other tools that they failed to detect a required error, perhaps via STOP 'FAIL'.

Tests in the third group should have their source code marked with comments describing expected error messages. Testing this group against a particular compiler may be best done by capturing error messages from that compiler, validating them manually, and saving them as "known good" compiler output, deviation from which may indicate errors later. If a compiler fails to catch an error and the program somehow survives to execution, it should crash with a message, perhaps via ERROR STOP 'FAIL'. It would be best if tests in this third group were as small as possible, eliciting a single error message, so that useful results could be obtained just by throwing these cases at a compiler and verifying that it refuses to compile each of them.
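As a rough illustration of how a harness could score the three groups above, here is a small sketch. It assumes (and this is only an assumption, since exit statuses are processor-dependent) that a normal STOP yields exit status 0, that an aborted run yields a nonzero status, and that the literal marker FAIL appears in the program output when a test reports failure. The names `Kind` and `verdict` are made up for this example.

```python
from enum import Enum


class Kind(Enum):
    RUN_OK = 1         # group 1: must compile and run successfully
    RUN_ABORT = 2      # group 2: must compile, then abort at run time
    COMPILE_ERROR = 3  # group 3: must be rejected by the compiler


def verdict(kind, compiled, exit_code=None, output=""):
    """Return True if the observed outcome is what the test expects.

    compiled  -- whether the compiler accepted the source
    exit_code -- process exit status, or None if the program was not run
    output    -- captured program output, searched for the FAIL marker
    """
    if kind is Kind.COMPILE_ERROR:
        # Any successful compilation is a failed test, whatever happens later.
        return not compiled
    if not compiled:
        return False
    if kind is Kind.RUN_OK:
        return exit_code == 0 and "FAIL" not in output
    # Kind.RUN_ABORT: the run must end abnormally without reporting FAIL.
    return exit_code not in (None, 0) and "FAIL" not in output


if __name__ == "__main__":
    print(verdict(Kind.RUN_OK, compiled=True, exit_code=0, output="PASS"))
    print(verdict(Kind.RUN_ABORT, compiled=True, exit_code=0, output="FAIL"))
```

The scoring only needs a compile status, an exit status, and captured output, so the same tests remain runnable by hand through any compiler.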

certik commented 4 years ago

@klausler thanks for the feedback. What would be an example in category 2? Things like sqrt(-1._dp), where the standard says "its value shall be greater than or equal to zero"? That's a great point.

What's your opinion on whether compilers like Flang could (in principle) use such a testsuite, or whether compilers will have to maintain their own testsuite anyway?

klausler commented 4 years ago

It's not either/or. I would love to have access to an authoritative suite of tests for the current revision of the standard, so that I wouldn't have to wade my way through all the ambiguity and imprecision in the text itself in an attempt to write tests for it. An authoritative test suite would reduce the size of f18's future test directories as the standard evolves; and if the committee had to develop or subcontract the development of such tests, perhaps ambiguity would be exposed earlier and fixed sooner. As is, we have access to many open- and closed-source test suites as well as applications, and depend heavily on all of them.

Oh, one last point that's really more important than it might seem: the Fortran standard updates section, requirement, and constraint numbers in incompatible ways with each revision. It's important for f18 compiler and test source code to refer frequently to the standard document. When Fortran 202x arrives, all of those textual citations are going to have to be updated if the standard continues to gratuitously renumber them. The C++ language standard has moved to using names rather than numbers for these referential purposes.

FortranFan commented 4 years ago

@klausler wrote:

I think that the tests themselves are what's important, not the test suite infrastructure around them

I would like to make it abundantly clear that when I used the phrase "set up some kind of infrastructure for" a test suite, I only meant something that can help community members submit cases - like the example I showed upthread - for consideration and which can then be reviewed by the team for suitability. It'll be useful to have some (basic) indexing scheme to reference the tests (perhaps something that follows a numbering scheme of features in the standard might help?) and preferably some description/comments/results summary to go along with these tests.
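One possible way to get a basic index is to read it straight from the comment header that the test cases in this thread already carry. The sketch below assumes the header keeps the `! Key : value` layout and that ids follow the `Test-<n>.F<year>-<clause>` pattern seen in these posts; `parse_header` and `index_key` are hypothetical helper names.

```python
import re

# Header copied from one of the test cases posted in this thread.
HEADER = """\
! Unit Test #: Test-1.F2018-8.7
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! Section 8.7 IMPLICIT statement in above pdf
"""

# "! Key : value" lines; the key may contain letters, spaces, and '#'.
FIELD = re.compile(r"^!\s*([A-Za-z #]+?)\s*:\s*(.*)$")
# Assumed id pattern: Test-<n>.F<year>-<clause>
TEST_ID = re.compile(r"^Test-(\d+)\.(F\d{4})-([\d.]+)$")


def parse_header(source):
    """Collect 'Key : value' pairs from the leading comment block."""
    fields = {}
    for line in source.splitlines():
        if not line.startswith("!"):
            break  # the header ends at the first non-comment line
        match = FIELD.match(line)
        if match and match.group(2):
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def index_key(test_id):
    """Sort key: (standard, clause number as a tuple of ints, test number)."""
    match = TEST_ID.match(test_id)
    if match is None:
        raise ValueError("unrecognized test id: " + test_id)
    number, standard, clause = match.groups()
    return standard, tuple(int(part) for part in clause.split(".")), int(number)


if __name__ == "__main__":
    fields = parse_header(HEADER)
    print(fields["Unit Test #"], index_key(fields["Unit Test #"]))
```

Because the clause number is tied to one revision of the standard, the key carries the revision (`F2018`) alongside it.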

klausler commented 4 years ago

Re: numbering of clauses, constraints, and requirements:

In Principles and rules for the structure and drafting of ISO and IEC documents

section 5.6 (page 8):

Consistency should be maintained within each document, and within a series of associated
documents.
• The structure of associated documents and the numbering of their clauses should, as far as
possible, be identical.

If the series of Fortran standards constitute a "series of associated documents", future revisions should avoid needless renumberings of their contents.

sblionel commented 4 years ago

Again, I'll note that the numbering is largely due to ISO rules for standards documents. These are why the chapter numbers changed by three this time around and notes don't have section numbers. Syntax rules and constraints have to be sequentially numbered and there's no feasible way to keep the numbering consistent without freezing the language, as both get additions and deletions over time. Clauses (chapters) are consistent across versions, subject to ISO constraints, and added clauses (for example, C interoperability.)

I have sympathy for code and documentation (including error messages) that want to refer to specific sections of the standard, but I just don't see a way to get there other than qualifying these references with a specific standard and updating them once the implementation fully supports a revision. The DEC/Compaq/Intel compiler had many such references in error messages, but we ended up taking them out.

klausler commented 4 years ago

Syntax rules and constraints have to be sequentially numbered

Why? Because they're assigned by LaTeX and you don't want to change that, or is there an external requirement from INCITS or ISO that they be so?

sblionel commented 4 years ago

You'd need to ask Malcolm Cohen for details, but I think so. This would be ISO, not INCITS. I think it would be chaos if rules and constraints weren't sequential, and what do you do if you want to add new rules in between old ones? Constraints that apply to syntax rules appear in order of the rule (and come before those that don't apply to rules.)

To me, this is like requiring that words in the dictionary always appear on the same page number.

FortranFan commented 4 years ago

Here's a case for consideration toward a test suite for the standard:

! Unit Test #: Test-1.F2018-8.7
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! Section 8.7 IMPLICIT statement in above pdf
! c.f. page 116 NOTE 3:
! Implicit typing is not affected by BLOCK constructs
!

   integer :: x
   x = fn()
   print *, "x = ", x
   if ( x /= 42 ) then
      error stop "FAILURE: expected function return is 42."
   else
      stop "SUCCESS"
   end if
contains
   function fn() result(r)
      integer :: r
      block
         i = 42
      end block
      r = i
   end function
end

One processor I tried works as I expect with this test whereas another doesn't:

C:\Temp>type p.f90


! Unit Test #: Test-1.F2018-8.7
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! Section 8.7 IMPLICIT statement in above pdf
! c.f. page 116 NOTE 3:
! Implicit typing is not affected by BLOCK constructs
!

   integer :: x
   x = fn()
   print *, "x = ", x
   if ( x /= 42 ) then
      error stop "FAILURE: expected function return is 42."
   else
      stop "SUCCESS"
   end if
contains
   function fn() result(r)
      integer :: r
      block
         i = 42
      end block
      r = i
   end function
end


C:\Temp>gfortran p.f90 -o gnu-p.exe

C:\Temp>gnu-p.exe
 x =   -889191990
ERROR STOP FAILURE: expected function return is 42.

Error termination. Backtrace:

Could not print backtrace: libbacktrace could not find executable to open
#0  0xffffffff
#1  0xffffffff
#2  0xffffffff
#3  0xffffffff
#4  0xffffffff
#5  0xffffffff
#6  0xffffffff
#7  0xffffffff
#8  0xffffffff
#9  0xffffffff

C:\Temp>ifort p.f90 /exe:ifort-p.exe
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.5.281 Build 20190815
Copyright (C) 1985-2019 Intel Corporation.  All rights reserved.

Microsoft (R) Incremental Linker Version 14.24.28314.0
Copyright (C) Microsoft Corporation.  All rights reserved.

-out:ifort-p.exe
-subsystem:console
p.obj

C:\Temp>ifort-p.exe
 x =           42
SUCCESS

C:\Temp>

FortranFan commented 4 years ago

Here's a variant of the case shown in https://github.com/j3-fortran/fortran_proposals/issues/57#issuecomment-578310713 for consideration:

! Unit Test #: Test-2.F2018-8.7
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! Section 8.7 IMPLICIT statement in above pdf
! c.f. page 116 NOTE 3:
! Implicit typing is not affected by BLOCK constructs
!

   integer :: x
   x = fn()
   print *, "x = ", x
   if ( x == 42 ) then 
      error stop "Expected x is some arbitrary processor-dependent value, not 42."
   end if
contains
   function fn() result(r)
      integer :: r
      block
         integer :: i
         i = 42
      end block
      r = i
   end function
end

Both the processors I tried appear to get this case right:

C:\Temp>gfortran p.f90 -o gnu-p.exe

C:\Temp>gnu-p.exe
x = -889191990

C:\Temp>ifort p.f90 /exe:ifort-p.exe
Intel(R) Visual Fortran Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.0.5.281 Build 20190815
Copyright (C) 1985-2019 Intel Corporation. All rights reserved.

Microsoft (R) Incremental Linker Version 14.24.28314.0
Copyright (C) Microsoft Corporation. All rights reserved.

-out:ifort-p.exe
-subsystem:console
p.obj

C:\Temp>ifort-p.exe
x = 0

C:\Temp>

FortranFan commented 4 years ago

A recent discussion at comp.lang.fortran involving type-bound procedure for a defined assignment and the ELEMENTAL attribute refers to compiler issues. Here's a case which might be useful for a standard test suite - my take is 2 processors currently get this wrong.

module b_m
   type :: b_t
      integer :: i = 0
      logical :: defined_assignment = .false.
   contains
      procedure, pass(lhs) :: assign_b_t
      generic :: assignment(=) => assign_b_t
   end type
contains
   elemental subroutine assign_b_t( lhs, rhs )
      ! Argument list
      class(b_t), intent(inout) :: lhs
      class(b_t), intent(in)    :: rhs
      lhs%i = rhs%i
      lhs%defined_assignment = .true.
   end subroutine
end module

program case1

   use b_m, only : b_t

   type, extends(b_t) :: e_t
   end type

   type :: f_t
      type(e_t) :: e
   end type

   type(f_t) :: foo(2), bar(2)

   bar = foo
   print *, "bar(1)%e%defined_assignment = ", bar(1)%e%defined_assignment, "; expected value is T."
   if ( .not. bar(1)%e%defined_assignment ) error stop "Program did not work as expected."
   stop "SUCCESS"

end program case1

Upon execution of program compiled using gfortran, the run-time behavior is:

bar(1)%e%defined_assignment = F ; expected value is T.
ERROR STOP Program did not work as expected.

Error termination. Backtrace:

Could not print backtrace: libbacktrace could not find executable to open
#0  0xffffffff
#1  0xffffffff
#2  0xffffffff
#3  0xffffffff
#4  0xffffffff
#5  0xffffffff
#6  0xffffffff
#7  0xffffffff
#8  0xffffffff
#9  0xffffffff

FortranFan commented 4 years ago

Here's an even simpler scenario involving a derived type containing a component which is a derived type with a defined assignment that one processor appears to treat wrongly:

module b_m
   type :: b_t
      integer :: i = 0
      logical :: defined_assignment = .false.
   contains
      procedure, pass(lhs) :: assign_b_t
      generic :: assignment(=) => assign_b_t
   end type
contains
   elemental subroutine assign_b_t( lhs, rhs )
      ! Argument list
      class(b_t), intent(inout) :: lhs
      class(b_t), intent(in)    :: rhs
      lhs%i = rhs%i
      lhs%defined_assignment = .true.
   end subroutine
end module

program case2

   use b_m, only : b_t

   type :: c_t
      type(b_t) :: b
   end type

   type(c_t) :: foo(2), bar(2)

   bar = foo
   print *, "bar(1)%b%defined_assignment = ", bar(1)%b%defined_assignment, "; expected value is T."
   if ( .not. bar(1)%b%defined_assignment ) error stop "Program did not work as expected."
   stop "SUCCESS"

end program case2

The program output unexpectedly is as follows:

bar(1)%b%defined_assignment = F ; expected value is T.
Program did not work as expected.

rouson commented 4 years ago

I haven't read this entire thread, but I like the original idea and have been thinking for some time about repurposing the AdHoc repository to automate and democratize the generation of a standards-conformance table. Because AdHoc stores compiler bug reproducers, its build scripts are designed to continue building the remainder of the repository after a compile-time error occurs. This makes it feasible to generate a standards-conformance table based on a test suite that includes code with features not supported by the involved compilers. The build scripts could automatically generate a table in GitHub Markdown structured something like the following:

| | Compiler A | Compiler B |
|---|---|---|
| Feature A | Yes | No |
| Feature B | Partial | Yes |

Each entry would correspond to a unique subdirectory. Contributors would submit pull requests with tests that enable the build/test scripts to set the value of each entry according to the following rules:

For example, the following directory tree would yield the above table if the tests foo.f90, bar.f90, and foobartoo.f90 pass but foobar.f90 fails:

$ tree tests
tests/
├── compiler-a
│   ├── feature-a
│   │   └── foo.f90
│   └── feature-b
│       ├── bar.f90
│       └── foobar.f90
└── compiler-b
    ├── feature-a
    └── feature-b
        └── foobartoo.f90
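A minimal sketch of how such a table could be produced from per-test outcomes follows. The scoring rule used here (Yes if every test in the directory passes, Partial if only some pass, No if none pass or the directory is empty) is an assumption inferred from the example tree, and `status` and `markdown_table` are hypothetical names, not part of AdHoc.

```python
def status(results):
    """Map a list of pass/fail booleans to a table entry (assumed rule)."""
    if results and all(results):
        return "Yes"
    if any(results):
        return "Partial"
    return "No"  # no passing tests, or no tests at all


def markdown_table(outcomes, compilers, features):
    """Render a GitHub Markdown table with one row per feature."""
    rows = [
        "| | " + " | ".join(compilers) + " |",
        "|---|" + "---|" * len(compilers),
    ]
    for feature in features:
        cells = [status(outcomes.get((c, feature), [])) for c in compilers]
        rows.append("| " + feature + " | " + " | ".join(cells) + " |")
    return "\n".join(rows)


# Outcomes matching the example: foo.f90, bar.f90, and foobartoo.f90 pass,
# foobar.f90 fails, and compiler-b/feature-a contains no tests.
outcomes = {
    ("compiler-a", "feature-a"): [True],
    ("compiler-a", "feature-b"): [True, False],
    ("compiler-b", "feature-a"): [],
    ("compiler-b", "feature-b"): [True],
}

if __name__ == "__main__":
    print(markdown_table(outcomes,
                         ["compiler-a", "compiler-b"],
                         ["feature-a", "feature-b"]))
```

In a real setup the `outcomes` dictionary would be filled in by the build scripts as they walk the subdirectories; here it is written out by hand.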

For free compilers, the tests could be run at no cost using GitHub continuous integration features. For non-free compilers, another mechanism for running the tests might be required. The degree of comprehensiveness of the test suite would be determined by the community.

FortranFan commented 3 years ago

A case for consideration in a test suite for the Fortran standard, this one with finalization.

! Unit Test #: Test-1.F2018-7.5.6
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! Section 7.5.6.3 When finalization occurs in above pdf
! c.f. page 80 paragraph starting line 17:
! When finalization occurs
!

module m
   type :: t
      character(len=12) :: name = "default"
   contains
      final :: final_t
   end type
   interface t
      module procedure construct_t
   end interface
contains
   function construct_t( name ) result(r)
      character(len=*), intent(in), optional :: name
      type(t) :: r
      if ( present(name) ) r%name = name
   end function
   subroutine final_t( this )
      type(t), intent(inout) :: this
      print *, "final_t: this%name = ", this%name
      return
   end subroutine
   subroutine sub1()
      type(t), allocatable :: foo
      foo = t( name="constructor" )
      foo%name = "foo"
   end subroutine
   subroutine sub2()
      type(t), allocatable :: foo
      allocate( foo )
      foo = t( name="constructor" )
      foo%name = "foo"
   end subroutine
end module
   blk1: block
      use m, only : sub1
      print *, "Block 1: Two lines from final_t are expected"
      call sub1()
   end block blk1
   print *
   blk2: block
      use m, only : sub2
      print *, "Block 2: Three lines from final_t are expected"
      call sub2()
   end block blk2
end

Consider the program output using gcc version 10.0.1 (experimental):

C:\temp>gfortran -Wall -std=f2018 p.f90 -o p.exe

C:\temp>p.exe
Block 1: Two lines from final_t are expected
final_t: this%name = foo

Block 2: Three lines from final_t are expected
final_t: this%name = foo

Whereas the output I expect per the Fortran standard is this:

Block 1: Two lines from final_t are expected
final_t: this%name = constructor
final_t: this%name = foo

Block 2: Three lines from final_t are expected
final_t: this%name = default
final_t: this%name = constructor
final_t: this%name = foo

FortranFan commented 3 years ago

Here's a variant of the previous case; in this one the expr on the right-hand side references the variable on the LHS:

! Unit Test #: Test-2.F2018-7.5.6
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! Section 7.5.6.3 When finalization occurs in above pdf
! c.f. page 80 paragraph starting line 17:
! When finalization occurs
!

module m
   type :: t
      character(len=12) :: name = "default"
   contains
      private
      procedure, pass(lhs) :: add_t
      procedure, pass(this) :: clone_t
      generic, public :: operator(+) => add_t
      generic, public :: clone => clone_t
      final :: final_t
   end type
   interface t
      module procedure construct_t
   end interface
contains
   function construct_t( name ) result(r)
      character(len=*), intent(in), optional :: name
      type(t) :: r
      if ( present(name) ) r%name = name
   end function
   subroutine final_t( this )
      type(t), intent(inout) :: this
      print *, "final_t: this%name = ", this%name
      return
   end subroutine
   function add_t( lhs, rhs ) result(r)
      class(t), intent(in) :: lhs
      type(t), intent(in)  :: rhs
      type(t) :: r
      r%name = trim(lhs%name) // "+" // trim(rhs%name)
   end function
   function clone_t( this ) result(r)
      class(t), intent(in) :: this
      type(t) :: r
      r%name = trim(this%name) // "*"
   end function
   subroutine sub()
      type(t), allocatable :: foo
      foo = t( name="constructor" )
      print *, "1"
      foo%name = "foo"
      foo = foo%clone()
      print *, "2"
      foo = foo + foo
      print *, "3"
   end subroutine
end module
   use m, only : sub
   call sub()
end

gfortran program output is

1
2
3
final_t: this%name = foo+foo

The expected program output is

final_t: this%name = constructor
1
final_t: this%name = foo
final_t: this%name = foo
2
final_t: this%name = foo
final_t: this%name = foo+foo
3
final_t: this%name = foo+foo

certik commented 3 years ago

There is now a project idea for Google Summer of Code (GSoC) 2021 for Fortran-lang to create such a testsuite:

https://github.com/fortran-lang/fortran-lang.org/wiki/GSoC-2021-Project-ideas#standard-conformance-suite

Obviously it would initially be limited to what can be achieved over the summer, and then the community can contribute more tests over time. If anyone knows of an interested student, please direct them to that page to apply.

sigfig commented 3 years ago

as @certik suggested in #200, scalar expressions used in array shape and PDT declarations should probably be included in any conformance test suite, as the related language features are not well supported in any compiler. providing tests for this sort of language feature is pretty tricky because implementations will necessarily include lots of branching depending on the contents of the expression and the surrounding context. sanity-checking and codegen bugs can hide in, e.g., support for "specification functions", and they are very difficult to uncover with unit testing. property-based testing à la haskell's quickcheck is super helpful for this sort of thing, but it would make implementing a test suite a lot more complicated. i really have no idea what the ideal approach for testing these features should be, but it seems important to include them.
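As a rough illustration of what a property-style check could look like in plain Fortran, here is a minimal sketch. It is not QuickCheck: there is no shrinking or generator library, just random inputs checked against one property (an automatic array declared with extent E has size max(E, 0)). The names `prop_m`, `extent_of`, and `holds` are hypothetical and not taken from any existing test suite.

```fortran
! Hypothetical sketch; module and procedure names are illustrative only.
module prop_m
   implicit none
contains
   ! Pure module function, so it may be used as a specification function.
   pure function extent_of( a, b ) result(n)
      integer, intent(in) :: a, b
      integer :: n
      n = a*b - a
   end function
   ! Property under test: an automatic array declared with extent E
   ! has size max(E, 0), whatever the values of a and b.
   function holds( a, b ) result(ok)
      integer, intent(in) :: a, b
      logical :: ok
      real :: work( extent_of(a, b) )
      ok = ( size(work) == max( extent_of(a, b), 0 ) )
   end function
end module

program prop_test
   use prop_m, only : holds
   implicit none
   integer :: i, a, b
   real :: r(2)
   do i = 1, 1000
      call random_number( r )
      a = int( 21.0*r(1) ) - 10   ! roughly -10..10
      b = int( 21.0*r(2) ) - 10
      if ( .not. holds(a, b) ) then
         print *, "FAIL for a, b = ", a, b
         error stop 1
      end if
   end do
   print *, "PASS"
end program
```

A failing input is printed before the program stops, so it can be copied into a fixed regression case afterwards.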

certik commented 3 years ago

I think for array shapes and PDTs we simply have to include lots of cases and try to cover as many corner cases as we can. Then, as people report bugs in a specific compiler that are not caught by the test suite, we simply add those cases.
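A minimal sketch of the "lots of cases" approach might look like the following, assuming a simple convention where each case bumps a failure counter. The names `cases_m`, `twice`, and `check` are hypothetical. The list `ks` is where a newly reported corner case would be appended.

```fortran
! Hypothetical sketch; names and the failure-counter convention are assumptions.
module cases_m
   implicit none
contains
   ! Pure module function used as a specification function below.
   pure function twice( k ) result(n)
      integer, intent(in) :: k
      integer :: n
      n = 2*k
   end function
   subroutine check( k, nfail )
      integer, intent(in) :: k
      integer, intent(inout) :: nfail
      ! Automatic array with both bounds given by specification expressions
      integer :: a( -k : twice(k) )
      ! Automatic character length from a specification function
      character(len=twice(k)) :: s
      ! The extent is 3*k + 1; a non-positive extent gives a zero-size array
      if ( size(a) /= max( 3*k + 1, 0 ) ) nfail = nfail + 1
      ! LBOUND of a zero-size array is 1
      if ( lbound(a, 1) /= merge( -k, 1, 3*k + 1 > 0 ) ) nfail = nfail + 1
      ! A negative character length is treated as zero
      if ( len(s) /= max( twice(k), 0 ) ) nfail = nfail + 1
   end subroutine
end module

program corner_cases
   use cases_m, only : check
   implicit none
   ! Append new values here as compiler bugs are reported
   integer, parameter :: ks(*) = [ -1, 0, 1, 5 ]
   integer :: i, nfail
   nfail = 0
   do i = 1, size(ks)
      call check( ks(i), nfail )
   end do
   if ( nfail /= 0 ) error stop "FAIL"
   print *, "PASS"
end program
```

The value -1 exercises the zero-size array and zero-length character paths, which are the kind of corner cases that tend to be handled by separate branches in a compiler.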

FortranFan commented 3 years ago

The following "silly" case shows a memory leak on Windows with gfortran but not with Intel Fortran, according to an in-house proprietary memory-checker utility. I use this silly case as a quick test, followed by a host of other validation steps, when a team I work with needs to consider a newer version of the Intel Fortran compiler for production use.

Can someone please run Valgrind or a similar tool on a different platform (Linux/macOS) and report here whether the leak is reproducible? If it is, the community can consider evaluating this case for inclusion in a test suite for the standard.

! Unit Test #: Test-0.F2018-general
! Author     : FortranFan
! Reference  : https://j3-fortran.org/doc/year/18/18-007r1.pdf
!
! Description:
! A highly contrived but general test using various features of the
! Fortran standard including the use of ALLOCATABLE local objects,
! a derived type with a finalizer toward a component with the POINTER
! attribute, and enhanced interoperability with C to check for
! memory leaks
!

module cstring_m
   use, intrinsic :: iso_c_binding, only : c_size_t, c_ptr
   interface
      function strlen( ps ) result(slen) bind(C, name="strlen")
         import :: c_size_t, c_ptr
         type(c_ptr), intent(in), value :: ps
         integer(c_size_t) :: slen
      end function
   end interface
end module

module foo_m
   use, intrinsic :: iso_c_binding, only : c_char, c_size_t, c_ptr, c_f_pointer
   use cstring_m, only : strlen
   type :: foo_t
      private
      class(*), pointer :: d => null()
   contains
      final :: clean_foo
      procedure :: set, get
   end type
contains
   impure elemental subroutine clean_foo( this )
      type(foo_t), intent(inout) :: this
      if ( associated(this%d) ) then
         deallocate(this%d)
      end if
      this%d => null()
   end subroutine
   subroutine set( this, ps )
      class(foo_t), intent(inout) :: this
      type(c_ptr), intent(in), value :: ps
      integer(c_size_t) :: slen
      slen = strlen(ps)
      if ( slen <= 0 ) error stop
      block
         character(kind=c_char,len=slen), pointer :: cs
         character(kind=c_char,len=:), allocatable :: s
         call c_f_pointer( cptr=ps, fptr=cs )
         s = cs
         cs => null()
         call clean_foo( this )
         allocate( this%d, source=s )
      end block
   end subroutine
   function get( this ) result(r)
      class(foo_t), intent(in) :: this
      class(*), allocatable :: r
      if (.not. associated(this%d)) error stop
      allocate( r, source=this%d )
   end function
end module

module bar_m
   use, intrinsic :: iso_c_binding, only : c_char, c_loc
   use foo_m, only : foo_t
   type :: bar_t
      class(foo_t), allocatable :: foo
   end type
   interface bar_t
      module procedure :: construct_bar
   end interface
contains
   function construct_bar( msg ) result(r)
      character(kind=c_char, len=*), intent(in), target :: msg
      type(bar_t) :: r
      allocate( foo_t :: r%foo )
      call r%foo%set( c_loc(msg) )
   end function
end module

module foobar_m
   use, intrinsic :: iso_c_binding, only : c_char, c_size_t
   interface
#ifndef __GFORTRAN__
      ! gfortran doesn't yet adequately support enhanced interoperability with C
      subroutine getmsg_a( str ) bind(C, name="getmsg_a")
         import :: c_char
         character(kind=c_char, len=:), allocatable, intent(out) :: str
      end subroutine
#endif
      subroutine getmsg_p( s, lens ) bind(C, name="getmsg_p")
         import :: c_char, c_size_t
         character(kind=c_char, len=1), intent(inout) :: s(*)
         integer(c_size_t), intent(in), value :: lens
      end subroutine
   end interface
contains
   function msg() result(s)
      character(kind=c_char, len=:), allocatable :: s
#ifdef __GFORTRAN__
      ! Workaround due to gfortran issue with CHARACTER scalar and C interop
      allocate( character(kind=c_char, len=14) :: s )
      block
         integer(c_size_t) :: lens
         lens = int( len(s), kind=kind(lens) )
         call getmsg_p( s, lens )
      end block
#else
      call getmsg_a( s )
#endif
   end function
end module

   block
      use, intrinsic :: iso_c_binding, only : c_char
      use bar_m, only : bar_t
      use foobar_m, only : msg
      class(bar_t), allocatable :: bar
      class(*), allocatable :: c
      bar = bar_t( msg() )
      c = bar%foo%get()
      select type ( s => c )
         type is ( character(len=*) )
            print *, s
      end select
   end block
end

#include <stdlib.h>
#include <string.h>
#include <stdio.h>
#include "ISO_Fortran_binding.h"

const char msg[] = "Hello World!";

// Silly function for a Fortran test
void getmsg_a( CFI_cdesc_t *str ) {
   size_t lens = strlen(msg);
   int irc = CFI_allocate(str, (CFI_index_t *)0, (CFI_index_t *)0, lens);
   if (irc == 0) {
      memcpy(str->base_addr, msg, lens);
   }
}

// Silly function for a Fortran test
void getmsg_p( char *s, size_t lens  ) {
   memcpy(s, msg, lens);
}

Expected program behavior:

C:\temp>cl /c /W3 /EHsc c.c
Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29337 for x64
Copyright (C) Microsoft Corporation. All rights reserved.

c.c

C:\temp>ifort /c /standard-semantics /fpp /warn:all /stand:f18 p.f90
Intel(R) Fortran Intel(R) 64 Compiler Classic for applications running on Intel(R) 64, Version 2021.2.0 Build 20210228_000000
Copyright (C) 1985-2021 Intel Corporation. All rights reserved.

C:\temp>link p.obj c.obj /subsystem:console /out:p.exe
Microsoft (R) Incremental Linker Version 14.28.29337.0
Copyright (C) Microsoft Corporation. All rights reserved.

C:\temp>p.exe
Hello World!

C:\temp>gfortran -c -Wall c.c

C:\dev\Fortran\temp\sor>gfortran -c -Wall c.c

C:\dev\Fortran\temp\sor>gfortran -c -cpp -std=f2018 -Wall -Wno-maybe-uninitialized p.f90

C:\dev\Fortran\temp\sor>gfortran -o gcc-p.exe p.o c.o

C:\temp>gcc-p.exe
Hello World!