flang-compiler / flang

Flang is a Fortran language front-end designed for integration with LLVM.
Other
808 stars 137 forks source link

Memory allocation error due to `stat=` handling. #884

Open NigelYYB opened 4 years ago

NigelYYB commented 4 years ago

I'm working on a benchmark and recently encountered a runtime memory allocation failure that's triggered by this pull request https://github.com/flang-compiler/flang/commit/5d757c03171c91c10c4563301a22c9d4363d45dc

I'm new to FORTRAN and flang. My understanding is that the double allocation check might be wrong.

Function ENTF90(ALLOC04A, alloc04a) checks for double allocation by doing this:

  if (ISPRESENT(stat) && *firsttime && *stat != 2)

I added some extra printf to debug my benchmark error.

void
ENTF90(ALLOC04A, alloc04a)(__NELEM_T *nelem, __INT_T *kind, __INT_T *len,
                         __STAT_T *stat, char **pointer, __POINT_T *offset,
                         __INT_T *firsttime, __NELEM_T *align,
                         DCHAR(errmsg) DCLEN64(errmsg))
{
  ALLHDR();

  if (ISPRESENT(stat) && *firsttime && *stat != 2) {
    *stat = 0;
  } else {
    printf("double allocating %p, %d, %d, %d\n", 
      *pointer, 
      *firsttime, 
      ISPRESENT(stat) ? *stat : "100",  
      I8(__fort_allocated)(*pointer));

    fflush(stdout);
  }
....

Before my benchmark exited with error, it printed double allocating (nil), 1, 2, 0.. This is telling me that if we are first time allocating an empty **pointer, and the stat field happens to be 2, the runtime will consider this double allocation.

I'm no expert in FORTRAN. But this looks strange to me. Would it make more sense to replace *stat != 2 with !(*pointer && I8(__fort_allocated)(*pointer)) ?

Update April 8, 2020 This is reproduce-able on both Ubuntu 18 and Centos 7.

Here are some more lines from my debug print. This looks like stat can be 0, 2, or some seemingly garbage number -1376148457.

double allocating (nil), 0, 0, 0
..
double allocating 0x20010000, 0, -1376148457, 1
double allocating 0x1, 0, -1376148457, 1
double allocating (nil), 1, 2, 0

Update July 24, 2020:

I recently revisited this issue and found that not only using user garbage stat is problematic, the I8(__fort_allocated)(*pointer) query also seems to be broken..

This query is

/** \brief
 * Is array allocated?
 */
int
I8(__fort_allocated)(char *area)
{
  ALLO_HDR *p;

  ALLHDR();

  if (area) {
    return 1;
  }
  return 0;
}

The #define ALLHDR() seems to be mistakenly left out empty. And the whole query does nothing but to check if the pointer area is null or not. This means a garbage pointer (e.g. allocatable pointer passed as param) can be considered fort_allocated if it's happens to be non-zero.

I might have missed something. But I fail to see how flang runtime is able to remember the allocated memory addresses and perform double-allocation check.

NigelYYB commented 4 years ago

@gklimowicz Hey Gary, I'm new to this project and am not sure if this is even a valid problem. Would you please shed some light on this?

gklimowicz commented 4 years ago

@wangzpgi Can you take a look at this for us, please? This is related to some work you did, in particular to handle a separate failure in a test case we and.

gklimowicz commented 4 years ago

We believe we have a fix for this that hasn't been incorporated into Flang yet.

bryanpkc commented 4 years ago

@gklimowicz When will the next update be? It seems that activity on Flang has completely stopped since December.

gklimowicz commented 4 years ago

We're sorting out the testing requirements for pull request processing with the other committers. I anticipate we will be merging pull requests again before the end of the month.

h-vetinari commented 4 years ago

Sorry if this is a dumb question, but why not join in with the f18-turned-flang-within-llvm project? Seems like a pity to duplicate efforts like that.

NigelYYB commented 4 years ago

f18-turned-flang-within-llvm project

Thanks. I wasn't aware of this project. Will join~

gklimowicz commented 4 years ago

There are two "flang" projects (it's a long story). This one(flang-compiler/flang) is the original release of Flang based on the NVIDIA/PGI compiler. It has support for Fortran 2003 and much of Fortran 2008 and OpenMP. It compiles and runs real programs, and is the basis of Arm's current Fortran compiler.

The f18-turned-flang-within-llvm is a new Fortran 2018 front-end. It is not yet a complete implementation, in that it does not completely lower to LLVM IR or have a working runtime.

As we work on f18-turned-flang, we continue to take bug reports and plan to update flang-compiler/flang with bug fixes to support Arm and others who are using this version of the compiler.

Contributions (including bug reports) are welcomed by both!

h-vetinari commented 4 years ago

Thanks for the explanation @gklimowicz! :)

I'd hope that most of the (joint?) effort will go into f18-turned-flang-within-llvm, while the original flang stays in maintenance mode, but that's just my PoV as an outsider (who's hoping for the best possible & complete & fully x-platform & opensource fortran compiler, which seems to have a better/faster chance of happening within llvm).

bryanpkc commented 4 years ago

We believe we have a fix for this that hasn't been incorporated into Flang yet.

@gklimowicz Any update on the fix?

gklimowicz commented 4 years ago

I don't have an update on this one at this time. I have to get us a bit caught up on pull request reviews before I consider merging things from the PGI work stream.

igogo-x86 commented 2 years ago

I also faced the same problem with 5d757c0. Simple reproducer could be made as follows:

program test
    integer, pointer, dimension(:) :: foo
    integer ierr
    ! Even after we comment this out ierr could be equal to 2
    ierr = 2 
    allocate(foo(100), STAT=ierr)
    if (ierr /= 0) then 
        write(*,*) 'ERROR'
    else
        write(*,*) 'SUCCESS'
end program test

Here we make a legal allocation and this allocation happens successfully. But if ierr is equal to 2 before calling allocate then checking status using ierr will mistakenly show an error when there is actually no error. In my case ierr wasn't initialized but undefined value unluckily turned to be 2 and that produced a bug very tricky to determine.

Is there a plan to fix the bug or ideas how it is better to do that?