uiuc-hpc / Recorder

Multi-level I/O tracing library

Fortran MPI seems not implemented #24

Open sheltongeosx opened 1 year ago

sheltongeosx commented 1 year ago

When running MPI tests with the Recorder library preloaded, it looks like the error flag is not set after calling any MPI interface.

wangvsa commented 1 year ago

Recorder does have Fortran wrappers, and the error flag is set when the C interface has a return parameter.

sheltongeosx commented 1 year ago

Thank you for your quick response! I did not see the error flags being set in the Fortran interfaces. Here is a small test program in Fortran:

    PROGRAM Hello_from
      IMPLICIT NONE
      INCLUDE "mpif.h"
      INTEGER :: np, myid, err
      CALL MPI_INIT(err)
      CALL MPI_COMM_SIZE(MPI_COMM_WORLD, np, err)
      ! Sentinel value: a correct MPI_COMM_RANK call must overwrite it.
      err = 2001
      WRITE(*,*) "==> Starting err=", err
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, myid, err)
      WRITE(*,*) "Hello world! from", myid, " of", np, " ending err=", err
      CALL MPI_FINALIZE(err)
    END PROGRAM Hello_from

Here is the output from running it with 2 MPI processes, without Recorder preloaded:

    ==> Starting err= 2001
    Hello world! from 0 of 2 ending err= 0
    ==> Starting err= 2001
    Hello world! from 1 of 2 ending err= 0

Here is the output with Recorder preloaded:

    ==> Starting err= 2001
    Hello world! from 0 of 2 ending err= 2001
    ==> Starting err= 2001
    Hello world! from 1 of 2 ending err= 2001

wangvsa commented 1 year ago

@sheltongeosx You are right! I just checked the code: the current implementation simply returns the error code from the C interface but doesn't set it in the Fortran wrappers. I will fix this. Thanks for catching it.
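For readers following along, the bug pattern described here can be sketched roughly as follows. This is a minimal illustration, not Recorder's actual code: `fake_c_comm_rank`, `buggy_comm_rank_`, and `fixed_comm_rank_` are hypothetical names, and the MPI call is replaced by a stand-in so the snippet compiles without MPI. Fortran passes every argument by reference, including the trailing error flag, so a wrapper that only returns the C error code leaves the caller's variable untouched.

```c
/* Hypothetical stand-in for a C MPI call such as MPI_Comm_rank:
 * it reports rank 0 and returns 0 (the value of MPI_SUCCESS). */
static int fake_c_comm_rank(int comm, int *rank) {
    (void)comm;
    *rank = 0;
    return 0;
}

/* Buggy pattern: the C return code is handed back as a function
 * result, which a Fortran CALL ignores. The ierr argument is never
 * written, so the caller keeps its old value (2001 in the test). */
int buggy_comm_rank_(int *comm, int *rank, int *ierr) {
    (void)ierr;
    return fake_c_comm_rank(*comm, rank);
}

/* Fixed pattern: copy the C return code into the Fortran error flag. */
void fixed_comm_rank_(int *comm, int *rank, int *ierr) {
    int ret = fake_c_comm_rank(*comm, rank);
    *ierr = ret;
}
```

Calling `buggy_comm_rank_` with the error variable preset to 2001 leaves it at 2001, which matches the output reported with Recorder preloaded; `fixed_comm_rank_` overwrites it with 0.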

wangvsa commented 1 year ago

@sheltongeosx This has been fixed in the latest updates.

sheltongeosx commented 1 year ago

@wangvsa Thank you very much for your work!

Yes, testing with the code above shows that the issue is gone. However, there is still a problem with the Fortran mpi_bcast() call in my application: the error flag is still not set after the call (although the values are broadcast correctly). I found that it returns at line 267 of lib/recorder-mpi.c, while executing the RECORDER_INTERCEPTOR_PROLOGUE macro, without setting the error flag. Looking a bit further, it returns from the macro at line 163 of include/recorder.h, where it detects that the logger was not initialized.
By the way, testing Fortran mpi_bcast() standalone with Recorder shows no issue. Unfortunately, the bcast in my app still runs into it.

wangvsa commented 1 year ago

Recorder is initialized at MPI initialization time, so by the time bcast is called, Recorder should already be initialized. Nevertheless, even if it is not initialized, I should still set the error flag. I will fix this soon. By the way, which application are you running?
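The early-return case discussed here can be sketched as below. This is an illustration under stated assumptions, not the contents of RECORDER_INTERCEPTOR_PROLOGUE: `SKETCH_PROLOGUE`, `logger_initialized`, `fake_c_bcast`, and `sketch_bcast_` are all hypothetical names, and the MPI call is mocked. The point is that the pass-through branch, taken when the logger is not ready, must write the Fortran error flag before returning, just like the traced path does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state flag; assumed to be set once tracing is ready. */
static bool logger_initialized = false;

/* Hypothetical stand-in for the underlying C broadcast call. */
static int fake_c_bcast(int *buf, int count) {
    (void)buf;
    (void)count;
    return 0; /* the value of MPI_SUCCESS */
}

/* Hypothetical prologue: if the logger is not initialized, call the
 * real function without tracing and return early. The error flag is
 * still written on this path (ierr_ptr may be NULL for C callers). */
#define SKETCH_PROLOGUE(call, ierr_ptr)        \
    if (!logger_initialized) {                 \
        int ret_ = (call);                     \
        if ((ierr_ptr) != NULL) {              \
            *(ierr_ptr) = ret_;                \
        }                                      \
        return ret_;                           \
    }

/* Fortran-style wrapper: every argument arrives by reference. */
int sketch_bcast_(int *buf, int *count, int *ierr) {
    SKETCH_PROLOGUE(fake_c_bcast(buf, *count), ierr)

    /* Traced path (the tracing itself is omitted in this sketch). */
    int ret = fake_c_bcast(buf, *count);
    *ierr = ret;
    return ret;
}
```

With this shape, the caller's error variable is overwritten whether or not the logger was initialized when the wrapper ran.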

sheltongeosx commented 1 year ago

It is Quantum ESPRESSO: https://gitlab.com/QEF/q-e

wangvsa commented 1 year ago

The error flag issue has been solved. But I found an even more problematic issue https://github.com/uiuc-hpc/Recorder/issues/25 when running Quantum ESPRESSO. I may need to check with the MPICH team on this.