pramodk opened this issue 1 year ago
Hi @pramodk, can you try running one of your deadlocking examples with this environment variable set?
export DARSHAN_LOGHINTS=""
It's been a little while since we've encountered this, but it's possible that the ROMIO driver for the file system has a bug that's only triggered when using the hints that Darshan sets when writing the log file.
For a little more background, Darshan sets "romio_no_indep_rw=true;cb_nodes=4" by default. Taken together, these hints indicate that regardless of how many ranks the application has, only 4 of them will actually open the Darshan log and act as aggregators. This is helpful at scale to keep the cost of opening the log file from getting too high.
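Roughly speaking (this is a simplified sketch of how such ROMIO hints are applied in general, not Darshan's actual log-writing code, and the file name is made up), hints like these are attached to the collective open via an MPI_Info object:

```c
/* Sketch: attaching ROMIO hints like the ones Darshan sets by default
 * to a collective file open via MPI_Info.
 * Illustrative only; the file name and hint values are hypothetical. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;

    MPI_Init(&argc, &argv);

    MPI_Info_create(&info);
    /* route all writes through the collective-buffering aggregators */
    MPI_Info_set(info, "romio_no_indep_rw", "true");
    /* only 4 ranks act as aggregators and actually touch the file */
    MPI_Info_set(info, "cb_nodes", "4");

    /* collective create/open of the (hypothetical) log file */
    MPI_File_open(MPI_COMM_WORLD, "example.darshan_partial",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, info, &fh);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```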
Out of curiosity, what does DDT say about the location of the first hang you mention (Intel MPI forcing the gpfs ADIO)? Maybe it's failing the collective create of the log file, since we don't see any evidence of an output log being created?
The 2nd case (MPT) you mention is clearly hanging the very first time Darshan tries to do collective writes to the log file -- log file creation clearly succeeds, as you get the .darshan_partial log. Phil's suggestion has sometimes helped with this sort of thing, so it is worth trying.
(just a quick partial response, will answer other questions tomorrow)
@carns:
can you try running one of your deadlocking examples with this environment variable set? export DARSHAN_LOGHINTS=""
Yes! I can confirm that changing romio_no_indep_rw via DARSHAN_LOGHINTS makes the program run successfully, i.e. the run below fails:
export ROMIO_PRINT_HINTS=1
DARSHAN_LOGHINTS="romio_no_indep_rw=true" srun ./hello
...
+ DARSHAN_LOGHINTS=romio_no_indep_rw=true
+ srun ./hello
key = romio_no_indep_rw value = true
key = cb_buffer_size value = 16777216
key = romio_cb_read value = enable
key = romio_cb_write value = enable
key = cb_nodes value = 2
key = romio_cb_pfr value = disable
key = romio_cb_fr_types value = aar
key = romio_cb_fr_alignment value = 1
key = romio_cb_ds_threshold value = 0
key = romio_cb_alltoall value = automatic
key = ind_rd_buffer_size value = 4194304
key = ind_wr_buffer_size value = 524288
key = romio_ds_read value = automatic
key = romio_ds_write value = automatic
key = cb_config_list value = *:1
key = romio_filesystem_type value = GPFS: IBM GPFS
key = romio_aggregator_list value = 0 2
...
...other errors / deadlock...
...
but the run below succeeds:
export ROMIO_PRINT_HINTS=1
DARSHAN_LOGHINTS="romio_no_indep_rw=false" srun ./hello
srun ./hello
key = romio_no_indep_rw value = false
key = cb_buffer_size value = 16777216
key = romio_cb_read value = automatic
key = romio_cb_write value = automatic
key = cb_nodes value = 2
key = romio_cb_pfr value = disable
key = romio_cb_fr_types value = aar
key = romio_cb_fr_alignment value = 1
key = romio_cb_ds_threshold value = 0
key = romio_cb_alltoall value = automatic
key = ind_rd_buffer_size value = 4194304
key = ind_wr_buffer_size value = 524288
key = romio_ds_read value = automatic
key = romio_ds_write value = automatic
key = cb_config_list value = *:1
key = romio_filesystem_type value = GPFS: IBM GPFS
key = romio_aggregator_list value = 0 2
Wow, thanks for confirming. If you can share the exact MPI library / version you are using when you fill in more details later, that would be great. This is possibly a vendor bug that should be reported. At worst the hint should just be unsupported, not faulty.
If you would like, you can also configure Darshan with the --with-log-hints="..." configure option so that a different default is compiled in (so that the resulting library is safe to use without having to set the environment variable explicitly every time).
Probably related to https://github.com/pmodels/mpich/issues/6408
Dear Darshan Team,
I am seeing confusing behavior and would like to check if I am missing something obvious here. I have seen #559 but am not sure if it's the same issue (TBH, I might be wrong, as I didn't get time to look into the details):
Here is a quick summary:
produces:

With ROMIO_PRINT_HINTS=1, we know that Intel-MPI uses NFS as the default ADIO driver:

So, if I force the GPFS driver, then the program gets stuck:
As another example, let's look at the HPE-MPI (MPT) library: this also gets stuck! I see that .darshan_partial is generated though:

I got confused because I have MPI I/O applications that are working fine. For example, in the above test, let's enable the part of the code that just opens a file using MPI I/O:
and then srun ./hello finishes! 🤔 (at least for the few times I tried)

Launching DDT on the exe built without -DENABLE_MPI=1, the stack trace for 2 ranks looks like the one below, which appears a bit confusing (?). (By the way, I quickly verified that MPI_File_write_at_all works with 0 as count; a rough sketch of that check is at the end of this post.)

I didn't spend too much time digging into the ROMIO or Darshan code. I thought I should first ask here whether this looks like something obvious to the developer team or whether you have seen it before.
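For reference, the zero-count MPI_File_write_at_all check mentioned above was along these lines (a minimal sketch; the file name and setup are illustrative, not the exact code from my test):

```c
/* Sketch of the zero-count collective write check (hypothetical
 * reconstruction): every rank participates in the collective call,
 * but passes count = 0 so no data is actually written. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    char buf[1];

    MPI_Init(&argc, &argv);

    MPI_File_open(MPI_COMM_WORLD, "testfile.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);

    /* collective write with count = 0: the call should still complete
     * on all ranks even though no bytes are transferred */
    MPI_File_write_at_all(fh, 0, buf, 0, MPI_BYTE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Finalize();
    return 0;
}
```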
Thank you in advance!