Open qkoziol opened 1 year ago
From @rhc54:
" 'prterun --help' includes the following:
--stop-in-app Direct the specified processes to stop at an application-controlled location
We need to let you specify an argument that includes the string ID of the place to stop, so I need to do a little work there. Basically, the code OMPI needs to implement to let you stop in a designated place is the same as what is in ompi/runtime/ompi_rte.c starting at line 1115. You read the OMPI_BREAKPOINT envar and compare its value against the ID of the point where you want to stop. If they match, you generate an event with that breakpoint string so PRRTE knows you are ready, and then wait for the debugger to attach.
I’ll work on the PRRTE side of things as time permits."
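For anyone following along, the check described above can be sketched roughly as below. This is a minimal illustration, not the code in ompi/runtime/ompi_rte.c: only the OMPI_BREAKPOINT envar name comes from the comment above. The function names, the "mpi_init" ID used in the usage note, and the exact-match comparison are all assumptions, and the two stubs stand in for the real event notification and debugger wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: true when a breakpoint was requested and it names
 * this stopping point. Assumes the envar holds exactly one ID. */
bool breakpoint_matches(const char *requested, const char *point_id)
{
    assert(point_id != NULL);
    return requested != NULL && strcmp(requested, point_id) == 0;
}

/* Hypothetical stub: real code would generate an event carrying the
 * breakpoint string so PRRTE knows the process is ready. */
void notify_ready(const char *point_id)
{
    fprintf(stderr, "ready at breakpoint '%s'\n", point_id);
}

/* Hypothetical stub: real code would block here until the debugger
 * attaches and releases the process. */
void wait_for_debugger(void)
{
}

/* Call at the place where the application may stop.
 * Returns true if this stopping point was the one requested. */
bool maybe_stop_at(const char *point_id)
{
    if (!breakpoint_matches(getenv("OMPI_BREAKPOINT"), point_id)) {
        return false;
    }
    notify_ready(point_id);
    wait_for_debugger();
    return true;
}
```

A call such as maybe_stop_at("mpi_init") would then sit at the designated location; it does nothing unless OMPI_BREAKPOINT is set to the same string.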
Once the PRRTE coding is finished, this capability needs to be tested and documented.
@samuelkgutierrez - Is this something you could work on also?
Hello, @qkoziol. I don't know how much help I would be. I'm unfamiliar with this particular feature, unfortunately.
As it requires code in OMPI to use it, I'm not sure how much value there is in documenting it. So far as I know, there is no corresponding OMPI code at this time.
I believe that the code is in OMPI already. If it turns out not to be true, it would definitely have to be done before I would document it. :-)
@jsquyres - Is the code in place?
The code is in place already; check ompi_rte_breakpoint in ompi/runtime/ompi_rte.c.
Sorry for not being clear - what I was trying to say was that the "soft breakpoint" only works if someone adds code specific to that breakpoint to OMPI. In other words, if I want to define a new "foo" breakpoint, then I have to add code to OMPI that implements it. Then that code can be activated via PMIx.
So I don't know if there is much value in including it in user-facing documentation. Certainly makes sense for developer docs, though. Not sure which you are writing, but I thought it was user-facing?
OMPI currently has code for the breakpoint in MPI_Init. That's a good example of how to do it.
The "Debugging Open MPI Parallel Applications" documentation section needs to be updated for the capability described in this commit: https://github.com/open-mpi/ompi/commit/f97d081cf9b540c5a79e00aecee17b25e8c123ad