uiuc-hpc / lci

Implementation of a cool communication layer

lc_finalize in v2 #9

Open omor1 opened 4 years ago

omor1 commented 4 years ago

lc_finalize in the v2 branch is currently a no-op. I'm running into an issue where one process exits before the others, and the remaining processes then get a SIGPIPE when they try to write to stdout or stderr (I think PMI forwards that output over a pipe to rank 0, or something similar).

Also, we should probably just generally clean up when finalizing?
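
To make the suspected failure mode concrete, here is a minimal standalone sketch of what happens when a process writes to a pipe whose reader has already gone away. It is not LCI or PMI code, and it does not confirm how PMI forwards output; `write_to_closed_pipe` is a hypothetical helper name. By default the writer would be killed by SIGPIPE, so the sketch ignores the signal and lets `write()` report `EPIPE` instead.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical helper: writes to a pipe whose read end is already closed,
 * mimicking a process whose output forwarder has exited.
 * Returns the errno set by write(), 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created. */
int write_to_closed_pipe(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Ignore SIGPIPE so write() fails with EPIPE instead of the
     * default action, which terminates the process. */
    void (*old_handler)(int) = signal(SIGPIPE, SIG_IGN);

    close(fds[0]); /* the "reader" exits early */

    const char msg[] = "late output\n";
    ssize_t written = write(fds[1], msg, sizeof msg - 1);
    int err = (written < 0) ? errno : 0;

    close(fds[1]);
    signal(SIGPIPE, old_handler); /* restore the previous disposition */
    return err;
}
```

Calling `write_to_closed_pipe()` from a small `main` and comparing the result against `EPIPE` shows the error a late writer sees once the other end of the pipe is gone.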

danghvu commented 4 years ago

Right, it would be good to do that. I treated it as low priority since the job is going away anyway, so the process manager will (hopefully) clean everything up.

You can add a pmi_barrier to wait for all processes to arrive before exiting.
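
The idea of a barrier before exit can be sketched as a standalone simulation. This is not LCI's or PMI's implementation: the parent process stands in for the process manager, forked children stand in for ranks, and `NUM_RANKS` and `run_job_with_exit_barrier` are hypothetical names. In a real job, the barrier would come from the process manager interface rather than from hand-rolled pipes.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define NUM_RANKS 4 /* hypothetical job size for the simulation */

/* Forks NUM_RANKS child "ranks". Each rank announces its arrival and then
 * blocks until the parent (acting as the process manager) has seen every
 * rank arrive. Returns the number of ranks that exited cleanly after the
 * barrier, or -1 on a setup error. */
int run_job_with_exit_barrier(void)
{
    int arrive[2], release[2];
    if (pipe(arrive) != 0 || pipe(release) != 0)
        return -1;

    for (int r = 0; r < NUM_RANKS; r++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Rank: drop the pipe ends it does not use. */
            close(arrive[0]);
            close(release[1]);

            char token = 'a';
            if (write(arrive[1], &token, 1) != 1)
                _exit(1);

            /* Barrier: read() returns 0 (EOF) only after the manager
             * closes its write end, i.e. after all ranks have arrived. */
            char buf;
            ssize_t n = read(release[0], &buf, 1);
            _exit(n == 0 ? 0 : 1);
        }
    }

    /* Manager: wait for one arrival token per rank. */
    close(arrive[1]);
    close(release[0]);
    for (int seen = 0; seen < NUM_RANKS;) {
        char buf[NUM_RANKS];
        ssize_t n = read(arrive[0], buf, sizeof buf);
        if (n <= 0)
            return -1;
        seen += (int)n;
    }

    close(release[1]); /* everyone has arrived: release all ranks */

    int clean = 0;
    for (int r = 0; r < NUM_RANKS; r++) {
        int status;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            clean++;
    }
    close(arrive[0]);
    return clean;
}
```

Because no rank can pass the `read()` until every rank has written its token, no rank exits while another is still working, which is the property the barrier is meant to provide at finalize time.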



omor1 commented 4 years ago

LCI (currently) uses the PMI implementation in simple_pmi.c, not the PMIx library, right?

omor1 commented 4 years ago

I'll note that the workaround does work, for obvious reasons. That was the last issue I've seen with PaRSEC/LCI, so it's now fully working (on psm2) and I can move on to performance testing.