flux-framework / flux-pmix

flux shell plugin to bootstrap openmpi v5+
GNU Lesser General Public License v3.0

flux-shell: unable to format log msg (pmix server %s %.*s): Resource temporarily unavailable #76

Open garlick opened 1 year ago

garlick commented 1 year ago

@vsoch reported this offline:

30.412s: flux-shell[1]: stderr: flux-shell: unable to format log msg (pmix server %s %.*s): Resource temporarily unavailable
30.418s: flux-shell[3]: stderr: flux-shell: unable to format log msg (pmix server %s %.*s): Resource temporarily unavailable
30.429s: flux-shell[0]: stderr: flux-shell: unable to format log msg (pmix server %s %.*s): Resource temporarily unavailable
30.430s: flux-shell[2]: stderr: flux-shell: unable to format log msg (pmix server %s %.*s): Resource temporarily unavailable

and mentioned:

Update: this seems to happen sometimes but doesn't seem to (obviously) impact the running job.

garlick commented 1 year ago

I confirmed that the error is harmless.

This is pmix protocol tracing, and it looks like it may be trying to log a string longer than 4096 bytes. It's somewhat surprising that the shell fails outright instead of silently truncating the string and printing what it can with some indication of truncation. I'll open a flux-core bug on that one.
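To illustrate the truncating behavior described above, here is a minimal sketch in C. It is not the flux-core implementation: the function name `format_log_msg`, its return convention, and the `...` marker are all assumptions made for the example. It formats into a fixed-size buffer and, when the message does not fit, keeps what fits and overwrites the tail with a marker instead of failing.

```c
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not a flux-core function).
 * Format a message into buf, which holds size bytes.
 * Returns 0 if the whole message fit, 1 if it was truncated,
 * and -1 on a formatting error or invalid buffer.
 */
int format_log_msg (char *buf, size_t size, const char *fmt, ...)
{
    static const char marker[] = "...";   /* assumed truncation indicator */
    va_list ap;
    int n;

    if (!buf || size == 0)
        return -1;

    va_start (ap, fmt);
    n = vsnprintf (buf, size, fmt, ap);   /* always NUL-terminates when size > 0 */
    va_end (ap);

    if (n < 0) {
        buf[0] = '\0';
        return -1;
    }
    if ((size_t)n < size)
        return 0;                         /* message fit completely */

    /* Message was cut off at size - 1 characters. If there is room,
     * replace the end of the buffer with the marker (including its NUL).
     */
    if (size > sizeof (marker))
        memcpy (buf + size - sizeof (marker), marker, sizeof (marker));
    return 1;
}
```

A caller could log the buffer in either case and use the return value only to decide whether to note that output was shortened.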

I could imagine some tracing getting pretty verbose on a larger job.

garlick commented 1 year ago

Closing this here since it's actually a flux-core issue.

@vsoch - in case tracing is enabled by default in an operator or other script, it may be a good idea to disable it. It may move a lot of data if used at scale. (It would have been enabled with flux mini CMD -o verbose=2).

vsoch commented 1 year ago

The only similar thing we have is that in test mode we set verbosity to 0:

-Slog-stderr-level=0

Note that the above was run in test mode, so it's probably not related! Should this issue have been transferred to flux-core rather than closed?

garlick commented 1 year ago

The only similar thing we have is that in test mode we set verbosity to 0

Oh sheesh, I just realized that tracing is forced on in flux-pmix right now, a holdover from early development :facepalm:

I'll go ahead and reopen this issue since there actually is a flux-pmix issue to fix: the fact that tracing is forced on, and the spurious and confusing error you reported. Otherwise, yeah, I should have transferred it rather than closing it and opening a new one :facepalm: :facepalm:
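As a rough sketch of what "not forced on" could look like, the C fragment below gates trace output behind a verbosity threshold. This is not the flux-pmix code: `TRACE_VERBOSITY`, `trace_enabled`, and `trace_msg` are hypothetical names, and the threshold of 2 is only an assumption borrowed from the `verbose=2` option mentioned earlier in this thread.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

/* Assumed threshold: trace only at verbosity 2 or higher. */
#define TRACE_VERBOSITY 2

/* Hypothetical check: is protocol tracing wanted at this verbosity? */
bool trace_enabled (int verbose)
{
    return verbose >= TRACE_VERBOSITY;
}

/* Hypothetical trace function: writes the formatted message to out only
 * when tracing is enabled. Returns the number of characters written,
 * 0 when tracing is disabled, or a negative value on output error.
 */
int trace_msg (FILE *out, int verbose, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (!trace_enabled (verbose))
        return 0;                 /* tracing off: produce no output at all */

    va_start (ap, fmt);
    n = vfprintf (out, fmt, ap);
    va_end (ap);
    return n;
}
```

With a gate like this, a job run at the default verbosity would produce no trace messages, so the oversized-message error would not appear unless tracing was explicitly requested.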

vsoch commented 1 year ago

Oh no worries! Transfer isn't widely known so I thought I'd mention it - I use it quite a bit on the singularityhub org because people open issues all over the place!