s417-lama / mpitx

Run MPI programs over tmux
MIT License

MPMD launch mode #1

Open guoyejun opened 1 year ago

guoyejun commented 1 year ago

Hi, this is a really good project that has helped me a lot. Thank you.

MPI has another launch mode, MPMD (multiple-program-multiple-data); see https://www.ibm.com/docs/ja/smpi/10.2?topic=command-starting-mpmd-multiple-program-multiple-data-application. The basic syntax of the mpirun command is as follows: mpirun -np num1 prog1 : -np num2 prog2

Will mpitx support this launch mode? Thanks.

s417-lama commented 1 year ago

Glad to hear that mpitx helped you. Sorry for the late reply.

It's a good idea to support the MPMD launch mode, but it will complicate the option parsing process. Unfortunately, I currently have no plan to support it, but pull requests are welcome.
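To illustrate where the extra parsing work comes from, here is a minimal sketch of the first step an MPMD-aware parser would need: splitting the argument list on the ":" separator so that each program keeps its own options. The function name split_mpmd_segments is hypothetical and is not part of mpitx; how mpitx's own parse_args would consume the segments is not shown in this thread.

```python
def split_mpmd_segments(args):
    """Split an mpirun-style argument list on ':' into per-program segments.

    Hypothetical helper for illustration only; not taken from mpitx.
    """
    segments = [[]]
    for arg in args:
        if arg == ":":
            # A bare colon starts the argument list of the next program.
            segments.append([])
        else:
            segments[-1].append(arg)
    # Drop empty segments produced by leading, trailing, or doubled colons.
    return [seg for seg in segments if seg]


# Mirrors "mpirun -np num1 prog1 : -np num2 prog2" with concrete values.
segments = split_mpmd_segments(["-np", "2", "prog1", ":", "-np", "4", "prog2"])
for seg in segments:
    print(seg)
```

Each segment would then have to go through option parsing separately, which is the added complexity mentioned above.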

guoyejun commented 1 year ago

Thanks, let me try it.

One quick question: how can I debug it (or print something) after "tmux -L socket-name new-session ..." is executed? Nothing is printed in the console. Thanks.

s417-lama commented 1 year ago

What happens if you type tmux -L socket-name new-session bash?

guoyejun commented 1 year ago

'tmux -L socket-name new-session bash' works as expected: a new tmux session is created with bash. When I type 'exit', it returns to the original shell, and I can still see the file at /tmp/tmux/socket-name.

Actually, I tried the code below to check the values of this_cmd, options, and commands. I guess something is wrong with my code, because I only see the value of this_cmd and "xxxxx" in the log file. So I want to check the console output for the error message, but I don't know how.

        if not is_inside_tmux():
            socket_name = "mpitx." + str(uuid.uuid4())
            tmux_new_session(socket_name, sys.argv)
            return

        (this_cmd, options, commands) = parse_args(sys.argv)

        print(this_cmd)
        print("\nxxxxx\n")
        print(options)
        print("\nyyyy\n")
        print(commands)
        print("\nzzzzz\n")

        with open("/tmp/mpitx.log","w") as file:
            file.write(this_cmd)
            file.write("\nxxxxx\n")
            file.write(options)
            file.write("\nyyyy\n")
            file.write(commands)
            file.write("\nzzzzz\n")

It is easy to get these three variables from code review, but I'd like to know how to debug mpitx (or print) just in case I need it later.

s417-lama commented 1 year ago

What's happening here is that, if the command is not executed within tmux, mpitx launches tmux and executes the same command within tmux, recursively. Thus, the print functions (e.g., print("\nxxxxx\n")) are executed in a new process within tmux. If the child process exits immediately, the launched tmux window is also immediately destroyed, discarding the output too.
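Since anything printed in a short-lived tmux window is lost, one option is to append debug values to a file instead. The sketch below also addresses a likely cause of the truncated log above: file.write() accepts only str, so passing a list or dict raises TypeError. That would be consistent with the log stopping after "xxxxx", assuming options is not a string (its real type is not shown in this thread). The helper name dump_debug, the log path, and the sample values are all hypothetical.

```python
import os
import tempfile


def dump_debug(path, **values):
    """Append name = repr(value) lines to a log file.

    repr() turns any object into a string, so lists and dicts are safe to write.
    Hypothetical helper for illustration only; not taken from mpitx.
    """
    with open(path, "a") as f:
        for name, value in values.items():
            f.write(f"{name} = {value!r}\n")


# Sample values standing in for the result of parse_args(sys.argv).
log_path = os.path.join(tempfile.gettempdir(), "mpitx-debug-example.log")
dump_debug(log_path, this_cmd="mpitx", options={"np": 2}, commands=["./a.out"])
```

Appending rather than truncating keeps the entries written by both the outer process and the one relaunched inside tmux.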

My recommendation is to run mpitx commands within tmux, so that the output will be put in the current shell.
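The relaunch behavior described above can be sketched roughly as follows. tmux sets the TMUX environment variable for processes running inside a session, so checking it is one common way to detect that case; whether mpitx's is_inside_tmux() works this way is an assumption. Both function names below are hypothetical, and the sketch only builds the command list without starting tmux.

```python
import os


def is_inside_tmux_sketch(environ=None):
    """Return True if the TMUX variable is set and non-empty.

    Assumption: this may differ from how mpitx's is_inside_tmux() is written.
    """
    environ = os.environ if environ is None else environ
    return bool(environ.get("TMUX"))


def build_relaunch_command(socket_name, argv):
    """Build a tmux command line that reruns the same argv in a new session."""
    return ["tmux", "-L", socket_name, "new-session"] + list(argv)


# Outside tmux: the same command would be relaunched inside a new session.
if not is_inside_tmux_sketch({}):
    print(build_relaunch_command("mpitx.example", ["mpitx", "./a.out"]))
```

When the command is started from a shell that is already inside tmux, the check succeeds, no new session is created, and the print output stays in the current shell.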

guoyejun commented 1 year ago

Please see the PR at https://github.com/s417-lama/mpitx/pull/2. Thanks.