TACC / launcher

A simple utility for executing multiple sequential or multi-threaded applications in a single multi-node batch job
MIT License

Plugin for LSF? #57

Closed AJVincelli closed 4 years ago

AJVincelli commented 4 years ago

Hello, does anybody have a plugin for LSF (not SLURM, PBS, or SGE)? Thanks!

AJVincelli commented 4 years ago

It looks like I can set the LAUNCHER_PPN and LAUNCHER_NHOSTS variables in the job file instead of in a plugin, but I'm really having difficulty populating the hostfile. Neither the list of active nodes nor the one-line $LSB_HOSTS value seems to be an acceptable input; I get the error "Host key verification failed." I saw something in a Google search about "handle_host," but I'm not sure about it. I really appreciate anybody's help!

AJVincelli commented 4 years ago

@siliu-tacc If you have any ideas, I would be forever grateful! :-)

AJVincelli commented 4 years ago

Tagging @lwilson just to follow

lwilson commented 4 years ago

@AJVincelli It's been a long time since I used LSF, but if I recall correctly, LSB_MCPU_HOSTS lists each hostname followed by its number of processes:

echo $LSB_MCPU_HOSTS
hostA 4 hostB 4

This might be easier to parse than LSB_HOSTS since each host is listed only once:

export LAUNCHER_NHOSTS=$(( `echo $LSB_MCPU_HOSTS | wc -w` / 2))

Additionally, LSB_MCPU_HOSTS would make populating LAUNCHER_PPN easier if you are assuming that each host has the same number of processes:

export LAUNCHER_PPN=`echo $LSB_MCPU_HOSTS | awk '{print $2}'`
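
The two exports above can be combined with a hostfile step. This is only a minimal sketch, assuming LSB_MCPU_HOSTS really has the "host count host count" layout shown above; the HOSTFILE path is a hypothetical name chosen for illustration, and how Launcher should be pointed at that file is not covered in this thread.

```shell
#!/bin/sh
# Sample value copied from the example above; inside a real LSF job the
# scheduler sets LSB_MCPU_HOSTS, and this default is never used.
LSB_MCPU_HOSTS="${LSB_MCPU_HOSTS:-hostA 4 hostB 4}"

# Hypothetical output path, used here for illustration only.
HOSTFILE="${HOSTFILE:-./launcher_hostfile}"

# Print every odd-numbered field (the hostnames), one per line.
echo "$LSB_MCPU_HOSTS" | awk '{ for (i = 1; i <= NF; i += 2) print $i }' > "$HOSTFILE"

# Host count is the number of lines written; PPN is the process count of the
# first host, which assumes every host has the same number of processes.
LAUNCHER_NHOSTS=$(wc -l < "$HOSTFILE" | tr -d ' ')
LAUNCHER_PPN=$(echo "$LSB_MCPU_HOSTS" | awk '{ print $2 }')
export LAUNCHER_NHOSTS LAUNCHER_PPN
```

If the hosts do not all have the same process count, the single LAUNCHER_PPN value taken from the first host would not describe the allocation correctly.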

siliu-tacc commented 4 years ago

Hi @AJVincelli,

I do not have an LSF machine at hand and therefore can't set it up for you directly, but I think there are at least two easy ways to make it work:

1) You can just run "hostname" on all involved nodes and save the output to a file. That will be your new hostfile.

2) I am not 100% sure about the format LSF uses here. I think you can use "$LSB_HOSTS" for convenience (or LSB_MCPU_HOSTS if necessary). Then you can parse the one-line node list from LSF using sed, awk, or other Linux commands. To turn the single-line list into a file with one host per line, you can do something like:

$ cat hostfile_old
host1 host2 host3 host4 host5 host6

$ sed 's/ /\n/g' < hostfile_old > hostfile_new

$ cat hostfile_new
host1
host2
host3
host4
host5
host6
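
A variant of the same conversion that reads the variable directly is sketched below. It rests on two assumptions: that LSB_HOSTS repeats a hostname once per slot (so duplicates need to be dropped), and that a portable newline substitution is wanted, since "\n" in a sed replacement may not be treated as a newline by every sed implementation. The sample host names and the HOSTFILE path are placeholders, not values from a real job.

```shell
#!/bin/sh
# Placeholder value for illustration; inside a real LSF job the scheduler
# sets LSB_HOSTS, and this default is never used.
LSB_HOSTS="${LSB_HOSTS:-host1 host1 host2 host2 host3 host3}"

# Hypothetical output path, used here for illustration only.
HOSTFILE="${HOSTFILE:-./hostfile_new}"

# tr turns each run of spaces into a single newline; the awk filter keeps
# only the first occurrence of each hostname and preserves the order.
echo "$LSB_HOSTS" | tr -s ' ' '\n' | awk '!seen[$0]++' > "$HOSTFILE"

cat "$HOSTFILE"
```

Whether the hostfile should list each host once or once per slot depends on what the consumer expects; dropping the awk filter keeps the per-slot repeats.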

If the format is different or you need any further assistance, please feel free to let us know.

Best wishes, Si

AJVincelli commented 4 years ago

Thank you so much @lwilson and @siliu-tacc! I'm sorry that I'm just seeing your messages now; I really appreciate your help. I will try your suggestions and report back asap. Thanks again!!