zhenrong-wang / hpc-now

A Cross-Platform, Multi-Cloud High-Performance Computing Platform
https://www.hpc-now.com
MIT License
235 stars, 111 forks

Workflow of steps to run hpc-now #42

Open adjebbar opened 5 months ago

adjebbar commented 5 months ago

Hi there.

I want to install hpc-now on a Linux server as an administrator and offer simple steps for non-professional (remote) users to run an HPC task. How can I organize the steps so that a user can choose the cluster and number of nodes to use, and see the price of each choice? That way the user can follow usage and billing before and after the jobs finish. Could you please guide me on this? Regards.

zhenrong-wang commented 5 months ago

Hello @adjebbar,

  1. HPC-NOW provides the command hpcopr usage, which prints or exports the usage of clusters. You can use it to check cluster configurations, nodes, and durations. Note that this is resource-level usage, not job-level usage, because cloud billing is also resource-based rather than job-based. In other words, billing starts the moment you initialize a cluster, whether or not you run a job.
  2. The job-level usage log is stored in the cluster's database, which is managed by the scheduler, SLURM. You can use the command hpcopr jobman -u USER_NAME --jcmd list to list all jobs, running or finished, in the cluster. The output includes the cores used and the duration, which you can use for accounting.
  3. You can scale a cluster vertically (increasing the CPU count of a node) or horizontally (increasing the number of nodes). To scale horizontally, use hpcopr addc --nn NUMBER to add nodes or hpcopr delc --nn NUMBER to delete nodes. You can also turn nodes on or off with hpcopr turnonc and hpcopr shutdownc. To scale vertically, use hpcopr reconfc to increase or decrease the CPU count of ALL the compute nodes. The goal is NODE_NUM X NODE_CPU = TOTAL_CPU_YOU_WANTED.
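
As a rough illustration of the sizing rule in point 3, here is a minimal Python sketch that an assistant script could use. It computes how many nodes are needed for a target CPU total and builds the matching hpcopr addc or hpcopr delc argument list. The function names nodes_needed and scaling_command are hypothetical, only the --nn flag quoted above is used, and any other arguments the CLI may require (for example, which cluster to act on) are left out, so treat the result as a starting point rather than a complete command.

```python
from typing import List


def nodes_needed(total_cpu: int, cpu_per_node: int) -> int:
    """Smallest NODE_NUM such that NODE_NUM * cpu_per_node >= total_cpu."""
    if total_cpu <= 0 or cpu_per_node <= 0:
        raise ValueError("total_cpu and cpu_per_node must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return (total_cpu + cpu_per_node - 1) // cpu_per_node


def scaling_command(current_nodes: int, target_nodes: int) -> List[str]:
    """Build the hpcopr argument list that moves from current to target nodes.

    Assumption: only the --nn flag from the thread is used; arguments such as
    the cluster selection are not shown and may be required in practice.
    """
    delta = target_nodes - current_nodes
    if delta > 0:
        return ["hpcopr", "addc", "--nn", str(delta)]
    if delta < 0:
        return ["hpcopr", "delc", "--nn", str(-delta)]
    return []  # Already at the target size; nothing to run.


if __name__ == "__main__":
    target = nodes_needed(total_cpu=64, cpu_per_node=16)
    print(target, scaling_command(current_nodes=2, target_nodes=target))
```

The script only builds the argument list and does not run it, so the operator can review the command before passing it to something like subprocess.run.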

Hope this response helps.

Zhenrong WANG

adjebbar commented 5 months ago

Hey Wang.

Thanks for your answer. I understand your steps. To make the workflow easy for non-professionals to follow, I want to build something like an automatic assistant or bot that guides the user from job submission through to the end, so that the user knows the costs. Thanks again, Wang.

zhenrong-wang commented 5 months ago

Hi @adjebbar

My pleasure. Please also note that the most precise billing and cost information is in the cloud billing console. Both hpcopr usage -b and hpcopr jobman -u USER_NAME --jcmd list reflect only the usage, not the amount billed.
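
Since the commands report usage rather than dollars, an assistant could at most show users an estimate. The sketch below is one possible approach under stated assumptions, not part of HPC-NOW: it multiplies node count, running hours, and an hourly price you supply yourself, then splits that resource-level figure across jobs in proportion to their core-hours. The function names, the price, and the proportional split are all hypothetical, and the cloud billing console remains the authoritative source.

```python
from typing import Dict, Tuple


def estimate_cluster_cost(nodes: int, hours: float, price_per_node_hour: float) -> float:
    """Resource-level estimate: billing runs while nodes exist, busy or idle.

    price_per_node_hour is an assumption supplied by the operator, not a value
    reported by hpcopr.
    """
    return nodes * hours * price_per_node_hour


def split_cost_by_core_hours(
    total_cost: float, jobs: Dict[str, Tuple[int, float]]
) -> Dict[str, float]:
    """Attribute total_cost to jobs in proportion to cores * hours.

    jobs maps a job id to (cores_used, duration_hours). The proportional split
    is an illustrative policy; idle cluster time ends up shared among the jobs.
    """
    core_hours = {job: cores * hours for job, (cores, hours) in jobs.items()}
    total_core_hours = sum(core_hours.values())
    if total_core_hours == 0:
        return {}  # No job usage to attribute the cost to.
    return {job: total_cost * ch / total_core_hours for job, ch in core_hours.items()}


if __name__ == "__main__":
    cost = estimate_cluster_cost(nodes=4, hours=10.0, price_per_node_hour=0.5)
    print(cost, split_cost_by_core_hours(cost, {"job1": (16, 2.0), "job2": (16, 6.0)}))
```

The cores and durations would come from the job list, and the node count and hours from the usage output. Parsing those outputs is left out here because their exact format is not shown in this thread.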

Best,

Zhenrong WANG