Lumi-supercomputer / lumi-userguide

User documentation on the usage of LUMI resources

Add --exclusive and --mem=0 to GPU example jobscripts #125

Open Danzelot opened 1 year ago

Danzelot commented 1 year ago

Many users use the example script on small-g and dev-g without realising that they are not exclusive and have different default memory settings.
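As a rough sketch of what the proposed change could look like, the snippet below writes a hypothetical job script, `full_node_small_g.sh`, that requests a whole LUMI-G node on small-g. The file name, account placeholder, time limit, task layout, and the `./my_gpu_app` binary are assumptions for illustration and are not taken from the user guide's actual example.

```shell
# Write a hypothetical full-node job script; all values are illustrative.
cat > full_node_small_g.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=full-node-example
# Placeholder account: replace with your own project.
#SBATCH --account=project_XXXXXXXXX
#SBATCH --partition=small-g
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=8
# Request all 8 GPUs (GCDs) of the node.
#SBATCH --gpus-per-node=8
# Do not share the node with other jobs.
#SBATCH --exclusive
# Request all memory available on the node.
#SBATCH --mem=0
#SBATCH --time=00:10:00

# Hypothetical application binary.
srun ./my_gpu_app
EOF

# Submit on LUMI with: sbatch full_node_small_g.sh
```

Without the `--exclusive` and `--mem=0` lines, the same script on small-g or dev-g may share the node with other jobs and falls back to the partition's default memory settings.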

klust commented 1 year ago

I'd use --mem=480g instead, because that would also protect against memory leaks in the OS, which we have already experienced.

olouant commented 1 year ago

I would comment out the --exclusive line to avoid problems with users who run on 1 GPU on small-g or dev-g and copy-paste the example without properly checking their job script. With the line commented out, users have to edit the script to activate exclusive mode. We don't want tickets complaining about being billed at an 8x rate because of our examples.
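One way to sketch this suggestion: keep the full-node options in the example but disable them with a second `#`, a common convention that relies on `sbatch` only acting on lines that begin with `#SBATCH`. The file name, account placeholder, single-GPU request, warning text, and `./my_gpu_app` are assumptions made for illustration.

```shell
# Write a hypothetical job script in which full-node options are opt-in.
cat > shared_by_default.sh <<'EOF'
#!/bin/bash
# Placeholder account: replace with your own project.
#SBATCH --account=project_XXXXXXXXX
#SBATCH --partition=small-g
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1
#SBATCH --time=00:10:00

# WARNING: the two options below reserve a FULL node, and the job is then
# billed for the whole node even if it uses a single GPU. They are disabled
# by the double '#'. Remove one '#' from each line only for full-node jobs,
# and request all GPUs of the node in that case.
##SBATCH --exclusive
##SBATCH --mem=480g

# Hypothetical application binary.
srun ./my_gpu_app
EOF

# Submit on LUMI with: sbatch shared_by_default.sh
```

A user who copies this script unchanged gets a shared-node job, and has to make a deliberate edit to be billed for an exclusive node.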

klust commented 1 year ago

In fact, Fredrik wants us to go after such users and is working on tools to detect this. On the other hand, he wants jobs that need 4 nodes or fewer to make more use of the small partitions, to reduce the load especially on standard. So we should make very clear what this example is for: only for using full nodes on small-g. So I agree with Orian that we should probably at least add a warning in the job script and/or comment the line out.

olouant commented 1 year ago

OK, then I think a complete rewrite of this page would be better, highlighting how to choose the right partition according to the job requirements, with corresponding job script examples. The same goes for the equivalent page for LUMI-C.
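A requirement-specific example of the kind described here might, for a job that needs only a fraction of a node, look like the sketch below. It deliberately omits `--exclusive` and asks for memory in proportion to the single GPU requested. The file name, account placeholder, and `./my_gpu_app` are hypothetical, and the `--mem-per-gpu=60G` value is an assumption: one eighth of the 480g figure mentioned earlier in this thread.

```shell
# Write a hypothetical single-GPU job script for a shared small-g node.
cat > single_gpu_small_g.sh <<'EOF'
#!/bin/bash
# Placeholder account: replace with your own project.
#SBATCH --account=project_XXXXXXXXX
#SBATCH --partition=small-g
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --gpus-per-node=1
# Assumed per-GPU share of node memory (480 GB / 8); adjust to the job's needs.
#SBATCH --mem-per-gpu=60G
#SBATCH --time=00:10:00

# Hypothetical application binary.
srun ./my_gpu_app
EOF

# Submit on LUMI with: sbatch single_gpu_small_g.sh
```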