Builds and runs, but currently stuck because Slurm kills the job for exceeding its memory limit:
[cuda-24-0.local:08642] 2 more processes have sent help message help-mpi-btl-openib.txt / no active ports found
[cuda-24-0.local:08642] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
slurmstepd: Job 1171978 exceeded memory limit (18202456 > 16384000), being killed
slurmstepd: JOB 1171978 ON cuda-24-0 CANCELLED AT 2018-10-16T13:24:27
slurmstepd: Exceeded step memory limit at some point.
Why is it using so much memory?
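To put a number on the overrun, here is a minimal sketch that pulls the two values out of the slurmstepd "exceeded memory limit" line quoted above and converts them. It assumes the logged values are in KiB, which the log itself does not state. The helper name parse_memory_overrun is made up for illustration.

```python
import re

# Matches the slurmstepd message from the log; the two numbers are (used > limit).
PATTERN = re.compile(r"Job (\d+) exceeded memory limit \((\d+) > (\d+)\)")


def parse_memory_overrun(line):
    """Return (job_id, used_mib, limit_mib), or None if the line does not match.

    Assumption: the logged values are in KiB, so dividing by 1024 gives MiB.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    job_id, used_kib, limit_kib = (int(group) for group in match.groups())
    return job_id, used_kib / 1024, limit_kib / 1024


line = ("slurmstepd: Job 1171978 exceeded memory limit "
        "(18202456 > 16384000), being killed")
job_id, used_mib, limit_mib = parse_memory_overrun(line)
over_mib = used_mib - limit_mib
over_pct = 100 * (used_mib / limit_mib - 1)
print(f"job {job_id}: used {used_mib:.0f} MiB, limit {limit_mib:.0f} MiB, "
      f"over by {over_mib:.0f} MiB ({over_pct:.1f}%)")
```

Under that unit assumption the limit comes out to exactly 16000 MiB and the job ran roughly 11% over it, so the overrun is modest rather than runaway.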