heronsystems / adeptRL

Reinforcement learning framework to accelerate research
GNU General Public License v3.0

Initial add of hearthstone #13

Closed SethKitchen closed 6 years ago

SethKitchen commented 6 years ago

Builds and runs, but the job is currently stuck on this error:

[cuda-24-0.local:08642] 2 more processes have sent help message help-mpi-btl-openib.txt / no active ports found
[cuda-24-0.local:08642] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
slurmstepd: Job 1171978 exceeded memory limit (18202456 > 16384000), being killed
slurmstepd: JOB 1171978 ON cuda-24-0 CANCELLED AT 2018-10-16T13:24:27
slurmstepd: Exceeded step memory limit at some point.

Why is it using so much memory?
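One way to start narrowing this down is to put the numbers from the log into readable units and to log the peak resident memory from inside the training process. The snippet below is only a rough sketch: it assumes the two figures in the slurmstepd message are in kilobytes, and the helper names `kib_to_gib` and `peak_rss_kib` are made up for illustration and are not part of adeptRL.

```python
import resource
import sys

KIB_PER_GIB = 1024 ** 2


def kib_to_gib(kib):
    """Convert a size in KiB to GiB."""
    return kib / KIB_PER_GIB


def peak_rss_kib():
    """Peak resident set size of the current process, in KiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux but in bytes on macOS.
    return peak // 1024 if sys.platform == "darwin" else peak


# Figures copied from the slurmstepd message (assumed to be in KB).
used_kib, limit_kib = 18202456, 16384000
print(f"limit: {kib_to_gib(limit_kib):.2f} GiB")
print(f"used:  {kib_to_gib(used_kib):.2f} GiB")
print(f"over:  {kib_to_gib(used_kib - limit_kib):.2f} GiB")

# Call this periodically (e.g. once per rollout) to see when memory grows.
print(f"peak RSS so far: {kib_to_gib(peak_rss_kib()):.3f} GiB")
```

Note that `RUSAGE_SELF` only covers the calling process, so with several MPI ranks each rank would need to log its own value, and the Slurm limit applies to their combined usage.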