Closed: PhDyellow closed this issue 2 years ago
Giving BOTH the master and the worker 200 GB of RAM successfully built `gfbootstrap_combined_tmp`.
The Internet suggests that `sacct -s r --format=ALL` might tell me how much memory was used at peak, but `MaxRSS` seems to be empty.
After killing the jobs with `scancel`, the `MaxRSS` field was populated. It seems the master needed 55 GB of memory, and the worker needed 85 GB.
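The behaviour above matches how Slurm accounting works: `MaxRSS` is only recorded once a job step finishes (including being killed). A hedged sketch of querying it for a completed job, plus converting the raw KB value that `sacct` reports into GB (the job ID is a placeholder):

```shell
# Query peak memory for a finished job; MaxRSS stays empty while the
# step is still running, which is why it only appeared after scancel.
# sacct -j <jobid> --format=JobID,MaxRSS,State --units=G

# sacct reports raw MaxRSS in KB (e.g. "57671680K"); convert to GB:
echo "57671680K" | awk '{ sub(/K$/, "", $1); printf "%.0fGB\n", $1 / 1024 / 1024 }'
# prints "55GB", i.e. roughly the master's observed peak
```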
I suspect the master's increased memory demands may have something to do with the cache files, but I don't know.
I do know that `gfbootstrap_combined_tmp` is one of the few targets that pulls in ALL the gfbootstrap objects at once.
I have an estimate of memory consumption now.
I am attempting to figure out how much memory is needed by `gfbootstrap_combined_tmp` in #8. After setting the worker memory to 200 GB and setting `clustermq.worker.timeout` to 7 days, the worker seems to be running fine. However, now the master is being killed for going over 20 GB.

I have checked that `_targets.R` uses `storage = "worker"` and `retrieval = "worker"`, so the master shouldn't be loading anything. I will try setting the master to use 200 GB anyway, and see if it works.
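For reference, a minimal sketch of the `_targets.R` settings being described, assuming a clustermq SLURM template whose `memory` field is in MB (the template filename and field name are assumptions, not taken from this repo):

```r
# Hypothetical sketch of the settings discussed above; actual field names
# depend on the clustermq SLURM template in use.
library(targets)

options(
  clustermq.scheduler      = "slurm",
  clustermq.template       = "slurm_clustermq.tmpl",  # assumed template file
  clustermq.worker.timeout = 7 * 24 * 60 * 60         # 7 days, in seconds
)

tar_option_set(
  storage   = "worker",  # workers save their own results
  retrieval = "worker",  # workers load their own dependencies,
                         # so the master should not load targets itself
  resources = tar_resources(
    clustermq = tar_resources_clustermq(
      template = list(memory = 200 * 1024)  # 200 GB worker memory, in MB
    )
  )
)
```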