Closed jpratmarti closed 1 year ago
You can use the `--mem` flag in CosmoSIS to help track this down. `--mem 2` would print out memory usage every 2 seconds.
One thing to make sure of is that `OMP_NUM_THREADS` is set. I could imagine some systems kicking you out for using too many processes at once, though I've never seen it in practice.
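If it helps as a cross-check alongside `--mem`, here is a minimal sketch (not part of CosmoSIS or firecrown) that runs a command with `OMP_NUM_THREADS` pinned and reports the peak resident memory of the child process. The helper name `run_with_thread_cap` and the `params.ini` file name are hypothetical, and I'm assuming a Unix system where Python's `resource` module is available.

```python
import os
import resource
import subprocess
import sys


def run_with_thread_cap(cmd, num_threads=1):
    """Run cmd with OMP_NUM_THREADS set; return (exit code, peak RSS).

    The peak comes from RUSAGE_CHILDREN, so it is the largest value over
    all child processes this Python process has waited for so far.
    Units are platform dependent: kilobytes on Linux, bytes on macOS.
    """
    # Copy the current environment and pin the OpenMP thread count.
    env = dict(os.environ, OMP_NUM_THREADS=str(num_threads))
    completed = subprocess.run(cmd, env=env)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak


# Hypothetical usage; "params.ini" stands in for your own ini file:
#   code, peak = run_with_thread_cap(["cosmosis", "--mem", "2", "params.ini"])
#   print(f"exit code {code}, peak RSS {peak}")
```

This only tells you the peak after the run has finished (or been killed), so the periodic output from `--mem` is still the better tool for seeing where the memory grows.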
@jpratmarti do you object to closing this issue? It seems to be related to use of CosmoSIS, rather than to firecrown itself.
I don't think this is a CosmoSIS issue directly - no one has reported this memory issue with other modules. But the `--mem` flag should help track this down for certain.
Closed for lack of feedback.
When running firecrown via CosmoSIS using just the `test` sampler, the `midway` computing system from the Chicago cluster sends the following message: Connection to [midway2.rcc.uchicago.edu](http://midway2.rcc.uchicago.edu/) closed by remote host.
We believe this happens because the run uses too much memory: we get kicked out unless we use an interactive node. We don't understand why it would need so much memory. This behaviour makes it impractical to run simple tests on the login nodes.