The memory footprint for the 64-node run was at least (mn/(dc)) * 64 = 268435456 (for a single matrix).
This is the same memory footprint as for c=8 on 32 nodes, which makes me think it's not running out of memory (although it is close).
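For reference, here is a minimal sketch of that footprint arithmetic, assuming the processor grid uses a tuning dimension d = P / c^2 with P = nodes * ppn (that grid interpretation is my assumption, not stated above):

```cpp
#include <cstdint>
#include <cstdio>

// Per-node element count for a single m-by-n matrix, assuming each process
// holds mn/(dc) elements and there are ppn processes per node.
// The relation d = P / c^2 (P = nodes * ppn) is an assumption.
std::int64_t perNodeElements(std::int64_t m, std::int64_t n, std::int64_t c,
                             std::int64_t nodes, std::int64_t ppn) {
  std::int64_t P = nodes * ppn;     // total MPI processes
  std::int64_t d = P / (c * c);     // assumed grid dimension
  return (m * n) / (d * c) * ppn;   // elements resident on one node
}

int main() {
  const std::int64_t m = 524288, n = 2048;
  // 64 nodes, c=16: (mn/(dc)) * 64 = 268435456 elements per node
  std::printf("64 nodes, c=16: %lld\n",
              (long long)perNodeElements(m, n, 16, 64, 64));
  // 32 nodes, c=8 gives the same per-node footprint
  std::printf("32 nodes, c=8:  %lld\n",
              (long long)perNodeElements(m, n, 8, 32, 64));
  return 0;
}
```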
Haven't I seen this before, where using more memory per node causes problems at higher node counts? I'm not really sure why that would make sense, though.
I'm almost certain this is the same exact bug as #23.
For the variant
m=524288, n=2048, c=16, ppn=64, tpr=1
on both 64 nodes and 128 nodes, it essentially hangs after a strange error.

On 64 nodes, I get the following error statements written to the .e file:
On 128 nodes, it just hangs with no error message.
The corresponding script files are the following:
For 64 nodes:
and for 128 nodes: