ludwig-cf / ludwig

A lattice Boltzmann code for complex fluids
https://ludwig.epcc.ed.ac.uk

Output error: Exceeded step memory limit at some point #277

Closed sumeshpt closed 2 months ago

sumeshpt commented 1 year ago

The attached input produces this error at the end of the run, though it seems to produce the desired output. Not sure whether it has consequences.

input.txt

kevinstratford commented 1 year ago

I suspect this is a failure to allocate memory at the point where the configuration output is written at the end of the run.

It's a bit unfortunate that you have to wait that long to find out. We should investigate a way to bring the problem forward to the start of the run.

kevinstratford commented 3 months ago

I will find a way to prevent this issue (or at least run out of memory at the start of the run).

kevinstratford commented 3 months ago

This is an out-of-memory (OOM) problem, and as such it cannot be fixed or handled. The only thing we can really do is try to fail early in the proceedings.

So I've added some code to do this by running the i/o aggregation step once at initialisation. This is not completely foolproof, but reasonable examples suggest it does bring the failure forward.
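For anyone reading later, the general idea can be sketched as below. This is only an illustration under stated assumptions, not the code that went into Ludwig: the names `aggr_buf_t`, `aggr_buf_create`, `aggr_buf_free` and `io_dry_run_at_init` are hypothetical, and the real aggregation step does more than allocate a buffer. The sketch allocates a buffer of the size output would need, writes to it (on systems that commit memory lazily, an allocation may not be charged until it is touched), and releases it again.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an i/o aggregation buffer (not Ludwig's type). */
typedef struct {
  size_t nbytes;   /* size of the buffer in bytes */
  char * buf;      /* storage for aggregated output data */
} aggr_buf_t;

/* Allocate and touch the buffer that output would need.
 * Returns 0 on success, -1 if the request overflows or malloc() fails. */
int aggr_buf_create(size_t nsites, size_t bytes_per_site, aggr_buf_t * a) {

  assert(a != NULL);

  a->nbytes = 0;
  a->buf = NULL;

  /* Reject a request whose size in bytes would overflow size_t */
  if (bytes_per_site != 0 && nsites > SIZE_MAX / bytes_per_site) return -1;

  /* Nothing to allocate: treat as success with an empty buffer */
  if (nsites == 0 || bytes_per_site == 0) return 0;

  a->buf = malloc(nsites * bytes_per_site);
  if (a->buf == NULL) return -1;
  a->nbytes = nsites * bytes_per_site;

  /* Write to every byte so the memory is actually claimed now */
  memset(a->buf, 0, a->nbytes);

  return 0;
}

/* Release the buffer and reset the structure. */
void aggr_buf_free(aggr_buf_t * a) {

  assert(a != NULL);

  free(a->buf);
  a->buf = NULL;
  a->nbytes = 0;
}

/* Dry run at initialisation: returns 0 if the output buffer could be
 * allocated now, -1 otherwise, so the caller can stop before time stepping. */
int io_dry_run_at_init(size_t nsites, size_t bytes_per_site) {

  aggr_buf_t a;

  if (aggr_buf_create(nsites, bytes_per_site, &a) != 0) return -1;
  aggr_buf_free(&a);

  return 0;
}
```

A caller would check the return value of `io_dry_run_at_init()` once during start-up and exit with an error message if it is non-zero. If the scheduler kills the job when the memory limit is exceeded, there is no return value to check, but the failure should still arrive at the start of the run rather than at the end.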

There is a risk that this extra memory requirement at the start induces failure when no output is actually wanted. However, I think we will accept that as the lesser evil.

This is #318

kevinstratford commented 2 months ago

That should be ok (famous last words)...