libMesh / libmesh

libMesh github repository
http://libmesh.github.io
GNU Lesser General Public License v2.1

libMesh::System::read_parallel_data expecting the same number of additional vectors in the file #1348

Open YaqiWang opened 7 years ago

YaqiWang commented 7 years ago

This check makes sense, because the file is typically generated by a calculation with the same equation system. But we now have a situation where the initial condition is set by the solution of a steady-state calculation that has the same number of primal variables. In this case, the file generated by the steady-state calculation may not contain additional vectors such as u_dot. When I try to restart the transient calculation from the steady-state checkpoint file, I get this error:

Additional vectors in file do not match system

The fix could be to simply read and set only those vectors that are present both in the file and in the system. Any comments?

permcody commented 7 years ago

Better question: are you saving any memory? (In reply to YaqiWang's post of Thu, May 18, 2017, 5:52 PM.)

YaqiWang commented 7 years ago

I have to run a steady-state calculation to set the initial condition for the transient. The steady-state calculation takes a lot of time, and I want to save its solution so that I do not have to rerun it every time I restart a transient calculation after slightly changing a transient parameter.

In an ideal world the steady-state calculation would require no additional vectors, but the transient requires old and older solutions and a bunch of auxiliary vectors like u_dot. I use my own iterative solver instead of GMRES, so yes, for me the memory cost on the steady-state solve is significant if I have to create those vectors just for restart purposes. You may argue that if I can run the transient, I should be able to run the steady state with the extra, unnecessary vectors. You may also argue that I could restart just the particular variable that requires the initial condition. On the first point, I simply do not want unnecessary memory usage: my problem is huge, and I do not want to tie up resources on clusters shared with others. On the second, the Exodus file format does not support restart well, and I am not sure whether it works with a distributed mesh.

roystgnr commented 7 years ago

Possible short-term fix: can you run the restart with READ_ADDITIONAL_DATA unset? Or are there additional vectors which you do need to have set in the restart?

Long-term, this is something we can make work even if you only need to read a subset of vectors from a restart, but I don't think we can make it work without an XDR file format increment. We're currently only writing vector names as comments, which IIRC just get thrown away when writing in binary rather than ASCII mode, and without either an identical set of vectors or at least a set of names to match up, we have no way of knowing which vector's data goes where.
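To illustrate why names matter, here is a minimal sketch of name-based matching, assuming the file format stored each vector's name alongside its data. This is not libMesh code: `read_matching_vectors`, `NamedVectors`, and the plain `std::vector<double>` storage are hypothetical stand-ins chosen only to show the idea.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins: a vector of values, and vectors keyed by name.
using Vector = std::vector<double>;
using NamedVectors = std::map<std::string, Vector>;

// Copy each file vector into the system vector with the same name.
// System vectors that have no counterpart in the file keep their current
// (here: zero) values; file vectors unknown to the system are ignored.
// Returns the number of vectors that were matched and copied.
std::size_t read_matching_vectors(const NamedVectors & file_vectors,
                                  NamedVectors & system_vectors)
{
  std::size_t n_matched = 0;
  for (auto & entry : system_vectors)
  {
    const auto it = file_vectors.find(entry.first);
    if (it == file_vectors.end())
      continue; // not in the file: leave this system vector untouched

    // A name match only makes sense if the sizes agree as well.
    assert(it->second.size() == entry.second.size());
    entry.second = it->second;
    ++n_matched;
  }
  return n_matched;
}
```

Without stored names, the only safe options are the ones described above: require an identical set of vectors, or skip the additional vectors entirely.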

YaqiWang commented 7 years ago

Oh, maybe not; only the solution of the primal variables is needed. The default values of those additional vectors are zero, right? If so, that is probably what I need. How do we unset READ_ADDITIONAL_DATA?

roystgnr commented 7 years ago

The default values of unread additional vectors are zero.

In libMesh, you'd just call EquationSystems::read() without setting READ_ADDITIONAL_DATA in the flags.
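As a minimal sketch of what "without setting READ_ADDITIONAL_DATA" means: the read flags form a bitmask, so the flag is simply left out of the OR. The enum below is a stand-in that mirrors the names in libMesh's EquationSystems::ReadFlags so that the snippet compiles on its own. Its numeric values, the helper function names, and the file name in the comment are assumptions; real code should use the constants from libMesh itself.

```cpp
// Stand-in for libMesh::EquationSystems::ReadFlags. The values are assumed
// for illustration only; use the library's own enum in real code.
enum ReadFlags : unsigned int
{
  READ_HEADER          = 1,
  READ_DATA            = 2,
  READ_ADDITIONAL_DATA = 4
};

// Flags for a full restart: header, solution, and additional vectors
// (old solutions, u_dot, ...).
constexpr unsigned int full_restart_flags()
{
  return READ_HEADER | READ_DATA | READ_ADDITIONAL_DATA;
}

// Flags for reading only the header and the primal solution.
// The additional vectors are not read from the file.
constexpr unsigned int solution_only_flags()
{
  return READ_HEADER | READ_DATA;
}

// True if the given flag set asks for additional vectors to be read.
constexpr bool reads_additional_data(unsigned int flags)
{
  return (flags & READ_ADDITIONAL_DATA) != 0;
}

// With libMesh, the call would then look roughly like this
// (hypothetical file name, 'es' being an EquationSystems object):
//   es.read("steady_checkpoint.xdr", libMesh::DECODE, solution_only_flags());
```

Since the flag is a bit in a mask, making it optional in a caller only requires building the mask conditionally instead of hard-coding it.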

In MOOSE... it looks like Resurrector.C is where the relevant call is, the flag is hard coded, and there doesn't yet seem to be an input file option to change that.

YaqiWang commented 7 years ago

I see. We can easily add a flag or input parameter in FEProblem. That would be an easy fix, and then we possibly do not need to change anything in libMesh. I will try it out in MOOSE.

YaqiWang commented 7 years ago

I am closing this because what @roystgnr suggested works. Thanks.

roystgnr commented 7 years ago

Thanks for the update! I'm going to reopen just to keep it in mind, though. Just because my workaround was successful for you doesn't mean it's going to be sufficient for everyone.