ReactionMechanismGenerator / RMG-Java

The Java version of RMG: Reaction Mechanism Generator
http://rmg.sourceforge.net/
MIT License

Reading of Restart files is slow #168

Open rwest opened 13 years ago

rwest commented 13 years ago

This was mentioned in the meeting today. Specifically, there was talk of assuming that all restart files are well-formed and have unique reactions, so that you don't need to check them all, which apparently is slow. (Or perhaps this process could be sped up some other way; I have not looked into it.)
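As one possible "other way", a uniqueness check does not have to compare every new reaction against every reaction already read. The sketch below is only an illustration, assuming Java 8 or later and assuming each reaction can be reduced to a canonical string key. The class and method names (`RestartDedupSketch`, `canonicalKey`, `uniqueKeys`) are hypothetical and are not taken from RMG-Java, and the key ignores details such as reaction direction and kinetics.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these names come from RMG-Java.
class RestartDedupSketch {

    // Build an order-independent key so that "A + B -> C" and "B + A -> C"
    // map to the same string. Reverse reactions are NOT merged here.
    static String canonicalKey(String[] reactants, String[] products) {
        String[] r = reactants.clone();
        String[] p = products.clone();
        Arrays.sort(r);
        Arrays.sort(p);
        return String.join("+", r) + "=>" + String.join("+", p);
    }

    // Keep the first occurrence of each key. Set.add returns false for a
    // key that is already present, so each check is O(1) on average
    // instead of a scan over everything read so far.
    static List<String> uniqueKeys(List<String> keys) {
        Set<String> seen = new HashSet<>();
        List<String> unique = new ArrayList<>();
        for (String key : keys) {
            if (seen.add(key)) {
                unique.add(key);
            }
        }
        return unique;
    }

    public static void main(String[] args) {
        List<String> parsed = Arrays.asList(
            canonicalKey(new String[] {"CH3", "H"}, new String[] {"CH4"}),
            canonicalKey(new String[] {"H", "CH3"}, new String[] {"CH4"}),
            canonicalKey(new String[] {"C2H6"}, new String[] {"CH3", "CH3"}));
        // The first two entries share a key, so only two keys survive.
        System.out.println(uniqueKeys(parsed));
    }
}
```

Whether this helps depends on where the time actually goes when reading restart files, which nobody in this thread has measured.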

ramanan commented 11 years ago

I ran into this issue when trying to dump and restart after the model had grown fairly large. Here are some observations.

The reactionModel object was around 15% of the total memory usage reported. When the total memory footprint was 800MB, reactionModel was ~120MB and the size of the restart files was 40MB. Binary I/O on 120MB would probably be much faster than formatted I/O on 40MB. Maybe there is an object/container smaller than 120MB that contains the necessary and sufficient data to be able to restart.
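For illustration, a binary dump of a small restart container could be sketched with standard Java serialization as below. This is a sketch under assumptions, not RMG-Java code: `RestartSnapshot` and `BinaryRestartSketch` are hypothetical names, and the two string lists stand in for whatever data would really be necessary and sufficient to restart.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

// Hypothetical container holding only what a restart would need.
class RestartSnapshot implements Serializable {
    // Pin the version so unrelated recompiles do not invalidate old dumps.
    private static final long serialVersionUID = 1L;

    final List<String> coreSpecies = new ArrayList<>();
    final List<String> edgeReactions = new ArrayList<>();
}

// Hypothetical helper; not part of RMG-Java.
class BinaryRestartSketch {

    // Write the whole object graph in Java's binary serialization format.
    static void write(RestartSnapshot snapshot, OutputStream target) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(target)) {
            out.writeObject(snapshot);
        }
    }

    // Read a snapshot back; fails if the stream does not hold a compatible class.
    static RestartSnapshot read(InputStream source) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(source)) {
            return (RestartSnapshot) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        RestartSnapshot snapshot = new RestartSnapshot();
        snapshot.coreSpecies.add("CH4");
        snapshot.edgeReactions.add("CH3+H=>CH4");

        // Round-trip through memory; a FileOutputStream/FileInputStream pair
        // wrapped in buffered streams would be used for a real dump.
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        write(snapshot, buffer);
        RestartSnapshot restored = read(new ByteArrayInputStream(buffer.toByteArray()));
        System.out.println(restored.coreSpecies + " " + restored.edgeReactions);
    }
}
```

A dump like this is not human-readable, and it is sensitive to later changes in the serialized classes, so any speed gain would have to be measured and weighed against those costs.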

nickvandewiele commented 11 years ago

Thanks for this; it is very interesting.

I agree that the Restart functionality could have been serialized in the first place. I believe it is actually done in RMG-Py through pickle now, for the reasons you mention.

On top of that, the "restart" code had to be updated each time new features were introduced in other parts of the code. I am thinking of the P-dep kinetics formats, for example.

One pro of "restart" is actually the human readability: you can easily inspect the edge without hacking into the code. I believe this is still the only way to do so.

Also, the fact that the format of the "restart" files closely resembles the format of seed mechs is a nice-to-have if the "restart" functionality suddenly does not behave as it should! :)

rwest commented 11 years ago

Binary restart sounds logical to me if it can be straightforwardly implemented (a bold assumption).

Personally, I seldom use the restart feature, because usually if a job dies there's a reason for it and I don't want to restart it the same way.

Besides Nick's points, another feature we may risk losing (if indeed it exists) is the ability to change the conditions or settings from one run to the next. I'm not sure if this is widely used (I think @mrharper may have used it, but according to [1] he used the Seed Mechanism feature instead - although as Nick says, these share a lot of code). So maybe this is not a problem.

[1] Harper, M. R.; Van Geem, K. M.; Pyl, S. P.; Marin, G. B.; Green, W. H. “Comprehensive reaction mechanism for n-butanol pyrolysis and combustion.” Combust. Flame 2011, 158, 16–41. doi:10.1016/j.combustflame.2010.06.002.