katylettuce / beast-mcmc

Automatically exported from code.google.com/p/beast-mcmc

Problems using Beast with blcr checkpointing #273

Open GoogleCodeExporter opened 9 years ago

GoogleCodeExporter commented 9 years ago
Would it be nice for BEAST to support this?
---------------------------------------------------------------------

Hello all,

we are running beast as one of the applications on our life
sciences cluster here at BYU and we've been trying to integrate job
checkpointing with blcr.  This works fairly well with beast, but we've
run into problems with the hsperfdata file in /tmp. For checkpointing
to work, all the program files need to remain where they were when
they were checkpointed.

There are a few problems with this file. First, it is deleted as soon
as the beast process dies, making it impossible to restart the job
unless you keep a copy of the file older than the last checkpoint.
Second, it is in the /tmp directory. We need it accessible across the
entire cluster, so we need to be able to specify a different location
for it. I thought the -working flag for beast would do this, but it
does not. Last of all, the actual hsperfdata file is named after the
PID, and the PID may change after the job is restarted.
Depending on how the application was programmed, this last issue may
or may not be a problem.

We've found this tool to be extremely useful, especially with the
addition of multi-threaded support. Please let me know if there are any
additional details you need to know about our problem. Thanks for all
your work.

Danny Challis
BYU LfSci Computer Support 

Original issue reported on code.google.com by dong.w.xie@gmail.com on 22 Dec 2009 at 9:30

GoogleCodeExporter commented 9 years ago
The hsperfdata file can be turned off by passing the following command-line
option to the JVM:

-XX:-UsePerfData

However, this file is used by the JVM for performance tuning, and turning it
off might hurt performance.

Original comment by ramb...@gmail.com on 22 Dec 2009 at 9:42
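
For a cluster job script, the suggestion above could look roughly like the following sketch. It only shows where the option goes: JVM options have to appear before -jar, otherwise they are handed to the application rather than the JVM. The helper name build_beast_command, the jar path lib/beast.jar, the input file analysis.xml, and the extra -Xmx2g option are all assumptions made for illustration; adjust them to match the local BEAST install.

```python
import shlex


def build_beast_command(jar_path, xml_path, extra_jvm_args=()):
    """Return the argv list for launching BEAST with hsperfdata disabled.

    Hypothetical helper: the jar and XML paths are supplied by the caller.
    """
    # JVM options must precede -jar; anything after the jar path is
    # passed to the application instead of the JVM.
    jvm_args = ["-XX:-UsePerfData", *extra_jvm_args]
    return ["java", *jvm_args, "-jar", str(jar_path), str(xml_path)]


if __name__ == "__main__":
    # Assumed paths and heap size, for illustration only.
    cmd = build_beast_command("lib/beast.jar", "analysis.xml", ["-Xmx2g"])
    # Print the command line so it can be pasted into a job script.
    print(shlex.join(cmd))
```

The list form can be passed directly to subprocess.run if the wrapper should launch the job itself rather than print the command.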

GoogleCodeExporter commented 9 years ago
That should work, thank you!

Original comment by dannyand...@gmail.com on 22 Dec 2009 at 10:43