Closed GoogleCodeExporter closed 9 years ago
Will see if having a single sqlite connection per pipeline_run /
pipeline_printout helps...
Original comment by bunbu...@gmail.com
on 23 Nov 2013 at 1:51
These are the latest changes to Ruffus which should hopefully fix this problem:
1) Only a single connection is made to the sqlite database file at a time (for
any run of the pipeline), so file-locking problems on network file systems such
as Lustre should be ameliorated.
2) The history file used for pipeline_run, pipeline_printout and
pipeline_printout_graph can be set to another location, e.g. on a local drive:
pipeline_run(..., history_file = "XXX", ...)
(Using only the temp drive sort of defeats the whole purpose of recording
which files have run successfully in the pipeline.)
3) The default history file location can be set in
ruffus.ruffus_utility.RUFFUS_HISTORY_FILE
4) The default history file location can be overridden by the environment
variable DEFAULT_RUFFUS_HISTORY_FILE
5) The default history file location can use path expansion to automatically
give each script its own independent history file. This is the safest and
easiest alternative to using a history file in the local directory.
So if the environment variable is:
export DEFAULT_RUFFUS_HISTORY_FILE=.{basename}.ruffus_history.sqlite
Then the job history database for "run.me.py" will be
".run.me.ruffus_history.sqlite"
All the scripts can be set to a single directory by using:
export DEFAULT_RUFFUS_HISTORY_FILE=/your/path/.{basename}.ruffus_hist.sqlite
If you are really paranoid about name clashes, you can use:
export DEFAULT_RUFFUS_HISTORY_FILE=/your/path/{path}/.{basename}.sqlite
In which case, the history file for "/test/bin/scripts/run.me.py" will be:
/your/path/test/bin/scripts/.run.me.sqlite
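To make the expansion rules above concrete, here is a rough Python sketch that reproduces the three examples. This is not the actual Ruffus code: the function names expand_history_file and resolve_history_file, the fallback name ".ruffus_history.sqlite", and the precedence order (explicit history_file first, then the environment variable, then the fallback) are all assumptions made for illustration.

```python
import posixpath

# Assumed fallback name, used only for this illustration.
FALLBACK_HISTORY_FILE = ".ruffus_history.sqlite"


def expand_history_file(template, script_path):
    """Fill {basename} and {path} in template from the script's path.

    Hypothetical helper: it mimics the behaviour described in the comment
    above, not the real Ruffus implementation.
    """
    script_dir, script_name = posixpath.split(script_path)
    # "run.me.py" -> "run.me": only the final extension is dropped.
    basename = posixpath.splitext(script_name)[0]
    # Drop the leading "/" so {path} nests under whatever precedes it.
    relative_dir = script_dir.lstrip("/")
    expanded = template.format(basename=basename, path=relative_dir)
    # Collapse doubled separators left behind by an empty {path}.
    return posixpath.normpath(expanded)


def resolve_history_file(script_path, explicit=None, environ=None):
    """Pick a history file location (assumed precedence order).

    1. an explicit history_file argument, returned unchanged
    2. the DEFAULT_RUFFUS_HISTORY_FILE environment variable, expanded
    3. the assumed fallback name
    """
    if explicit:
        return explicit
    environ = {} if environ is None else environ
    template = environ.get("DEFAULT_RUFFUS_HISTORY_FILE") or FALLBACK_HISTORY_FILE
    return expand_history_file(template, script_path)
```

Calling resolve_history_file("/test/bin/scripts/run.me.py", environ=os.environ) with the last export line above in effect would give the "/your/path/test/bin/scripts/.run.me.sqlite" location, under the assumptions stated here.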
Original comment by bunbu...@gmail.com
on 16 Dec 2013 at 6:33
Original issue reported on code.google.com by
tyler.fu...@gmail.com
on 22 Nov 2013 at 3:07