Closed - hjoliver closed this issue 11 years ago
I'll probably start by populating an sqlite file with:
I imagine it would be quite straightforward to populate the database - some INSERT and UPDATE statements where the system currently writes to the log. We can then think of some good ways to use the database.
Yeah, that looks good Matt.
The current planned implementation for this is as follows:
Each suite will have its own database in the form of an sqlite3 file containing the tables outlined by Matt above - though more tables could be added as required at a later date. The main interaction with the database will be handled by code added to the task class.
On initialising a task proxy, a row will be inserted into the task state table to record the task state. At each task event, a row will be inserted into the task event table. When the task state changes, the entry in the state table will be updated accordingly.
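As a rough sketch of that flow (class, table, and column names here are illustrative, not the actual cylc code or schema; in cylc each suite would use its own sqlite3 file rather than an in-memory db):

```python
import sqlite3
from datetime import datetime

# Illustrative schema: one row per task in task_states,
# one row per event in task_events.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_states (name TEXT, cycle TEXT, status TEXT)")
conn.execute("CREATE TABLE task_events (name TEXT, cycle TEXT, time TEXT, event TEXT)")

class TaskProxy(object):
    def __init__(self, name, cycle):
        self.name, self.cycle = name, cycle
        # On initialisation, insert a row recording the task state.
        conn.execute("INSERT INTO task_states VALUES (?, ?, ?)",
                     (name, cycle, "waiting"))

    def record_event(self, event):
        # Each task event gets its own row in the event table.
        conn.execute("INSERT INTO task_events VALUES (?, ?, ?, ?)",
                     (self.name, self.cycle,
                      datetime.utcnow().isoformat(), event))

    def set_state(self, status):
        # On a state change, update the existing state-table entry.
        conn.execute("UPDATE task_states SET status=? WHERE name=? AND cycle=?",
                     (status, self.name, self.cycle))

task = TaskProxy("foo", "2013010100")
task.record_event("submitted")
task.set_state("submitted")
```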
At present the focus is on implementing the database providing this alternate log of events. Once implemented and working Matt feels there should be plenty of uses for it. See e.g. Matt's comment in #25.
All - it seems to me the task event db should hold dynamic information: task state, messages, try number, Loadleveler job ID (or similar), and so on. The suggested db tables above contain a couple of static items from the suite definition too: submission method and task host. Matt and Andy, do you have a particular use in mind for these? I ask because, with reference to this comment: https://github.com/cylc/cylc/issues/108#issuecomment-10498633, I think taking task runtime properties out of the task proxies, to be loaded/computed (inheritance) for each task at the last second before job submission, would greatly reduce initial suite parsing time and cylc memory usage in large suites.
Broadcast settings will also have to be held in the db, to be reloaded on restarting the suite. Currently these are stored as an inline pickle string in the state dump file (ugh!)
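One way to replace the inline pickle would be to store each broadcast setting as its own row, with the value serialised as JSON (table and column names here are illustrative, not the actual cylc schema):

```python
import json
import sqlite3

# Sketch: persist broadcast settings as rows (value stored as JSON)
# instead of an inline pickle string in the state dump file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE broadcast_settings ("
    "  namespace TEXT, setting TEXT, value TEXT,"
    "  PRIMARY KEY (namespace, setting))")

def save_broadcast(namespace, setting, value):
    # INSERT OR REPLACE so re-broadcasting a setting overwrites it.
    conn.execute(
        "INSERT OR REPLACE INTO broadcast_settings VALUES (?, ?, ?)",
        (namespace, setting, json.dumps(value)))
    conn.commit()

def load_broadcasts():
    # On restart, reload all broadcast settings from the db.
    return {
        (ns, key): json.loads(val)
        for ns, key, val in conn.execute("SELECT * FROM broadcast_settings")
    }

save_broadcast("root", "environment.FOO", "bar")
```

Unlike a pickle blob, the rows stay human-readable and queryable with ordinary SQL.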
On task host: now that I think about it, we need to store USER@HOST instead of just the task host. The other thing is that the task host can be a shell string containing a command, e.g. "my_host_$(get_number)". The returned host name needs to be stored so that we can use the database to pull back output from the recorded host.
It makes sense to store broadcast settings, and the settings specified with `cylc run --set=X`, in the database as well.
Now that the basic db implementation from #198 is in the master, we can look at filling in the gaps in the database. Still to do:

`USER@HOST` - currently looking at this.

`submit_method` and `submit_method_id` (e.g. LoadLeveler and the LoadLeveler job ID) - this information is currently not available to the task proxy - @hjoliver any thoughts on this?

`submit_method` is available: `rtconfig['job submission']['method']` (also the module name for the "launcher" object used to submit the job). `submit_method_id` - yes, not currently available. It seems to me that support for reading this ID (e.g. get the command or scripting used to submit the job to return the ID?) may need to be part of the specific job submission modules (loadleveler, qsub, etc. - note we need to support methods other than loadleveler as well), and at least initially we should support the job ID not being available too - in case there are some job submission methods for which we cannot easily get the job ID, or at least don't currently know how to.
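One way the per-method modules could provide this (the regexes, method names, and example output lines below are illustrative assumptions, not cylc's actual job submission API): each module owns a filter for extracting the job ID from the submission command's output, and a missing or unmatched filter simply yields no ID.

```python
import re

# Illustrative per-method regexes for pulling the job ID out of the
# submission command's output. Each job submission module would own
# its own filter; None means no ID could be extracted, which must be
# tolerated, at least initially.
JOB_ID_FILTERS = {
    # e.g. llsubmit: 'llsubmit: The job "host.1234" has been submitted.'
    "loadleveler": re.compile(r'The job "(.+)" has been submitted'),
    # e.g. qsub prints the job ID on a line by itself: '1234.server'
    "pbs": re.compile(r"^(\d+\S*)$", re.M),
}

def extract_job_id(submit_method, submit_output):
    filt = JOB_ID_FILTERS.get(submit_method)
    if filt is None:
        return None  # unknown method: job ID not available
    match = filt.search(submit_output)
    return match.group(1) if match else None
```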
The `USER@HOST` functionality is now in, and `submit_method` is recorded. Moving the recording of broadcast settings and `submit_method_id` to separate issues.
@matthewrmshin - can you close this issue?
Most issues implemented in #198 and #257. Remaining issues moved to #261 and #262.
From @dpmatthews: