Open GoogleCodeExporter opened 9 years ago
Issue 27 has been merged into this issue.
Original comment by riccardo.murri@gmail.com
on 18 Nov 2010 at 1:06
lrms_jobid <str>
lrms_jobname <str> (not identical)
resource_name <str>
timestamp
stderr_filename
stdout_filename
gc3 sge src
queue => qname = queue
cores => sge_slots = arc_cpu_count
exitcode => exit_status/failed = exitcode/status
used_walltime(sec.) => ru_wallclock = used_walltime
used_cputime(sec.) => cpu = used_cputime
used_memory(kB) => maxvmem(B) = used_memory(kB)
SGE
- maxvmem => float with M at the end
- cpu => float (in sec.)
- wallclock => int in sec.
ARC
- used_memory => int KiB
- used_cputime => int sec.
- used_walltime => int sec.
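To make the unit mismatch above concrete, here is a minimal sketch of how the SGE values could be normalized to the ARC-style units (integer KiB and integer seconds). The function names are hypothetical, and the suffix table is an assumption (binary multiples for `K`/`M`/`G`, plain bytes when there is no suffix); check it against the local SGE accounting output before relying on it.

```python
def sge_maxvmem_to_kib(value):
    """Convert an SGE-style memory string (e.g. '12.5M') to whole KiB.

    Assumption: 'K', 'M', 'G' suffixes are binary multiples of a byte,
    and a value without suffix is a plain byte count.
    """
    multipliers = {'K': 1024, 'M': 1024 ** 2, 'G': 1024 ** 3}
    value = value.strip()
    suffix = value[-1].upper()
    if suffix in multipliers:
        nbytes = float(value[:-1]) * multipliers[suffix]
    else:
        nbytes = float(value)
    return int(round(nbytes / 1024))


def normalize_sge(record):
    """Map SGE accounting fields to the common names used in this issue."""
    return {
        'used_memory': sge_maxvmem_to_kib(record['maxvmem']),  # KiB
        'used_cputime': int(round(float(record['cpu']))),      # seconds
        'used_walltime': int(record['ru_wallclock']),          # seconds
    }
```

Rounding the CPU time to whole seconds loses the sub-second precision that SGE reports; keeping a float would be an equally valid choice.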
Original comment by sergio.m...@gmail.com
on 17 Dec 2010 at 6:05
Original comment by sergio.m...@gmail.com
on 12 Jan 2011 at 10:18
Original comment by riccardo.murri@gmail.com
on 9 Feb 2011 at 9:47
Original comment by riccardo.murri@gmail.com
on 9 Mar 2011 at 3:10
Original comment by riccardo.murri@gmail.com
on 1 Jul 2011 at 2:31
The ARC Grid-Manager collects this information from the executed jobs (via GNU
Time):
WallTime
KernelTime
UserTime
CPUUsage
MaxResidentMemory
AverageResidentMemory
AverageTotalMemory
AverageUnsharedMemory
AverageUnsharedStack
AverageSharedMemory
PageSize
MajorPageFaults
MinorPageFaults
Swaps
ForcedSwitches
WaitSwitches
Inputs
Outputs
SocketReceived
SocketSent
Signals
Since this is all available via GNU `time`, the `shellcmd` backend can
use it too, and we could probably gather the same information from
the SGE/PBS/LSF backends.
I suggest we take this list as a starting point; some values will not
be available on all platforms (e.g., `time` consistently reports memory
as "0" on Linux kernels < 2.6.32).
Original comment by riccardo.murri@gmail.com
on 20 Jun 2012 at 6:51
In arclib (ARC0), an arclib.Job is described by the following attributes:
['client_software', 'cluster', 'comment', 'completion_time', 'cpu_count',
'erase_time', 'errors', 'execution_nodes', 'exitcode', 'gmlog', 'id',
'job_name', 'mds_validfrom', 'mds_validto', 'owner', 'proxy_expire_time',
'queue', 'queue_rank', 'requested_cpu_time', 'requested_wall_time',
'rerunable', 'runtime_environments', 'sstderr', 'sstdin', 'sstdout', 'status',
'submission_time', 'submission_ui', 'used_cpu_time', 'used_memory',
'used_wall_time']
In ARC2 (I would skip ARC1), the arc.Job is described by the following
attributes:
['Cluster', 'ComputingManagerEndTime', 'ComputingManagerExitCode',
'ComputingManagerSubmissionTime', 'CreationTime', 'EndTime', 'ExecutionNode',
'ExitCode', 'InterfaceName', 'JobDescriptionDocument', 'JobID',
'LocalInputFiles', 'LocalOwner', 'LocalSubmissionTime', 'Name',
'OtherMessages', 'Owner', 'ProxyExpirationTime', 'Queue',
'RequestedApplicationEnvironment', 'RequestedSlots', 'RequestedTotalCPUTime',
'RequestedTotalWallTime', 'StartTime', 'State', 'StdErr', 'StdIn', 'StdOut',
'SubmissionClientName', 'SubmissionHost', 'SubmissionTime', 'UsedCPUType',
'UsedMainMemory', 'UsedOSFamily', 'UsedPlatform', 'UsedTotalCPUTime',
'UsedTotalWallTime', 'UserDomain', 'Validity', 'VirtualMachine',
'WaitingPosition', 'WorkingAreaEraseTime' ]
This is what we have at our disposal when updating an arc.Job object. What the
grid-manager collects for the usage records is something we cannot access (at
least to my knowledge).
Sergio :)
Original comment by sergio.m...@gmail.com
on 21 Jun 2012 at 10:45
| what the grid-manager collects for the usage records is something we cannot
| access (at least to my knowledge)
Well, there's the contents of the "diag" file in the ".arc" directory.
(Although that might be rightfully considered an implementation detail
and changed in the future.)
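For illustration, a minimal sketch of reading such a file, assuming it consists of plain `KEY=VALUE` lines and that time values carry a trailing `s`. Both are assumptions about an undocumented format, and the function names `parse_diag` and `seconds` are made up for this example.

```python
def parse_diag(text):
    """Parse KEY=VALUE lines into a dict of strings.

    Blank lines and lines without '=' are skipped; if a key is
    repeated, the last occurrence wins.
    """
    usage = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or '=' not in line:
            continue
        key, _, value = line.partition('=')
        usage[key.strip()] = value.strip()
    return usage


def seconds(value):
    """Convert a time value such as '12.5s' or '12.5' to float seconds."""
    return float(value.rstrip('s'))
```

Keeping the values as strings in `parse_diag` and converting per field keeps the parser tolerant of keys that a future ARC release might add or reformat.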
Original comment by riccardo.murri@gmail.com
on 21 Jun 2012 at 10:50
Original comment by riccardo.murri@gmail.com
on 10 Jul 2012 at 9:50
Original comment by riccardo.murri@gmail.com
on 17 Aug 2012 at 11:46
This issue was updated by revision r2767.
After a job reaches the `TERMINATED` state, the following
attributes are also set. The meaning and format of the attributes
are consistent across backends.
`execution.duration`
Time lapse from start to end of the job at the remote
execution site, as a `gc3libs.quantity.Duration`:class: value.
(This is also often referred to as the 'wall-clock time' or
`walltime`:term: of the job.)
`execution.max_used_memory`
Maximum amount of RAM used during job execution, represented
as a `gc3libs.quantity.Memory`:class: value.
`execution.used_cpu_time`
Total time (as a `gc3libs.quantity.Duration`:class: value) that the
processors have been actively executing the job's code.
Backends may set other attributes as well; the only convention is that
the name of the attribute starts with the (lowercased) backend name.
For instance, the PbsLrms backend sets attributes `pbs_queue`,
`pbs_end_time`, etc.
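The naming convention can be sketched as follows. This is not the code from r2767: `Execution` is a stand-in attribute holder, `set_usage` is a hypothetical helper, and plain numbers are used where the real attributes would hold `gc3libs.quantity` values.

```python
class Execution(object):
    """Stand-in attribute holder; NOT the real gc3libs class."""


# Attributes whose meaning and format are the same for every backend.
COMMON = ('duration', 'max_used_memory', 'used_cpu_time')


def set_usage(execution, backend_name, common, extra):
    """Set the common attributes, then the backend-specific ones.

    Backend-specific names are prefixed with the lowercased backend
    name, e.g. 'queue' becomes 'pbs_queue' for backend 'PBS'.
    """
    for name in COMMON:
        setattr(execution, name, common[name])
    prefix = backend_name.lower() + '_'
    for name, value in extra.items():
        setattr(execution, prefix + name, value)
    return execution


execution = set_usage(
    Execution(), 'PBS',
    common={'duration': 120, 'max_used_memory': 2048, 'used_cpu_time': 95},
    extra={'queue': 'short', 'end_time': 1348222920},
)
```

After the call, `execution.pbs_queue` and `execution.pbs_end_time` exist alongside the three common attributes, so code that only needs portable data can ignore anything with a backend prefix.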
The triple (submission time, actual start time, end time) would be a
very useful addition to this set of common attributes, but not all
backends provide this information. (PBS, SGE and ARC1 do; ARC0 does
not have the "actual start time"). What would be a good way to handle
this?
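One possible option, sketched here only as a starting point for discussion: always define all three attributes and use `None` for values a backend cannot provide, so that callers test for `None` instead of catching `AttributeError`. The class and method names below are hypothetical.

```python
from datetime import datetime


class JobTimes(object):
    """Submission, start and end timestamps; None means 'not provided'."""

    def __init__(self, submitted=None, started=None, ended=None):
        self.submitted = submitted
        self.started = started
        self.ended = ended

    def queue_wait(self):
        """Time spent waiting in the queue, or None if it cannot be computed."""
        if self.submitted is None or self.started is None:
            return None
        return self.started - self.submitted

    def total(self):
        """Time from submission to end, or None if it cannot be computed."""
        if self.submitted is None or self.ended is None:
            return None
        return self.ended - self.submitted


# An ARC0-like case: no "actual start time" is available.
arc0_times = JobTimes(submitted=datetime(2012, 9, 21, 10, 0),
                      ended=datetime(2012, 9, 21, 10, 30))
```

With this layout `arc0_times.queue_wait()` returns `None` while `arc0_times.total()` is still usable. The trade-off is that every consumer has to handle `None` explicitly.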
Original comment by riccardo.murri@gmail.com
on 21 Sep 2012 at 10:22
Original issue reported on code.google.com by
sergio.m...@gmail.com
on 17 Nov 2010 at 1:32