Closed simleo closed 8 months ago
https://ogf.org/documents/GFD.204.pdf has many good keys we can try to reuse. It's a bit outdated as it's pre-cloud.
https://slides.com/farahzkhan/2018-01-15-interoperable-provenance/fullscreen#/12/1 has some earlier ideas from CWL that use GFD.204, but that would put it in a different XML file (unless we re-use the namespace for propertyId):
<?xml version="1.0" encoding="UTF-8"?>
<ur:UsageRecord xmlns="http://schema.ogf.org/urf/2013/04/urf"
xmlns:ur="http://schema.ogf.org/urf/2013/04/urf" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://schema.ogf.org/urf/2013/04/urf">
<ur:RecordIdentityBlock>
<ur:RecordId>urn:uuid:4350d583-61a5-45e8-a229-957aa81e8014</ur:RecordId>
<ur:CreateTime>2018-05-09T09:06:52Z</ur:CreateTime>
<ur:Site>EMBL-EBI</ur:Site>
<ur:Infrastructure>Embassy</ur:Infrastructure>
</ur:RecordIdentityBlock>
<ur:SubjectIdentityBlock>
<ur:LocalUserId>stain</ur:LocalUserId>
<ur:LocalGroupId>ELIXIRCWLImplStudy</ur:LocalGroupId>
<ur:GlobalUserId>https://orcid.org/0000-0001-9842-9718</ur:GlobalUserId>
</ur:SubjectIdentityBlock>
<ur:ComputeUsageBlock>
<ur:CpuDuration>PT3600S</ur:CpuDuration>
<ur:WallDuration>PT3600S</ur:WallDuration>
<ur:StartTime>2018-05-31T11:00:00</ur:StartTime>
<ur:EndTime>2018-05-31T12:00:00</ur:EndTime>
<ur:ExecutionHost>
<ur:Hostname>compute-0-1.example.com</ur:Hostname>
<ur:ProcessId>1042</ur:ProcessId>
<ur:Benchmark ur:type="si2k">3.14</ur:Benchmark>
</ur:ExecutionHost>
<ur:Processors>4</ur:Processors>
<ur:NodeCount>1</ur:NodeCount>
</ur:ComputeUsageBlock>
<ur:JobUsageBlock>
<ur:GlobalJobId>host.example.org/ab1234</ur:GlobalJobId>
<ur:LocalJobId>ab1234</ur:LocalJobId>
<ur:JobName>MetaGenomics1337</ur:JobName>
<ur:Queue ur:description="execution">"Bigmem"</ur:Queue>
<ur:TimeInstant ur:type="Ctime">2018-05-31T10:30:00</ur:TimeInstant>
<ur:TimeInstant ur:type="Qtime">2018-05-31T10:31:00</ur:TimeInstant>
<ur:TimeInstant ur:type="Etime">2018-05-31T10:59:42</ur:TimeInstant>
</ur:JobUsageBlock>
<ur:MemoryUsageBlock>
<ur:MemoryClass>"RAM"</ur:MemoryClass>
<ur:MemoryResourceCapacityUsed>14728</ur:MemoryResourceCapacityUsed>
<ur:MemoryResourceCapacityAllocated>56437</ur:MemoryResourceCapacityAllocated>
<ur:MemoryResourceCapacityRequested>42000</ur:MemoryResourceCapacityRequested>
<ur:StartTime>2018-05-31T11:00:00</ur:StartTime>
<ur:EndTime>2018-05-31T12:00:00</ur:EndTime>
</ur:MemoryUsageBlock>
<ur:StorageUsageBlock>
<ur:StorageShare>pool-003</ur:StorageShare>
<ur:StorageMedia>disk</ur:StorageMedia>
<ur:StorageClass>replicated</ur:StorageClass>
<ur:DirectoryPath>/projectA</ur:DirectoryPath>
<ur:FileCount>42</ur:FileCount>
<ur:StorageResourceCapacityUsed>14728</ur:StorageResourceCapacityUsed>
<ur:StorageLogicalCapacityUsed>13617</ur:StorageLogicalCapacityUsed>
<ur:StorageResourceCapacityAllocated>14624</ur:StorageResourceCapacityAllocated>
<ur:StartTime>2018-05-07T09:31:40Z</ur:StartTime>
<ur:EndTime>2018-05-08T09:29:42Z</ur:EndTime>
<ur:Host>host.example.org</ur:Host>
</ur:StorageUsageBlock>
<ur:CloudUsageBlock>
<ur:LocalVirtualMachineId>ab1234</ur:LocalVirtualMachineId>
<ur:GlobalVirtualMachineId>host.example.org/ab1234/2018-05-09T09:06:52Z</ur:GlobalVirtualMachineId>
<ur:Status>started</ur:Status>
<ur:SuspendDuration>PT3600S</ur:SuspendDuration>
<ur:ImageId>UbuntuImage2013</ur:ImageId>
<ur:MachineName>cloud.example.org</ur:MachineName>
<ur:SubmitHost>cloud-name=cloud.example.org,Mds-Vo-name=local,o=cloud</ur:SubmitHost>
<ur:TimeInstant ur:type="Ctime">2018-05-31T10:30:00</ur:TimeInstant>
<ur:TimeInstant ur:type="Qtime">2018-05-31T10:31:00</ur:TimeInstant>
<ur:TimeInstant ur:type="Etime">2018-05-31T10:59:42</ur:TimeInstant>
<ur:ServiceLevel>Premium</ur:ServiceLevel>
</ur:CloudUsageBlock>
<ur:NetworkUsageBlock>
<ur:NetworkClass ur:NetworkResourceBandwidth="100000000">"Ethernet"</ur:NetworkClass>
<ur:NetworkInboundUsed ur:SourceAddress="192.168.1.12">14728</ur:NetworkInboundUsed>
<ur:NetworkOutboundUsed ur:DestinationAddress="192.168.1.21">14728</ur:NetworkOutboundUsed>
</ur:NetworkUsageBlock>
</ur:UsageRecord>
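To make the "re-use the namespace for propertyId" idea a bit more concrete, here is a rough Python sketch that turns the leaf elements of a usage record into `PropertyValue`-style entities. It is only an illustration under several assumptions: the key is spelled `propertyID` as in schema.org, the identifier is built by joining the URF namespace and the element name with `#`, and the `@id` scheme and the `leaf_properties` helper are hypothetical. None of this is taken from the profile or from GFD.204 itself.

```python
import json
import xml.etree.ElementTree as ET

URF_NS = "http://schema.ogf.org/urf/2013/04/urf"

# A trimmed-down, well-formed excerpt of the usage record above.
RECORD = """<ur:UsageRecord xmlns:ur="http://schema.ogf.org/urf/2013/04/urf">
  <ur:ComputeUsageBlock>
    <ur:CpuDuration>PT3600S</ur:CpuDuration>
    <ur:WallDuration>PT3600S</ur:WallDuration>
    <ur:Processors>4</ur:Processors>
  </ur:ComputeUsageBlock>
  <ur:MemoryUsageBlock>
    <ur:MemoryResourceCapacityUsed>14728</ur:MemoryResourceCapacityUsed>
  </ur:MemoryUsageBlock>
</ur:UsageRecord>
"""


def leaf_properties(xml_text):
    """Map each non-empty leaf element in the URF namespace to a dict.

    The propertyID format (namespace + '#' + element name) and the '@id'
    scheme are assumptions made for this sketch only.
    """
    root = ET.fromstring(xml_text)
    prefix = "{" + URF_NS + "}"
    props = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        # Keep only leaf elements (no children) that carry a value.
        if len(elem) == 0 and text and elem.tag.startswith(prefix):
            local = elem.tag[len(prefix):]
            props.append({
                "@id": "#" + local,  # hypothetical identifier scheme
                "@type": "PropertyValue",
                "name": local,
                "propertyID": URF_NS + "#" + local,  # assumed URI form
                "value": text,
            })
    return props


if __name__ == "__main__":
    print(json.dumps(leaf_properties(RECORD), indent=2))
```

Values are kept as strings here; whether durations and capacities should be typed or carry units is left open.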
This relates to #32. Can we merge this and then try to formalize it for the profile description? Does this belong only in the Provenance profile?
It actually relates to #10 (retrospective provenance). I'll merge this since it has your (implicit) approval. Regarding which profile it belongs to, I would say Process and Provenance (individual tool runs) since it's clearer how to associate such parameters to single process runs. For workflow runs as a whole it's less straightforward: should mean values, or sums, or something else be reported?
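To illustrate why the workflow-level case is less clear-cut, here is a small Python sketch that summarizes per-process values in a few different ways. The `process_runs` values are hypothetical (not from any real run), the `seconds` and `summarize` helpers are made up for this example, and only the simple `PT<n>S` duration form from the record above is handled.

```python
import re
from statistics import mean


def seconds(duration):
    """Parse the simple ISO 8601 duration form 'PT<n>S' into seconds."""
    match = re.fullmatch(r"PT(\d+)S", duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    return int(match.group(1))


# Hypothetical per-process values, for illustration only.
process_runs = [
    {"CpuDuration": "PT3600S", "MemoryResourceCapacityUsed": 14728},
    {"CpuDuration": "PT1200S", "MemoryResourceCapacityUsed": 20000},
    {"CpuDuration": "PT600S", "MemoryResourceCapacityUsed": 9000},
]


def summarize(runs):
    """Compute several candidate workflow-level summaries side by side."""
    cpu = [seconds(r["CpuDuration"]) for r in runs]
    mem = [r["MemoryResourceCapacityUsed"] for r in runs]
    return {
        "cpu_seconds_sum": sum(cpu),
        "cpu_seconds_mean": mean(cpu),
        "memory_used_max": max(mem),
        "memory_used_mean": mean(mem),
    }


if __name__ == "__main__":
    print(summarize(process_runs))
```

A sum seems natural for CPU time, while for memory a peak or a mean may be more meaningful, so a single aggregation rule probably does not fit every key.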
Try a representation of resource usage in a Workflow Run RO-Crate. See the README.md.