ResearchObject / workflow-run-crate

Workflow Run RO-Crate profile
https://www.researchobject.org/workflow-run-crate/
Apache License 2.0

Add ML predict pipeline crate obtained with cwltool and runcrate #38

Closed: simleo closed this 1 year ago

simleo commented 1 year ago

@GlassOfWhiskey I've compared the metadata with that of the StreamFlow crate using consume_crate.py. Here are the outputs:

StreamFlow crate:

action #2ac884e6-3763-4893-956e-53446acb22d9
  instrument: predictions.cwl (['SoftwareSourceCode', 'ComputationalWorkflow', 'HowTo', 'File'])
  started: 2022-11-26T22:37:00.579914+00:00
  ended: 2022-11-26T22:37:08.757424+00:00
  inputs:
    predictions.cwl#gpu: 
    predictions.cwl#slide: #19168124-bb17-4926-be46-2b53ee4f28ec
    predictions.cwl#tissue-high-batch-size: 
    predictions.cwl#tissue-high-chunk-size: 
    predictions.cwl#tissue-high-filter: tissue_low>0.9
    predictions.cwl#tissue-high-label: tissue_high
    predictions.cwl#tissue-high-level: 4
    predictions.cwl#tissue-low-batch-size: 
    predictions.cwl#tissue-low-chunk-size: 
    predictions.cwl#tissue-low-label: tissue_low
    predictions.cwl#tissue-low-level: 9
    predictions.cwl#tumor-batch-size: 
    predictions.cwl#tumor-chunk-size: 
    predictions.cwl#tumor-filter: tissue_low>0.99
    predictions.cwl#tumor-label: tumor
    predictions.cwl#tumor-level: 1
    ???: f62aa607a75508ac5fc6a22e9c0e39ef58a2c852
  outputs:
    predictions.cwl#tissue: 
    predictions.cwl#tumor: 

action #99b9f647-d1eb-4ef1-b6b5-6d9ee1f585dc
  step: predictions.cwl#extract-tissue-low
  instrument: extract_tissue.cwl (SoftwareApplication)
  started: 2022-11-26T22:37:02.353244+00:00
  ended: 2022-11-26T22:37:03.413674+00:00
  inputs:
    extract_tissue.cwl#batch-size: 
    extract_tissue.cwl#chunk-size: 
    extract_tissue.cwl#filter: tissue_low>0.9
    extract_tissue.cwl#filter_slide: a9805401269ade2a88b56d0e9a0b389041bebdff
    extract_tissue.cwl#gpu: 
    extract_tissue.cwl#label: tissue_high
    extract_tissue.cwl#level: 4
    extract_tissue.cwl#src: #1c5046dc-ce75-4e06-9153-a72147d5ef1f
  outputs:
    extract_tissue.cwl#tissue: 741c6947917667777aafae95c3252d2a20bd88f5

action #3e9e1d9e-fba5-4ebe-970c-2cc140b41afe
  step: predictions.cwl#extract-tissue-high
  instrument: extract_tissue.cwl (SoftwareApplication)
  started: 2022-11-26T22:37:04.834306+00:00
  ended: 2022-11-26T22:37:06.353684+00:00
  inputs:
    extract_tissue.cwl#batch-size: 
    extract_tissue.cwl#chunk-size: 
    extract_tissue.cwl#filter: tissue_low>0.9
    extract_tissue.cwl#filter_slide: a9805401269ade2a88b56d0e9a0b389041bebdff
    extract_tissue.cwl#gpu: 
    extract_tissue.cwl#label: tissue_high
    extract_tissue.cwl#level: 4
    extract_tissue.cwl#src: #1c5046dc-ce75-4e06-9153-a72147d5ef1f
  outputs:
    extract_tissue.cwl#tissue: 741c6947917667777aafae95c3252d2a20bd88f5

action #5b00fedf-f183-44f4-9355-70a7c1df8e96
  step: predictions.cwl#classify-tumor
  instrument: classify_tumor.cwl (['SoftwareApplication', 'File'])
  started: 2022-11-26T22:37:07.333229+00:00
  ended: 2022-11-26T22:37:07.925294+00:00
  inputs:
    classify_tumor.cwl#batch-size: 
    classify_tumor.cwl#chunk-size: 
    classify_tumor.cwl#filter: tissue_low>0.99
    classify_tumor.cwl#filter_slide: a9805401269ade2a88b56d0e9a0b389041bebdff
    classify_tumor.cwl#gpu: 
    classify_tumor.cwl#label: tumor
    classify_tumor.cwl#level: 1
    classify_tumor.cwl#src: #19168124-bb17-4926-be46-2b53ee4f28ec
  outputs:
    classify_tumor.cwl#tumor: e75ee6d0751bb2dbc7a894af0fe307e4fe020a99

cwltool + runcrate crate:

action #e8a7d915-6450-458d-aef6-a15ff3b9e5bf
  instrument: packed.cwl (['File', 'SoftwareSourceCode', 'ComputationalWorkflow', 'HowTo'])
  started: 2022-12-07T14:56:06.279392
  ended: 2022-12-07T14:56:22.019166
  inputs:
    packed.cwl#main/gpu: 
    packed.cwl#main/slide: f62aa607a75508ac5fc6a22e9c0e39ef58a2c852
    packed.cwl#main/tissue-high-batch-size: 
    packed.cwl#main/tissue-high-chunk-size: 
    packed.cwl#main/tissue-high-filter: tissue_low>0.9
    packed.cwl#main/tissue-high-label: tissue_high
    packed.cwl#main/tissue-high-level: 4
    packed.cwl#main/tissue-low-batch-size: 
    packed.cwl#main/tissue-low-chunk-size: 
    packed.cwl#main/tissue-low-label: tissue_low
    packed.cwl#main/tissue-low-level: 9
    packed.cwl#main/tumor-batch-size: 
    packed.cwl#main/tumor-chunk-size: 
    packed.cwl#main/tumor-filter: tissue_low>0.99
    packed.cwl#main/tumor-label: tumor
    packed.cwl#main/tumor-level: 1
  outputs:
    packed.cwl#main/tissue: 303fde451481d0f69e65c9ddca50e950a40f9e30
    packed.cwl#main/tumor: f867a592e3b07d4c08b846de6444b8904974a485

action #53911704-594f-4978-b9b6-018d40408843
  step: packed.cwl#main/extract-tissue-low
  instrument: packed.cwl#extract_tissue.cwl (SoftwareApplication)
  started: 2022-12-07T14:56:07.415465
  ended: 2022-12-07T14:56:09.333476
  inputs:
    packed.cwl#extract_tissue.cwl/batch-size: 
    packed.cwl#extract_tissue.cwl/chunk-size: 
    packed.cwl#extract_tissue.cwl/filter: 
    packed.cwl#extract_tissue.cwl/filter_slide: 
    packed.cwl#extract_tissue.cwl/gpu: 
    packed.cwl#extract_tissue.cwl/label: tissue_low
    packed.cwl#extract_tissue.cwl/level: 9
    packed.cwl#extract_tissue.cwl/src: #a53d39e7-cbb3-49c3-8e0b-453154815aae
  outputs:
    packed.cwl#extract_tissue.cwl/tissue: 24cebd1a6599c0e21ea98c28aabafe4ef5b2f345

action #dda41bd9-9d18-457a-b301-369430fec72f
  step: packed.cwl#main/extract-tissue-high
  instrument: packed.cwl#extract_tissue.cwl (SoftwareApplication)
  started: 2022-12-07T14:56:09.340412
  ended: 2022-12-07T14:56:10.908178
  inputs:
    packed.cwl#extract_tissue.cwl/batch-size: 
    packed.cwl#extract_tissue.cwl/chunk-size: 
    packed.cwl#extract_tissue.cwl/filter: tissue_low>0.9
    packed.cwl#extract_tissue.cwl/filter_slide: 24cebd1a6599c0e21ea98c28aabafe4ef5b2f345
    packed.cwl#extract_tissue.cwl/gpu: 
    packed.cwl#extract_tissue.cwl/label: tissue_high
    packed.cwl#extract_tissue.cwl/level: 4
    packed.cwl#extract_tissue.cwl/src: #a53d39e7-cbb3-49c3-8e0b-453154815aae
  outputs:
    packed.cwl#extract_tissue.cwl/tissue: 303fde451481d0f69e65c9ddca50e950a40f9e30

action #aa2aaacf-f2a1-4276-b506-7581bbd311c9
  step: packed.cwl#main/classify-tumor
  instrument: packed.cwl#classify_tumor.cwl (SoftwareApplication)
  started: 2022-12-07T14:56:10.923463
  ended: 2022-12-07T14:56:22.009273
  inputs:
    packed.cwl#classify_tumor.cwl/batch-size: 
    packed.cwl#classify_tumor.cwl/chunk-size: 
    packed.cwl#classify_tumor.cwl/filter: tissue_low>0.99
    packed.cwl#classify_tumor.cwl/filter_slide: 24cebd1a6599c0e21ea98c28aabafe4ef5b2f345
    packed.cwl#classify_tumor.cwl/gpu: 
    packed.cwl#classify_tumor.cwl/label: tumor
    packed.cwl#classify_tumor.cwl/level: 1
    packed.cwl#classify_tumor.cwl/src: #a53d39e7-cbb3-49c3-8e0b-453154815aae
  outputs:
    packed.cwl#classify_tumor.cwl/tumor: f867a592e3b07d4c08b846de6444b8904974a485
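
For readers who want to produce a similar listing, here is a minimal sketch of how such a summary could be derived from a crate's metadata with plain JSON handling. It is not the actual consume_crate.py: the function names are hypothetical, the tiny in-memory graph is illustrative (its values are borrowed from the output above), and it assumes actions are CreateAction entities whose object/result items point to their parameter via exampleOfWork.

```python
def as_list(value):
    """Normalize a JSON-LD value that may be missing, single, or a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def describe_params(refs, entities):
    """Map parameter id -> value (or the item's @id when it has no value)."""
    described = {}
    for ref in as_list(refs):
        item = entities.get(ref["@id"], {})
        params = as_list(item.get("exampleOfWork"))
        # "???" marks an item that is not linked to any parameter.
        param_id = params[0]["@id"] if params else "???"
        described[param_id] = item.get("value", ref["@id"])
    return described


def summarize_actions(metadata):
    """Return one summary dict per CreateAction found in the graph."""
    entities = {e["@id"]: e for e in metadata["@graph"]}
    summaries = []
    for entity in metadata["@graph"]:
        if "CreateAction" not in as_list(entity.get("@type")):
            continue
        instruments = as_list(entity.get("instrument"))
        summaries.append({
            "action": entity["@id"],
            "instrument": instruments[0]["@id"] if instruments else None,
            "started": entity.get("startTime"),
            "ended": entity.get("endTime"),
            "inputs": describe_params(entity.get("object"), entities),
            "outputs": describe_params(entity.get("result"), entities),
        })
    return summaries


# Illustrative graph only; a real crate would be loaded from
# ro-crate-metadata.json with json.load().
EXAMPLE = {
    "@graph": [
        {
            "@id": "#e8a7d915-6450-458d-aef6-a15ff3b9e5bf",
            "@type": "CreateAction",
            "instrument": {"@id": "packed.cwl"},
            "startTime": "2022-12-07T14:56:06.279392",
            "endTime": "2022-12-07T14:56:22.019166",
            "object": [{"@id": "#pv-tumor-label"}],
            "result": [{"@id": "f867a592e3b07d4c08b846de6444b8904974a485"}],
        },
        {
            "@id": "#pv-tumor-label",
            "@type": "PropertyValue",
            "value": "tumor",
            "exampleOfWork": {"@id": "packed.cwl#main/tumor-label"},
        },
        {
            "@id": "f867a592e3b07d4c08b846de6444b8904974a485",
            "@type": "File",
            "exampleOfWork": {"@id": "packed.cwl#main/tumor"},
        },
    ]
}

summaries = summarize_actions(EXAMPLE)
for summary in summaries:
    print(summary)
```

Returning plain dicts rather than printing directly makes it easy to diff the summaries of two crates, which is the kind of comparison done in this thread.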

Here are the issues that I've spotted:

StreamFlow crate:

cwltool + runcrate crate:

Note that I've recently moved the actual input data to docs/examples/data/Mirax2-Fluorescence-2 and now all crates / RO bundles have symlinks to those files (they're pretty big, so we don't want to duplicate them). docs/examples/data/link_mirax.py can help replace the actual files with the symlinks.
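
The idea of swapping duplicated data files for symlinks can be sketched roughly as follows. This is not the actual link_mirax.py: the function name and the match-by-file-name-then-content rule are assumptions made for illustration, and the sketch assumes file names are unique within the shared data directory.

```python
import filecmp
import os
from pathlib import Path


def link_duplicates(crate_dir, data_dir):
    """Replace files in crate_dir that duplicate a file in data_dir
    with relative symlinks. Returns the list of replaced paths."""
    crate_dir, data_dir = Path(crate_dir), Path(data_dir)
    # Index the shared data files by name (assumes names are unique).
    originals = {p.name: p for p in data_dir.rglob("*") if p.is_file()}
    replaced = []
    for path in sorted(crate_dir.rglob("*")):
        if path.is_symlink() or not path.is_file():
            continue
        target = originals.get(path.name)
        if target is None:
            continue
        # Byte-by-byte comparison: slow for big files, but avoids
        # linking to a file that merely shares the same name.
        if not filecmp.cmp(path, target, shallow=False):
            continue
        relative_target = os.path.relpath(target, start=path.parent)
        path.unlink()
        path.symlink_to(relative_target)
        replaced.append(path)
    return replaced
```

Relative links keep the repository relocatable: the symlinks still resolve after the checkout is moved or cloned elsewhere, as long as the layout under the repository root is unchanged.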

simleo commented 1 year ago

Merging this; we can add the alternateNames in a future update.