ImagingDataCommons / CloudSegmentator

Medical imaging segmentation workflows for FireCloud (Terra) and Seven Bridges Cancer Genomics Cloud
Apache License 2.0

reconfigure cwl files and use yaml manifests for sbcgc #51

Closed vkt1414 closed 7 months ago

vkt1414 commented 7 months ago

The CWL files are edited to work with the new workflow, i.e., starting from a list of SeriesInstanceUIDs. However, since sbcgc cannot be multiplexed over anything other than files or file metadata, YAML files containing lists of SeriesInstanceUIDs are passed instead. Accordingly, the preprocessing notebook is updated to generate YAML sample manifests for sbcgc, and the sample manifest has also been updated.
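
To illustrate the manifest idea, here is a minimal sketch of splitting a list of SeriesInstanceUIDs into per-batch YAML files. It is not the code from the preprocessing notebook: the function name `write_yaml_manifests`, the `seriesInstanceUIDs` key, the batch size, and the `batch_N.yaml` naming are all assumptions made for this example.

```python
from pathlib import Path
from typing import Iterable, List


def write_yaml_manifests(series_uids: Iterable[str], out_dir: str,
                         batch_size: int = 2) -> List[Path]:
    """Write one YAML file per batch of SeriesInstanceUIDs.

    The file layout (a single `seriesInstanceUIDs` list) is hypothetical
    and only meant to show the shape of a per-batch manifest.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be >= 1")
    uids = list(series_uids)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = []
    for start in range(0, len(uids), batch_size):
        batch = uids[start:start + batch_size]
        path = out / f"batch_{start // batch_size + 1}.yaml"
        # UIDs are quoted so a YAML parser reads them as strings, not numbers.
        lines = ["seriesInstanceUIDs:"] + [f'  - "{uid}"' for uid in batch]
        path.write_text("\n".join(lines) + "\n")
        paths.append(path)
    return paths
```

Each resulting file can then be supplied as one input file, so the platform can fan out over files rather than over raw strings.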

I had been using native gsutil to copy extracted files to GCP buckets, but decided to give s5cmd a try with HMAC credentials, and at least one of the two steps is faster now.
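
For reference, a rough sketch of how an s5cmd upload to a GCS bucket can be assembled with HMAC credentials. This is not the notebook's actual code: the helper name `build_s5cmd_copy` and the bucket, prefix, and path values are placeholders. It assumes s5cmd is installed, that the HMAC pair is supplied through the standard `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` environment variables, and that the bucket is reached through the S3-compatible endpoint `https://storage.googleapis.com`.

```python
import os
from typing import Dict, List, Tuple

# S3-compatible (XML API) endpoint for Google Cloud Storage.
GCS_S3_ENDPOINT = "https://storage.googleapis.com"


def build_s5cmd_copy(local_glob: str, bucket: str, prefix: str,
                     access_key: str, secret_key: str
                     ) -> Tuple[List[str], Dict[str, str]]:
    """Return the s5cmd argument list and environment for an upload.

    Nothing is executed here; the caller decides when to run the command.
    """
    clean_prefix = prefix.strip("/")
    dest = f"s3://{bucket}/" + (f"{clean_prefix}/" if clean_prefix else "")
    cmd = ["s5cmd", "--endpoint-url", GCS_S3_ENDPOINT, "cp", local_glob, dest]
    # Copy the current environment and add the HMAC pair for this call only.
    env = dict(os.environ,
               AWS_ACCESS_KEY_ID=access_key,
               AWS_SECRET_ACCESS_KEY=secret_key)
    return cmd, env


# Example (not run here):
#   import subprocess
#   cmd, env = build_s5cmd_copy("output/*", "my-bucket", "results", key, secret)
#   subprocess.run(cmd, env=env, check=True)
```

Passing the arguments as a list avoids shell expansion, so the wildcard is handed to s5cmd unchanged, and the credentials stay out of the command line.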

A minor typo in dicomsegsr notebook is also addressed.

With this, I believe I have addressed everything I wanted to before we create a tag.

fedorov commented 7 months ago

@vkt1414 how is the HMAC key passed to the workflow? Did you make a dedicated service account for creating that key?

vkt1414 commented 7 months ago

> @vkt1414 how is the HMAC key passed to the workflow? Did you make a dedicated service account for creating that key?

The post-processing notebook is only run on a VM or on our own PC, so we only need to enter the HMAC credentials manually at execution time.

And I created the HMAC key from the Interoperability settings in Cloud Storage.

fedorov commented 7 months ago

> And I created the HMAC key from the Interoperability settings in Cloud Storage.

Makes sense. For the future, you may consider generating those for SAs: it is better security practice to create HMAC keys for dedicated SAs with minimal privileges.