Closed rick-ji closed 4 years ago
Hi Rick. Apologies for the delayed response.
Cromwell itself has native support for CWL, but we have focused our efforts on building out this solution around WDL. That said, we are running tests to check whether CWL files can be processed properly with the Task Execution Service (TES). I'm adjusting a CWL file, input.json, and trigger file I've written for an alignment workflow and will report back this afternoon. If it works, I'll add a page to the documentation that addresses CWL specifically for future reference.
Hi Rick, I have an update for you.
I've tried running CWL workflows with our Cromwell on Azure implementation, and currently our implementation of TES does not properly support the file structures expected when running a CWL workflow. We will begin working on a fix for our implementation of TES to support CWL files natively.
In the interim, I also looked into running a conversion tool, but the two that I found are unfortunately abandoned and do not convert CWL 1.0 standard files correctly. So until we build out native CWL support, running CWL files with Cromwell on Azure is unsupported. I will keep you updated on progress.
Thank you very much for the note! I too have tried one of the cwl2wdl repos, and it didn't work for me either. I'll wait to hear about the update on CWL support. For the time being, I'll see if I can rewrite the CWL into WDL.
Thanks, Rick
Hi Rick,
I have an update for you. I'm happy to report that we've looked into the Cromwell architecture and TES and figured out a working solution for using CWL files with Azure. There are a few key points:
1) You have to provide any dependencies associated with the CWL as a ZIP file or as a link to an external web location.
2) You cannot directly specify disk size for your workflow at this time. TES does not properly parse the disk information needed to tell Azure Batch to spawn a VM with a specific local HDD. Therefore, if you are running a task that requires significant I/O on intermediate files, we highly recommend running the workflow in WDL instead, where you can specify the HDD size needed.
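As a minimal sketch of point 1, assuming the CWL dependencies (scripts and subworkflows) sit together under one local folder, they could be bundled into a ZIP with Python's standard library. The function name `zip_cwl_dependencies` and the folder layout are hypothetical and only for illustration; how the resulting ZIP is attached to a submission is not shown here.

```python
import zipfile
from pathlib import Path


def zip_cwl_dependencies(dep_dir: str, zip_path: str) -> list:
    """Bundle every file under dep_dir into zip_path, keeping relative paths.

    Hypothetical helper: it only builds the archive and does not submit it.
    """
    root = Path(dep_dir)
    # Collect the files before creating the archive so the ZIP never
    # picks itself up if it is written inside dep_dir.
    files = sorted(p for p in root.rglob("*") if p.is_file())
    added = []
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in files:
            # Store paths relative to dep_dir so references between CWL
            # files keep the same relative layout inside the archive.
            arcname = path.relative_to(root).as_posix()
            archive.write(path, arcname)
            added.append(arcname)
    return added
```

Calling `zip_cwl_dependencies("deps", "deps.zip")` returns the list of archive member names, which is a quick way to confirm that every subworkflow made it into the bundle.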
I'm writing up an FAQ page on my branch today that walks through making a CWL file compliant with Cromwell on Azure, and I will merge it into master with the next release.
-Roberto
That’s awesome! Thanks for the update.
Hi Roberto,
Is that FAQ still WIP? I couldn't see that in the two branches in this repo, or am I missing it?
Rick
Hi Rick,
My apologies. The FAQ will accompany the next release, as we encountered some other issues during final testing that required additional code fixes to make the functionality more seamless (you may still run into some issues without the code in the upcoming release). Let me check with the dev team to get an idea of their timeline for the next release. I will update you shortly.
-Roberto
Roberto Lleras
Senior Applications Scientist | Microsoft Genomics, Microsoft Healthcare NeXT
Hi Rick,
As it turns out, support for CWL was added to the main branch a few hours ago! I added the updated documentation to my private fork for documentation updates. I'll open a PR now to have it incorporated into the master branch.
In the interim, I’ve added the guidance below:
Running CWL Workflows on Cromwell on Azure
Running workflows written in the Common Workflow Language (CWL) format is possible with a few modifications to your workflow submission.
Ensure your dependencies are accessible by Cromwell
Any additional scripts or subworkflows must be accessible to TES. They can be provided in 3 ways:
Ensure your runtime resource requests are specified with the same names as WDL files
CWL files sometimes contain runtime parameter names that differ from those TES accepts. Please refer to our guide (https://github.com/microsoft/CromwellOnAzure/blob/master/docs/managing-your-workflow.md/#how-to-prepare-a-workflow-description-language-wdl-file-that-runs-a-workflow-on-cromwell-on-azure) for proper guidance.
Known issue for CWL files: Cannot request specific HDD size
Unfortunately, this is a bug in how Cromwell currently parses the CWL file, and thus it must be addressed in the Cromwell source code directly. We have submitted an issue to the Broad to have this addressed. The current workaround is to increase the number of vCPUs or the amount of memory requested for a task, which will indirectly increase the amount of working disk space available. However, because this may cause inconsistent performance, we advise converting your workflow to the WDL format if you are running a task that might consume a large amount of local scratch space.
-Roberto
For CWL workflows, all CWL resource keywords are supported, plus preemptible (not in the CWL spec). Preemptible defaults to true (set in the Cromwell configuration file), so specify preemptible only when setting it to false (to run on a dedicated machine). TES keywords are also supported in CWL workflows, but we advise users to use the CWL ones.
CWL keywords: (CWL workflows only)
coresMin: number
ramMin: size in MB
tmpdirMin: size in MB
outdirMin: size in MB
(the final disk size is the sum of the tmpdirMin and outdirMin values)
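The disk-size rule above can be illustrated with a short sketch. It assumes the CWL ResourceRequirement has already been loaded into a Python dictionary; the function name `total_disk_mb` is hypothetical, and treating a missing keyword as 0 MB is an assumption made for this example, not documented behavior.

```python
def total_disk_mb(resource_requirement: dict) -> int:
    """Return the requested disk size in MB as tmpdirMin + outdirMin.

    Assumption for illustration: a keyword that is absent counts as 0 MB.
    """
    tmpdir_mb = resource_requirement.get("tmpdirMin", 0)
    outdir_mb = resource_requirement.get("outdirMin", 0)
    return tmpdir_mb + outdir_mb


# Example ResourceRequirement using the CWL keywords listed above.
example_requirement = {
    "class": "ResourceRequirement",
    "coresMin": 4,
    "ramMin": 8192,       # MB
    "tmpdirMin": 10240,   # MB
    "outdirMin": 20480,   # MB
}

print(total_disk_mb(example_requirement))  # 30720
```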
TES keywords: (both CWL and WDL workflows)
cpu: number
memory: size unit
disk: size unit
preemptible: true|false
Are there any instructions for running CWL flows?