Barski-lab / cwl-airflow

Python package to extend Airflow functionality with CWL1.1 support
https://barski-lab.github.io/cwl-airflow
Apache License 2.0

BashOperator + CWLDAG #73

Closed kokleong9406 closed 3 years ago

kokleong9406 commented 3 years ago

Hi, does Airflow's BashOperator work with CWLDAG? My job fails even though I am only running a simple echo in a BashOperator. However, it works well if I use a PythonOperator. Let me know if you need the files to reproduce the error. Here is a snippet of my DAG script:

image

Below is the DAG when viewed in Airflow Web UI:

image

michael-kotliar commented 3 years ago

Hello @kokleong9406

Thanks for using CWL-Airflow. Could you please provide more details about what you are trying to achieve by adding extra steps to CWLDAG? As far as I understand from the attached screenshot, you just want to add a step that downloads the required input files. To solve this, I would create a custom CWLJobDispatcher class and pass it to the CWLDAG constructor. As the link below shows, you can provide your own versions of the dispatcher and gatherer steps: https://github.com/Barski-lab/cwl-airflow/blob/63d8d234f2bcc56e8a2d5e6a9b10f72ecf80bf75/cwl_airflow/extensions/cwldag.py#L25 Let me know if you need any help with that.

Michael

kokleong9406 commented 3 years ago

Hi Michael,

Thank you for your suggestion :)

Sorry for the late reply! Actually, I just wanted to test whether it is possible to add a BashOperator inside the DAG itself to execute a bash command. I guess the better approach is to wrap the bash command inside the .cwl file itself.

michael-kotliar commented 3 years ago

Hi @kokleong9406,

Yeah, you are right, keeping everything inside the CWL file is the preferable approach. I will close this issue for now. Feel free to reopen it if you have any other questions.

Michael
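
For readers landing on this thread: the approach agreed on above (wrapping the bash command in the .cwl file instead of adding a BashOperator) could look roughly like the sketch below. It is not taken from the issue or from the CWL-Airflow sources. It builds a minimal CWL v1.1 CommandLineTool that runs echo and writes it out as JSON, which CWL runners accept because JSON is a subset of YAML. The function names make_echo_tool and write_tool, the input name message, the output name echoed, and the file names are all hypothetical choices for illustration.

```python
import json


def make_echo_tool(stdout_name="echo_output.txt"):
    """Return a CWL v1.1 CommandLineTool document (as a dict) that runs `echo <message>`.

    All identifiers here ("message", "echoed", the stdout file name) are
    illustrative assumptions, not names used in the original issue.
    """
    return {
        "cwlVersion": "v1.1",
        "class": "CommandLineTool",
        # The bash command that would otherwise live in a BashOperator.
        "baseCommand": ["echo"],
        "inputs": {
            # Passed to echo as its first positional argument.
            "message": {"type": "string", "inputBinding": {"position": 1}},
        },
        # Capture whatever echo prints into a file.
        "stdout": stdout_name,
        "outputs": {
            # The captured stdout file becomes the tool's output.
            "echoed": {"type": "stdout"},
        },
    }


def write_tool(path, tool):
    """Serialize the tool description to `path` as JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(tool, handle, indent=2)


if __name__ == "__main__":
    # Hypothetical output file name; use this file as a step in the workflow.
    write_tool("echo.cwl", make_echo_tool())
```

The resulting file can then be referenced as a step inside the workflow that CWLDAG loads, so the command runs under the same CWL machinery as every other step.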