Barski-lab / cwl-airflow

Python package to extend Airflow functionality with CWL1.1 support
https://barski-lab.github.io/cwl-airflow
Apache License 2.0
185 stars 32 forks

CWL-DAG Schedule bug #57

Closed seongwoo-jang7 closed 3 years ago

seongwoo-jang7 commented 3 years ago

Describe the bug: Hi, I want to schedule a DAG using CWL-Airflow, but it doesn't seem to be possible. Even though I passed schedule_interval as a parameter to the DAG, the Airflow Web UI shows its schedule as None. Does CWL-Airflow not support scheduling? Did I do something wrong, or is this a bug? Please help. Thanks!

michael-kotliar commented 3 years ago

Hi @seongwoo-jang7, thank you for submitting an issue. In the current implementation of CWL-Airflow we intentionally set "schedule_interval": None, as we envision CWLDAGs being externally triggered. Could you please explain your use case, so we can find a proper solution?

https://github.com/Barski-lab/cwl-airflow/blob/8aa26f6a50a8ea15bb8d544225f2dfe09668a5f3/cwl_airflow/extensions/cwldag.py#L106

seongwoo-jang7 commented 3 years ago

Hi @michael-kotliar, thanks for your comments.

I am using CWL-Airflow for ETL work, so I need scheduling, which is why I asked. I understand your intention, but could you make it possible to schedule a DAG while still allowing it to be triggered externally?

I also have an additional question: how can I use Airflow operators? For example, I want to use a BigQuery operator between CWL steps to load data. Thanks!

michael-kotliar commented 3 years ago

Hi, @seongwoo-jang7, sorry for the late response.

If you install CWL-Airflow from the latest commit on the master branch, you will be able to provide your own schedule_interval in the CWLDAG constructor. Let me know if something doesn't work as expected.
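The behaviour described here, a default of no schedule that a caller-supplied schedule_interval can override, can be sketched in plain Python. This is a simplified stand-in and not the actual CWL-Airflow code: `DEFAULT_DAG_PARAMS` and `resolve_dag_params` are hypothetical names used only for illustration, and `"@daily"` is one of Airflow's cron presets.

```python
from typing import Any, Dict, Optional

# Hypothetical default mirroring the behaviour described in this thread:
# with no schedule, a DAG runs only when it is triggered externally.
DEFAULT_DAG_PARAMS: Dict[str, Any] = {"schedule_interval": None}


def resolve_dag_params(user_params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Return DAG parameters with caller-supplied values taking precedence."""
    params = dict(DEFAULT_DAG_PARAMS)  # copy so the defaults are never mutated
    params.update(user_params or {})   # caller-supplied values win
    return params


# No schedule given: the DAG stays externally triggered.
externally_triggered = resolve_dag_params()

# Schedule given: it replaces the default instead of being discarded.
daily = resolve_dag_params({"schedule_interval": "@daily"})

print(externally_triggered["schedule_interval"])
print(daily["schedule_interval"])
```

In a real DAG file the same effect would come from passing schedule_interval to the CWLDAG constructor; the sketch only shows why the Web UI reports no schedule when nothing is passed through.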

As for calling a BigQuery operator between CWL workflow steps, it's currently not easy to implement, as it would change the structure of the DAG. Our concept is to load the DAG from the CWL file, so its structure depends completely on that file. However, there is a possible solution: have you considered using ExternalTaskSensor?

seongwoo-jang7 commented 3 years ago

I've considered ExternalTaskSensor, but it is not what I want. I'll try schedule_interval again and open an issue if it doesn't work. Thank you for your comments!