Barski-lab / cwl-airflow

Python package to extend Airflow functionality with CWL1.1 support
https://barski-lab.github.io/cwl-airflow
Apache License 2.0

Schedule interval workflow #11

Closed Huiziryuu closed 6 years ago

Huiziryuu commented 6 years ago

Hi, @michael-kotliar

How can I rerun CWL jobs? For example, I ran a job once, adjusted the input data file (not the input JSON/YAML job file), and now want to run the job again. I think this is a common case.

Meanwhile, while studying your code to get familiar with how cwl-airflow works, I found that it is hard-coded to run CWL jobs 'once'. Is it on your roadmap to support running CWL jobs on an interval schedule, for example every hour?

br

michael-kotliar commented 6 years ago

@Huiziryuu Thank you for your question. We use cwl-airflow to analyze data from biological experiments. Usually, you want to rerun an experiment only when some input parameter was incorrect. For this, you definitely need to generate a new job file. Because we internally generate a new DAG for each job file, a situation where we need to rerun exactly the same combination of job and CWL files (the same DAG) almost never arises. That's why we hardcoded 'once'. In the future, we are planning to make it more flexible, but for now the decision was based on the needs of our lab.
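
For context, in plain Airflow the difference between a one-off DAG and a recurring one comes down to the schedule passed to the DAG constructor. Below is a minimal sketch, not cwl-airflow's actual code: the helper `build_dag_kwargs`, the DAG ids, and the start date are hypothetical, and it assumes the Airflow 1.x/2.x argument name `schedule_interval` with the `@once` and `@hourly` presets.

```python
from datetime import datetime

# Airflow schedule presets: "@once" runs a DAG a single time,
# "@hourly" runs it at the start of every hour.
RUN_ONCE = "@once"
RUN_HOURLY = "@hourly"


def build_dag_kwargs(dag_id, schedule=RUN_ONCE):
    """Hypothetical helper: collect keyword arguments for airflow.DAG."""
    return {
        "dag_id": dag_id,
        "schedule_interval": schedule,       # argument name assumed (Airflow 1.x/2.x)
        "start_date": datetime(2018, 1, 1),  # arbitrary illustrative date
    }


once_kwargs = build_dag_kwargs("my_cwl_job")
hourly_kwargs = build_dag_kwargs("my_cwl_job_hourly", RUN_HOURLY)

# With Airflow installed, the dictionaries would be used as:
#   from airflow import DAG
#   dag = DAG(**hourly_kwargs)
```

The sketch only builds the arguments, so it runs without Airflow installed; whether cwl-airflow exposes such a setting is not stated in this thread.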

michael-kotliar commented 6 years ago

As of v1.0.13, if the same job file (with updated input parameters) is submitted twice using the cwl-airflow submit command, a new DAG with a different DAG id is added. The original DAG is kept too.
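
One way to picture this behaviour is a DAG id built from the job file name plus a unique suffix, so that resubmitting the same file never collides with the earlier DAG. The sketch below is an assumption for illustration only: the helper `make_dag_id`, the example path, and the suffix scheme are hypothetical, and the id format cwl-airflow really uses is not described in this thread.

```python
import os
import uuid


def make_dag_id(job_path, suffix=None):
    """Hypothetical helper: job file base name plus a unique suffix."""
    base = os.path.splitext(os.path.basename(job_path))[0]
    # A random hex string keeps ids distinct across repeated submissions.
    suffix = suffix or uuid.uuid4().hex
    return f"{base}_{suffix}"


# Submitting the same job file twice gives two different DAG ids,
# so the DAG from the first submission is left in place.
first_id = make_dag_id("jobs/sample.json")
second_id = make_dag_id("jobs/sample.json")
```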