Barski-lab / cwl-airflow

Python package to extend Airflow functionality with CWL1.1 support
https://barski-lab.github.io/cwl-airflow
Apache License 2.0

Running CWL-Airflow with docker-compose does not work #67

Closed superbsky closed 2 years ago

superbsky commented 3 years ago

Describe the bug: Running CWL-Airflow with docker-compose does not work; the webserver and apiserver keep printing the following:

webserver | mysql: [Warning] Using a password on the command line interface can be insecure.
webserver | ERROR 1146 (42S02) at line 1: Table 'airflow.dag_run' doesn't exist
webserver | Sleep 1 sec
apiserver | mysql: [Warning] Using a password on the command line interface can be insecure.
apiserver | ERROR 1146 (42S02) at line 1: Table 'airflow.dag_run' doesn't exist
apiserver | Sleep 1 sec

To Reproduce: Steps to reproduce the behavior:

  1. git clone https://github.com/Barski-lab/cwl-airflow.git
  2. cd cwl-airflow/packaging/docker_compose/local_executor
  3. docker-compose up --build

Expected behavior: The cwl-airflow docker-compose stack is up and running, with all services healthy.

Desktop: Ubuntu 18

Additional context: I fell back to this solution, https://morioh.com/p/3531d754cab7, but it also has some issues with job submission.

michael-kotliar commented 3 years ago

Hello @superbsky,

Thank you for using CWL-Airflow. The log you provided indicates that the webserver and apiserver are waiting for cwl-airflow to finish creating all required databases (running airflow init). This shouldn't take long; however, if the settings in your .env file are not valid, that may be why docker-compose fails to start. Let me know if this helps, or if you have any other questions.
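
The waiting behavior described above can be sketched as a polling loop. This is only an illustration, not the script the containers actually run: `wait_until` and its `check` callable are hypothetical names, and in the real stack the check would presumably be a MySQL query against the `airflow.dag_run` table. The sketch also shows why a loop without a timeout repeats "Sleep 1 sec" forever if initialization never completes.

```python
import time
from typing import Callable, Optional


def wait_until(
    check: Callable[[], bool],
    interval: float = 1.0,
    timeout: Optional[float] = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll check() until it returns True or the timeout elapses.

    check    -- hypothetical readiness probe (e.g. "does dag_run exist?")
    interval -- seconds to wait between attempts
    timeout  -- total seconds to wait; None means wait forever
    sleep    -- injectable sleep function, handy for testing
    """
    waited = 0.0
    while True:
        if check():
            return True
        if timeout is not None and waited >= timeout:
            # Give up instead of looping forever, so the caller can report
            # that initialization never finished.
            return False
        print(f"Sleep {interval:g} sec")
        sleep(interval)
        waited += interval
```

Returning `False` after a bounded wait makes a stalled initialization visible, rather than leaving the dependent services printing the same message indefinitely.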

superbsky commented 3 years ago

I have a fresh Airflow install in $AIRFLOW_HOME, and the .env file is updated accordingly. I waited for a few minutes, but init is not happening.

Also, I noticed that docker-compose is causing some strange permission settings:

ls -la /temp/airflow_standalone/mysql_data
total 176200
drwxr-xr-x 6  999 root     4096 Jul 27 15:18 .
drwxrwxr-x 8 genx genx     4096 Jul 27 15:16 ..
drwxr-x--- 2  999  999     4096 Jul 27 15:17 airflow
-rw-r----- 1  999  999       56 Jul 27 15:16 auto.cnf
-rw------- 1  999  999     1680 Jul 27 15:16 ca-key.pem
-rw-r--r-- 1  999  999     1112 Jul 27 15:16 ca.pem
-rw-r--r-- 1  999  999     1112 Jul 27 15:16 client-cert.pem
-rw------- 1  999  999     1680 Jul 27 15:16 client-key.pem
-rw-r----- 1  999  999      667 Jul 27 15:18 ib_buffer_pool
-rw-r----- 1  999  999 50331648 Jul 27 15:18 ib_logfile0
-rw-r----- 1  999  999 50331648 Jul 27 15:16 ib_logfile1
-rw-r----- 1  999  999 79691776 Jul 27 15:18 ibdata1
drwxr-x--- 2  999  999     4096 Jul 27 15:16 mysql
drwxr-x--- 2  999  999     4096 Jul 27 15:16 performance_schema
-rw------- 1  999  999     1680 Jul 27 15:16 private_key.pem
-rw-r--r-- 1  999  999      452 Jul 27 15:16 public_key.pem
-rw-r--r-- 1  999  999     1112 Jul 27 15:16 server-cert.pem
-rw------- 1  999  999     1676 Jul 27 15:16 server-key.pem
drwxr-x--- 2  999  999    12288 Jul 27 15:17 sys
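
For reference, numeric owners such as 999 in a listing like the one above usually mean the files were created by a user inside a container that has no matching account on the host. A minimal Python sketch for summarizing ownership of such a directory follows; `owner_summary` is a hypothetical helper name, and the path in the usage note is simply the one from the listing.

```python
import os
from collections import Counter
from pathlib import Path


def owner_summary(directory):
    """Count the entries of a directory per (uid, gid) owner pair."""
    counts = Counter()
    for entry in Path(directory).iterdir():
        # lstat() reports the entry itself and does not follow symlinks.
        info = entry.lstat()
        counts[(info.st_uid, info.st_gid)] += 1
    return counts
```

Calling `owner_summary("/temp/airflow_standalone/mysql_data")` would return a counter keyed by `(uid, gid)`, which makes it easy to compare against `os.getuid()` for the account that runs docker-compose. Reading the directory may itself require sufficient permissions.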

I just tried a fresh Airflow installation using https://airflow.apache.org/docs/apache-airflow/stable/start/local.html, but without running airflow db init, and it is still failing with the missing table :(

Am I missing something from the documentation? https://cwl-airflow.readthedocs.io/en/latest/readme/how_to_use.html#running-cwl-airflow-with-docker-compose

What are the exact steps to set up cwl-airflow with Airflow as a standalone installation? In Docker? As a combination?

superbsky commented 3 years ago

@michael-kotliar, any suggestions?

I am also getting some weird errors like:

scheduler    | Traceback (most recent call last):
scheduler    |   File "/usr/local/bin/cwl-airflow", line 15, in <module>
scheduler    |     sys.exit(main(sys.argv[1:]))
scheduler    |   File "/usr/local/bin/cwl-airflow", line 9, in main
scheduler    |     args = parse_arguments(argsl)
scheduler    |   File "/usr/local/lib/python3.8/dist-packages/cwl_airflow/utilities/parser.py", line 281, in parse_arguments
scheduler    |     args, _ = get_parser().parse_known_args(argsl)
scheduler    |   File "/usr/local/lib/python3.8/dist-packages/cwl_airflow/utilities/parser.py", line 66, in get_parser
scheduler    |     version=get_version(),
scheduler    |   File "/usr/local/lib/python3.8/dist-packages/cwl_airflow/utilities/helpers.py", line 211, in get_version
scheduler    |     pkg = pkg_resources.require("cwl_airflow")
scheduler    |   File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 901, in require
scheduler    |     needed = self.resolve(parse_requirements(requirements))
scheduler    |   File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 792, in resolve
scheduler    |     raise VersionConflict(dist, req).with_context(dependent_req)
scheduler    | pkg_resources.ContextualVersionConflict: (apache-airflow 2.0.1 (/usr/local/lib/python3.8/dist-packages), Requirement.parse('apache-airflow>=2.1.0'), {'apache-airflow-providers-http'})

This confuses me, because I thought that cwl-airflow uses the installed version of Airflow...

What am I missing?
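
The conflict reported in the traceback above comes down to a version comparison: apache-airflow 2.0.1 is installed, while apache-airflow-providers-http requires apache-airflow>=2.1.0. A simplified sketch of that comparison is shown below. The helper names are hypothetical, and the parsing assumes plain dotted numeric versions; pkg_resources itself handles far more cases (pre-releases, other operators, extras).

```python
def parse_version(text):
    """Turn a plain dotted version such as '2.0.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def satisfies_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    # Tuples compare element by element, so (2, 0, 1) < (2, 1, 0).
    return parse_version(installed) >= parse_version(minimum)


# Versions taken from the traceback: installed 2.0.1, required >=2.1.0.
print(satisfies_minimum("2.0.1", "2.1.0"))  # False
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "2.10.0" as older than "2.9.3".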

michael-kotliar commented 2 years ago

Hi @superbsky,

I've fixed bugs in the docker-compose file in the latest CWL-Airflow 1.2.11. Input parameters can be set in the .env file. Additionally, you may want to take a look at the run_conformance_tests.sh file, where I run our testing workflows using docker-compose.

Thanks for using CWL-Airflow

michael-kotliar commented 2 years ago

Feel free to reopen this issue if the problem still exists.