puckel / docker-airflow

Run airflow commands with docker-compose exec instead of run (without installing requirements.txt) #505

nateGeorge opened this issue 4 years ago

nateGeorge commented 4 years ago

Is it possible to run airflow CLI commands using docker-compose without reinstalling requirements.txt? Right now, running the airflow CLI looks like this:

docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow list_dags

but this will reinstall from the requirements.txt file as it spins up a new container. Is there some way to run airflow CLI commands with exec? When I do:

docker-compose -f docker-compose-CeleryExecutor.yml exec webserver airflow list_dags

I get the error:

[2020-02-24 00:19:16,092] {{cli_action_loggers.py:107}} WARNING - Failed to log action with (sqlite3.OperationalError) no such table: log
[SQL: INSERT INTO log (dttm, dag_id, task_id, event, execution_date, owner, extra) VALUES (?, ?, ?, ?, ?, ?, ?)]
[parameters: ('2020-02-24 00:19:16.089257', None, None, 'cli_list_dags', None, 'airflow', '{"host_name": "ab1604e1e47d", "full_command": "[\'/usr/local/bin/airflow\', \'list_dags\']"}')]
(Background on this error at: http://sqlalche.me/e/e3q8)
[2020-02-24 00:19:16,092] {{__init__.py:51}} INFO - Using executor SequentialExecutor
[2020-02-24 00:19:16,093] {{dagbag.py:403}} INFO - Filling up the DagBag from /usr/local/airflow/dags
/usr/local/lib/python3.7/site-packages/airflow/utils/helpers.py:439: DeprecationWarning: Importing 'PythonOperator' directly from 'airflow.operators' has been deprecated. Please import from 'airflow.operators.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
  DeprecationWarning)
[2020-02-24 00:19:16,107] {{dagbag.py:246}} ERROR - Failed to import: /usr/local/lib/python3.7/site-packages/airflow/example_dags/example_subdag_operator.py
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
sqlite3.OperationalError: no such table: slot_pool

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 243, in process_file
    m = imp.load_source(mod_name, filepath)
  File "/usr/local/lib/python3.7/imp.py", line 171, in load_source
    module = _load(spec)
  File "<frozen importlib._bootstrap>", line 696, in _load
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.7/site-packages/airflow/example_dags/example_subdag_operator.py", line 50, in <module>
    dag=dag,
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/db.py", line 74, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/utils/decorators.py", line 98, in wrapper
    result = func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/operators/subdag_operator.py", line 77, in __init__
    .filter(Pool.pool == self.pool)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3287, in first
    ret = list(self[0:1])
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3065, in __getitem__
    return list(res)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3389, in __iter__
    return self._execute_and_instances(context)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3414, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 982, in execute
    return meth(self, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 293, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1101, in _execute_clauseelement
    distilled_params,
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1250, in _execute_context
    e, statement, parameters, cursor, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1476, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1246, in _execute_context
    cursor, statement, parameters, context
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 588, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (sqlite3.OperationalError) no such table: slot_pool
[SQL: SELECT slot_pool.id AS slot_pool_id, slot_pool.pool AS slot_pool_pool, slot_pool.slots AS slot_pool_slots, slot_pool.description AS slot_pool_description 
FROM slot_pool 
WHERE slot_pool.slots = ? AND slot_pool.pool = ?
 LIMIT ? OFFSET ?]
[parameters: (1, 'default_pool', 1, 0)]
(Background on this error at: http://sqlalche.me/e/e3q8)

-------------------------------------------------------------------
DAGS
-------------------------------------------------------------------
create_all_tables
example_bash_operator
example_branch_dop_operator_v3
example_branch_operator
example_complex
example_external_task_marker_child
example_external_task_marker_parent
example_http_operator
example_passing_params_via_test_command
example_pig_operator
example_python_operator
example_short_circuit_operator
example_skip_dag
example_trigger_controller_dag
example_trigger_target_dag
example_xcom
latest_only
latest_only_with_trigger
sparkify_etl
test_utils
tutorial

But it works fine with run. I'm pretty confused as to why exec fails while run works. Is it possible to run airflow CLI commands with exec?
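
For context on the difference: docker-compose run creates a brand-new container and goes through the image's entrypoint, which in this image installs requirements.txt and exports the Postgres connection string before starting airflow, while docker-compose exec only starts an extra process inside the already-running container and skips the entrypoint entirely. A minimal sketch of how to see this, assuming the AIRFLOW__CORE__SQL_ALCHEMY_CONN variable that this image's entrypoint.sh exports:

# `run` goes through the entrypoint, so the Postgres connection string
# should already be exported when the command starts:
docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver \
  bash -c 'echo "$AIRFLOW__CORE__SQL_ALCHEMY_CONN"'

# `exec` skips the entrypoint; if the variable is only exported there,
# this prints an empty line and airflow falls back to its SQLite default:
docker-compose -f docker-compose-CeleryExecutor.yml exec webserver \
  bash -c 'echo "$AIRFLOW__CORE__SQL_ALCHEMY_CONN"'

If that assumption holds, it would explain both the sqlite3.OperationalError above (the CLI is talking to an empty local SQLite file, not Postgres) and why run keeps working.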

charlesNumerator commented 4 years ago

I have this same question.

dvainrub commented 4 years ago

I'm having the same issue. To be more specific:

Commands like these work great:

> docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow list_dags
> docker-compose -f docker-compose-CeleryExecutor.yml run --rm webserver airflow connections --list

But if we exec into the Docker container and run the same commands, they throw an error:

> docker exec -it <container ID from 'docker ps'> /bin/bash
> airflow connections --list

The same error will happen if we exec after running docker-compose -f docker-compose-CeleryExecutor.yml up -d

An interesting thing happens if we run the following commands after the docker exec above:

> airflow upgradedb # or 'airflow initdb'
> airflow connections --list

In this scenario we don't get any error, but the DB being used is no longer the same one...

So it seems as if the connection to the Postgres DB is lost after running docker exec; no idea why...

Did anyone fix this?
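
A likely explanation, assuming the behavior of this image's entrypoint.sh (which exports AIRFLOW__CORE__SQL_ALCHEMY_CONN before starting airflow): a process started by docker exec never went through the entrypoint, so the variable is unset and the CLI falls back to Airflow's default SQLite database; airflow initdb then initializes that SQLite file rather than Postgres, which is why the commands start working but against the wrong DB. A quick way to check which database the CLI resolves inside an exec'd shell (a sketch against the Airflow 1.10.x configuration API shipped with this image):

# Print the connection string the CLI will use; with the entrypoint's
# exports missing, this typically shows a local SQLite path such as
# sqlite:////usr/local/airflow/airflow.db instead of Postgres.
docker-compose -f docker-compose-CeleryExecutor.yml exec webserver \
  python -c "from airflow.configuration import conf; print(conf.get('core', 'sql_alchemy_conn'))"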

dvainrub commented 4 years ago

So the solution was suggested in this Slack thread: https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1585679850223300

After bringing the containers up with docker-compose, we need to run:

> docker exec -it <container id> /entrypoint.sh bash

And then all commands will work.

Thanks to Jed Cunningham for the solution.
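
For reference, the same workaround expressed through docker-compose, plus a one-off form that avoids an interactive shell. This is a sketch assuming the image's default /entrypoint.sh, whose catch-all case execs whatever command it is given after exporting the connection string:

# Interactive shell with the entrypoint's environment applied:
docker-compose -f docker-compose-CeleryExecutor.yml exec webserver \
  /entrypoint.sh bash

# One-off CLI command through the entrypoint: no new container, and no
# requirements.txt reinstall:
docker-compose -f docker-compose-CeleryExecutor.yml exec webserver \
  /entrypoint.sh airflow list_dags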

ashkan-leo commented 4 years ago

The workaround @dvainrub mentioned solved my problem.