apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Handle invalid parameters (including missing object) in webserver #8171

Closed mik-laj closed 2 months ago

mik-laj commented 4 years ago

Description

Hi. The webserver code is very optimistic: negative paths are not checked. For example, consider the following code: https://github.com/apache/airflow/blob/0dafdd0b9d635b4513b1413007337b19c3d96b17/airflow/www/views.py#L595-L597

It does not check whether the DAG object exists. A dag is None check should be added, and when the condition is met, a 404 error should be returned.

Use case / motivation

Improving the experience of using the webserver and reducing the number of nukulars.

I hope that a thorough review of the entire web server code and completing the tests with negative paths will improve the overall health of the webserver.

If this is done by the Polidea team, it will be an opportunity to get to know the webserver better. My team has not focused on the webserver yet.

It will also be an opportunity to find other health problems (e.g. side effects, missing tests).

Related Issues

N/A

rcjsuen commented 4 years ago

@mik-laj Hi, what is your expected fix for this issue? Are you looking for something like this?

if dag is None:
    response = jsonify(...)
    response.status_code = 404
    return response

Regarding tests, is a URL like rendered?dag_id=non_existent_dag_id sufficient in tests/www/test_views.py?
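
To make the proposal and the test question concrete, here is a minimal, framework-free sketch. It assumes a plain mapping as a stand-in for the webserver's DAG lookup, and the names rendered, ViewResult and test_rendered_with_non_existent_dag are hypothetical, not Airflow's actual code. The (payload, status) return value only mirrors the shape a Flask view may return.

```python
from http import HTTPStatus
from typing import Any, Dict, Mapping, Tuple

# Hypothetical return shape: (JSON-like payload, HTTP status code).
ViewResult = Tuple[Dict[str, Any], int]


def rendered(dags: Mapping[str, Any], dag_id: str) -> ViewResult:
    """Sketch of a /rendered handler; `dags` stands in for the real DAG lookup."""
    dag = dags.get(dag_id)
    if dag is None:
        # Negative path: report a readable 404 instead of failing further down.
        message = f"DAG with dag_id {dag_id!r} not found"
        return {"error": message}, int(HTTPStatus.NOT_FOUND)
    return {"dag_id": dag_id}, int(HTTPStatus.OK)


def test_rendered_with_non_existent_dag() -> None:
    # Mirrors a request such as rendered?dag_id=non_existent_dag_id.
    payload, status = rendered({"example_dag": object()}, "non_existent_dag_id")
    assert status == 404
    assert "non_existent_dag_id" in payload["error"]
```

A test of this shape only covers one endpoint; each view that reads dag_id would need its own negative-path case.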

mik-laj commented 4 years ago

It should be something similar. The most important thing is that user-readable error messages appear instead of mushrooms.

For example, when you open any of the following addresses:

http://localhost:28080/tries?dag_id=example_automl_text_sentiment2&days=30
http://localhost:28080/landing_times?dag_id=example_automl_text_sentiment2&days=30
http://localhost:28080/gantt?dag_id=example_automl_text_sentiment2
http://localhost:28080/dag_details?dag_id=example_automl_text_sentiment2
http://localhost:28080/code?dag_id=example_automl_text_sentiment2

you will see an error screen similar to this one: Screenshot 2020-04-13 at 17 06 35

However, if you go to http://localhost:28080/tree?dag_id=example_automl_text_cls2 you will see the following error message: Screenshot 2020-04-13 at 17 08 08

This should be standardized and a clear message should always be displayed to the user.
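
One possible way to standardize the message is to route every missing-DAG case through a single helper and a shared wrapper. The sketch below is only an illustration under stated assumptions: DagNotFound, get_dag_or_raise, standard_errors and the two toy views are hypothetical names, and a plain mapping again stands in for the real DAG lookup.

```python
import functools
from typing import Any, Callable, Dict, Mapping, Tuple

Response = Tuple[Dict[str, Any], int]


class DagNotFound(Exception):
    """Raised by any view when the requested DAG does not exist (hypothetical)."""

    def __init__(self, dag_id: str) -> None:
        super().__init__(f"DAG {dag_id!r} could not be found.")
        self.dag_id = dag_id


def get_dag_or_raise(dags: Mapping[str, Any], dag_id: str) -> Any:
    """Single lookup helper so every view fails in the same way."""
    try:
        return dags[dag_id]
    except KeyError:
        raise DagNotFound(dag_id) from None


def standard_errors(view: Callable[..., Response]) -> Callable[..., Response]:
    """Convert DagNotFound into one consistent, user-readable 404 response."""

    @functools.wraps(view)
    def wrapper(*args: Any, **kwargs: Any) -> Response:
        try:
            return view(*args, **kwargs)
        except DagNotFound as exc:
            return {"status": 404, "message": str(exc)}, 404

    return wrapper


@standard_errors
def tree(dags: Mapping[str, Any], dag_id: str) -> Response:
    get_dag_or_raise(dags, dag_id)
    return {"view": "tree", "dag_id": dag_id}, 200


@standard_errors
def gantt(dags: Mapping[str, Any], dag_id: str) -> Response:
    get_dag_or_raise(dags, dag_id)
    return {"view": "gantt", "dag_id": dag_id}, 200
```

With this shape, tree and gantt return the same payload for an unknown dag_id, which is the consistency asked for above.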

mik-laj commented 4 years ago

For clarity: this contribution does not have to solve every problem in one PR. I will be happy even if only one problem is solved. Other contributors can then follow up with further changes and solve more problems.

List of all routes

```
Endpoint                                 Methods    Rule
---------------------------------------  ---------  ----------------------------------------------------------------------------------------------
Airflow.blocked                          POST       /blocked
Airflow.clear                            POST       /clear
Airflow.code                             GET        /code
Airflow.dag_details                      GET        /dag_details
Airflow.dag_stats                        POST       /dag_stats
Airflow.dagrun_clear                     POST       /dagrun_clear
Airflow.dagrun_failed                    POST       /dagrun_failed
Airflow.dagrun_success                   POST       /dagrun_success
Airflow.delete                           POST       /delete
Airflow.duration                         GET        /duration
Airflow.elasticsearch                    GET        /elasticsearch
Airflow.extra_links                      GET        /extra_links
Airflow.failed                           POST       /failed
Airflow.gantt                            GET        /gantt
Airflow.get_logs_with_metadata           GET        /get_logs_with_metadata
Airflow.graph                            GET        /graph
Airflow.health                           GET        /health
Airflow.index                            GET        /home
Airflow.landing_times                    GET        /landing_times
Airflow.last_dagruns                     POST       /last_dagruns
Airflow.log                              GET        /log
Airflow.paused                           POST       /paused
Airflow.refresh                          POST       /refresh
Airflow.rendered                         GET        /rendered
Airflow.run                              POST       /run
Airflow.success                          POST       /success
Airflow.task                             GET        /task
Airflow.task_instances                   GET        /object/task_instances
Airflow.task_stats                       POST       /task_stats
Airflow.tree                             GET        /tree
Airflow.tries                            GET        /tries
Airflow.trigger                          GET, POST  /trigger
Airflow.xcom                             GET        /xcom
AuthDBView.login                         GET, POST  /login/
AuthDBView.logout                        GET        /logout/
ConfigurationView.conf                   GET        /configuration
ConnectionModelView.action               GET, POST  /connection/action//
ConnectionModelView.action_post          POST       /connection/action_post
ConnectionModelView.add                  GET, POST  /connection/add
ConnectionModelView.api                  GET        /connection/api
ConnectionModelView.api_column_add       GET        /connection/api/column/add/
ConnectionModelView.api_column_edit      GET        /connection/api/column/edit/
ConnectionModelView.api_create           POST       /connection/api/create
ConnectionModelView.api_delete           DELETE     /connection/api/delete/
ConnectionModelView.api_get              GET        /connection/api/get/
ConnectionModelView.api_read             GET        /connection/api/read
ConnectionModelView.api_readvalues       GET        /connection/api/readvalues
ConnectionModelView.api_update           PUT        /connection/api/update/
ConnectionModelView.delete               GET, POST  /connection/delete/
ConnectionModelView.download             GET        /connection/download/
ConnectionModelView.edit                 GET, POST  /connection/edit/
ConnectionModelView.list                 GET        /connection/list/
ConnectionModelView.show                 GET        /connection/show/
DagModelView.action                      GET, POST  /dagmodel/action//
DagModelView.action_post                 POST       /dagmodel/action_post
DagModelView.add                         GET, POST  /dagmodel/add
DagModelView.api                         GET        /dagmodel/api
DagModelView.api_column_add              GET        /dagmodel/api/column/add/
DagModelView.api_column_edit             GET        /dagmodel/api/column/edit/
DagModelView.api_create                  POST       /dagmodel/api/create
DagModelView.api_delete                  DELETE     /dagmodel/api/delete/
DagModelView.api_get                     GET        /dagmodel/api/get/
DagModelView.api_read                    GET        /dagmodel/api/read
DagModelView.api_readvalues              GET        /dagmodel/api/readvalues
DagModelView.api_update                  PUT        /dagmodel/api/update/
DagModelView.autocomplete                GET        /dagmodel/autocomplete
DagModelView.delete                      GET, POST  /dagmodel/delete/
DagModelView.download                    GET        /dagmodel/download/
DagModelView.edit                        GET, POST  /dagmodel/edit/
DagModelView.list                        GET        /dagmodel/list/
DagModelView.show                        GET        /dagmodel/show/
DagRunModelView.action                   GET, POST  /dagrun/action//
DagRunModelView.action_post              POST       /dagrun/action_post
DagRunModelView.add                      GET, POST  /dagrun/add
DagRunModelView.api                      GET        /dagrun/api
DagRunModelView.api_column_add           GET        /dagrun/api/column/add/
DagRunModelView.api_column_edit          GET        /dagrun/api/column/edit/
DagRunModelView.api_create               POST       /dagrun/api/create
DagRunModelView.api_delete               DELETE     /dagrun/api/delete/
DagRunModelView.api_get                  GET        /dagrun/api/get/
DagRunModelView.api_read                 GET        /dagrun/api/read
DagRunModelView.api_readvalues           GET        /dagrun/api/readvalues
DagRunModelView.api_update               PUT        /dagrun/api/update/
DagRunModelView.delete                   GET, POST  /dagrun/delete/
DagRunModelView.download                 GET        /dagrun/download/
DagRunModelView.edit                     GET, POST  /dagrun/edit/
DagRunModelView.list                     GET        /dagrun/list/
DagRunModelView.show                     GET        /dagrun/show/
IndexView.index                          GET        /
JobModelView.action                      GET, POST  /job/action//
JobModelView.action_post                 POST       /job/action_post
JobModelView.add                         GET, POST  /job/add
JobModelView.api                         GET        /job/api
JobModelView.api_column_add              GET        /job/api/column/add/
JobModelView.api_column_edit             GET        /job/api/column/edit/
JobModelView.api_create                  POST       /job/api/create
JobModelView.api_delete                  DELETE     /job/api/delete/
JobModelView.api_get                     GET        /job/api/get/
JobModelView.api_read                    GET        /job/api/read
JobModelView.api_readvalues              GET        /job/api/readvalues
JobModelView.api_update                  PUT        /job/api/update/
JobModelView.delete                      GET, POST  /job/delete/
JobModelView.download                    GET        /job/download/
JobModelView.edit                        GET, POST  /job/edit/
JobModelView.list                        GET        /job/list/
JobModelView.show                        GET        /job/show/
LocaleView.index                         GET        /lang/
LogModelView.action                      GET, POST  /log/action//
LogModelView.action_post                 POST       /log/action_post
LogModelView.add                         GET, POST  /log/add
LogModelView.api                         GET        /log/api
LogModelView.api_column_add              GET        /log/api/column/add/
LogModelView.api_column_edit             GET        /log/api/column/edit/
LogModelView.api_create                  POST       /log/api/create
LogModelView.api_delete                  DELETE     /log/api/delete/
LogModelView.api_get                     GET        /log/api/get/
LogModelView.api_read                    GET        /log/api/read
LogModelView.api_readvalues              GET        /log/api/readvalues
LogModelView.api_update                  PUT        /log/api/update/
LogModelView.delete                      GET, POST  /log/delete/
LogModelView.download                    GET        /log/download/
LogModelView.edit                        GET, POST  /log/edit/
LogModelView.list                        GET        /log/list/
LogModelView.show                        GET        /log/show/
MenuApi.get_menu_data                    GET        /api/v1/menu/
PermissionModelView.action               GET, POST  /permissions/action//
PermissionModelView.action_post          POST       /permissions/action_post
PermissionModelView.add                  GET, POST  /permissions/add
PermissionModelView.api                  GET        /permissions/api
PermissionModelView.api_column_add       GET        /permissions/api/column/add/
PermissionModelView.api_column_edit      GET        /permissions/api/column/edit/
PermissionModelView.api_create           POST       /permissions/api/create
PermissionModelView.api_delete           DELETE     /permissions/api/delete/
PermissionModelView.api_get              GET        /permissions/api/get/
PermissionModelView.api_read             GET        /permissions/api/read
PermissionModelView.api_readvalues       GET        /permissions/api/readvalues
PermissionModelView.api_update           PUT        /permissions/api/update/
PermissionModelView.delete               GET, POST  /permissions/delete/
PermissionModelView.download             GET        /permissions/download/
PermissionModelView.edit                 GET, POST  /permissions/edit/
PermissionModelView.list                 GET        /permissions/list/
PermissionModelView.show                 GET        /permissions/show/
PermissionViewModelView.action           GET, POST  /permissionviews/action//
PermissionViewModelView.action_post      POST       /permissionviews/action_post
PermissionViewModelView.add              GET, POST  /permissionviews/add
PermissionViewModelView.api              GET        /permissionviews/api
PermissionViewModelView.api_column_add   GET        /permissionviews/api/column/add/
PermissionViewModelView.api_column_edit  GET        /permissionviews/api/column/edit/
PermissionViewModelView.api_create       POST       /permissionviews/api/create
PermissionViewModelView.api_delete       DELETE     /permissionviews/api/delete/
PermissionViewModelView.api_get          GET        /permissionviews/api/get/
PermissionViewModelView.api_read         GET        /permissionviews/api/read
PermissionViewModelView.api_readvalues   GET        /permissionviews/api/readvalues
PermissionViewModelView.api_update       PUT        /permissionviews/api/update/
PermissionViewModelView.delete           GET, POST  /permissionviews/delete/
PermissionViewModelView.download         GET        /permissionviews/download/
PermissionViewModelView.edit             GET, POST  /permissionviews/edit/
PermissionViewModelView.list             GET        /permissionviews/list/
PermissionViewModelView.show             GET        /permissionviews/show/
PoolModelView.action                     GET, POST  /pool/action//
PoolModelView.action_post                POST       /pool/action_post
PoolModelView.add                        GET, POST  /pool/add
PoolModelView.api                        GET        /pool/api
PoolModelView.api_column_add             GET        /pool/api/column/add/
PoolModelView.api_column_edit            GET        /pool/api/column/edit/
PoolModelView.api_create                 POST       /pool/api/create
PoolModelView.api_delete                 DELETE     /pool/api/delete/
PoolModelView.api_get                    GET        /pool/api/get/
PoolModelView.api_read                   GET        /pool/api/read
PoolModelView.api_readvalues             GET        /pool/api/readvalues
PoolModelView.api_update                 PUT        /pool/api/update/
PoolModelView.delete                     GET, POST  /pool/delete/
PoolModelView.download                   GET        /pool/download/
PoolModelView.edit                       GET, POST  /pool/edit/
PoolModelView.list                       GET        /pool/list/
PoolModelView.show                       GET        /pool/show/
ResetMyPasswordView.this_form_get        GET        /resetmypassword/form
ResetMyPasswordView.this_form_post       POST       /resetmypassword/form
ResetPasswordView.this_form_get          GET        /resetpassword/form
ResetPasswordView.this_form_post         POST       /resetpassword/form
RoleModelView.action                     GET, POST  /roles/action//
RoleModelView.action_post                POST       /roles/action_post
RoleModelView.add                        GET, POST  /roles/add
RoleModelView.api                        GET        /roles/api
RoleModelView.api_column_add             GET        /roles/api/column/add/
RoleModelView.api_column_edit            GET        /roles/api/column/edit/
RoleModelView.api_create                 POST       /roles/api/create
RoleModelView.api_delete                 DELETE     /roles/api/delete/
RoleModelView.api_get                    GET        /roles/api/get/
RoleModelView.api_read                   GET        /roles/api/read
RoleModelView.api_readvalues             GET        /roles/api/readvalues
RoleModelView.api_update                 PUT        /roles/api/update/
RoleModelView.delete                     GET, POST  /roles/delete/
RoleModelView.download                   GET        /roles/download/
RoleModelView.edit                       GET, POST  /roles/edit/
RoleModelView.list                       GET        /roles/list/
RoleModelView.show                       GET        /roles/show/
SecurityApi.login                        POST       /api/v1/security/login
SecurityApi.refresh                      POST       /api/v1/security/refresh
SlaMissModelView.action                  GET, POST  /slamiss/action//
SlaMissModelView.action_post             POST       /slamiss/action_post
SlaMissModelView.add                     GET, POST  /slamiss/add
SlaMissModelView.api                     GET        /slamiss/api
SlaMissModelView.api_column_add          GET        /slamiss/api/column/add/
SlaMissModelView.api_column_edit         GET        /slamiss/api/column/edit/
SlaMissModelView.api_create              POST       /slamiss/api/create
SlaMissModelView.api_delete              DELETE     /slamiss/api/delete/
SlaMissModelView.api_get                 GET        /slamiss/api/get/
SlaMissModelView.api_read                GET        /slamiss/api/read
SlaMissModelView.api_readvalues          GET        /slamiss/api/readvalues
SlaMissModelView.api_update              PUT        /slamiss/api/update/
SlaMissModelView.delete                  GET, POST  /slamiss/delete/
SlaMissModelView.download                GET        /slamiss/download/
SlaMissModelView.edit                    GET, POST  /slamiss/edit/
SlaMissModelView.list                    GET        /slamiss/list/
SlaMissModelView.show                    GET        /slamiss/show/
TaskInstanceModelView.action             GET, POST  /taskinstance/action//
TaskInstanceModelView.action_post        POST       /taskinstance/action_post
TaskInstanceModelView.add                GET, POST  /taskinstance/add
TaskInstanceModelView.api                GET        /taskinstance/api
TaskInstanceModelView.api_column_add     GET        /taskinstance/api/column/add/
TaskInstanceModelView.api_column_edit    GET        /taskinstance/api/column/edit/
TaskInstanceModelView.api_create         POST       /taskinstance/api/create
TaskInstanceModelView.api_delete         DELETE     /taskinstance/api/delete/
TaskInstanceModelView.api_get            GET        /taskinstance/api/get/
TaskInstanceModelView.api_read           GET        /taskinstance/api/read
TaskInstanceModelView.api_readvalues     GET        /taskinstance/api/readvalues
TaskInstanceModelView.api_update         PUT        /taskinstance/api/update/
TaskInstanceModelView.delete             GET, POST  /taskinstance/delete/
TaskInstanceModelView.download           GET        /taskinstance/download/
TaskInstanceModelView.edit               GET, POST  /taskinstance/edit/
TaskInstanceModelView.list               GET        /taskinstance/list/
TaskInstanceModelView.show               GET        /taskinstance/show/
UserDBModelView.action                   GET, POST  /users/action//
UserDBModelView.action_post              POST       /users/action_post
UserDBModelView.add                      GET, POST  /users/add
UserDBModelView.api                      GET        /users/api
UserDBModelView.api_column_add           GET        /users/api/column/add/
UserDBModelView.api_column_edit          GET        /users/api/column/edit/
UserDBModelView.api_create               POST       /users/api/create
UserDBModelView.api_delete               DELETE     /users/api/delete/
UserDBModelView.api_get                  GET        /users/api/get/
UserDBModelView.api_read                 GET        /users/api/read
UserDBModelView.api_readvalues           GET        /users/api/readvalues
UserDBModelView.api_update               PUT        /users/api/update/
UserDBModelView.delete                   GET, POST  /users/delete/
UserDBModelView.download                 GET        /users/download/
UserDBModelView.edit                     GET, POST  /users/edit/
UserDBModelView.list                     GET        /users/list/
UserDBModelView.show                     GET        /users/show/
UserDBModelView.userinfo                 GET        /users/userinfo/
UserInfoEditView.this_form_get           GET        /userinfoeditview/form
UserInfoEditView.this_form_post          POST       /userinfoeditview/form
UserStatsChartView.chart                 GET        /userstatschartview/chart/
UserStatsChartView.chart                 GET        /userstatschartview/chart/
UtilView.back                            GET        /back
VariableModelView.action                 GET, POST  /variable/action//
VariableModelView.action_post            POST       /variable/action_post
VariableModelView.add                    GET, POST  /variable/add
VariableModelView.api                    GET        /variable/api
VariableModelView.api_column_add         GET        /variable/api/column/add/
VariableModelView.api_column_edit        GET        /variable/api/column/edit/
VariableModelView.api_create             POST       /variable/api/create
VariableModelView.api_delete             DELETE     /variable/api/delete/
VariableModelView.api_get                GET        /variable/api/get/
VariableModelView.api_read               GET        /variable/api/read
VariableModelView.api_readvalues         GET        /variable/api/readvalues
VariableModelView.api_update             PUT        /variable/api/update/
VariableModelView.delete                 GET, POST  /variable/delete/
VariableModelView.download               GET        /variable/download/
VariableModelView.edit                   GET, POST  /variable/edit/
VariableModelView.list                   GET        /variable/list/
VariableModelView.show                   GET        /variable/show/
VariableModelView.varimport              POST       /variable/varimport
VersionView.version                      GET        /version
ViewMenuModelView.action                 GET, POST  /viewmenus/action//
ViewMenuModelView.action_post            POST       /viewmenus/action_post
ViewMenuModelView.add                    GET, POST  /viewmenus/add
ViewMenuModelView.api                    GET        /viewmenus/api
ViewMenuModelView.api_column_add         GET        /viewmenus/api/column/add/
ViewMenuModelView.api_column_edit        GET        /viewmenus/api/column/edit/
ViewMenuModelView.api_create             POST       /viewmenus/api/create
ViewMenuModelView.api_delete             DELETE     /viewmenus/api/delete/
ViewMenuModelView.api_get                GET        /viewmenus/api/get/
ViewMenuModelView.api_read               GET        /viewmenus/api/read
ViewMenuModelView.api_readvalues         GET        /viewmenus/api/readvalues
ViewMenuModelView.api_update             PUT        /viewmenus/api/update/
ViewMenuModelView.delete                 GET, POST  /viewmenus/delete/
ViewMenuModelView.download               GET        /viewmenus/download/
ViewMenuModelView.edit                   GET, POST  /viewmenus/edit/
ViewMenuModelView.list                   GET        /viewmenus/list/
ViewMenuModelView.show                   GET        /viewmenus/show/
XComModelView.action                     GET, POST  /xcom/action//
XComModelView.action_post                POST       /xcom/action_post
XComModelView.add                        GET, POST  /xcom/add
XComModelView.api                        GET        /xcom/api
XComModelView.api_column_add             GET        /xcom/api/column/add/
XComModelView.api_column_edit            GET        /xcom/api/column/edit/
XComModelView.api_create                 POST       /xcom/api/create
XComModelView.api_delete                 DELETE     /xcom/api/delete/
XComModelView.api_get                    GET        /xcom/api/get/
XComModelView.api_read                   GET        /xcom/api/read
XComModelView.api_readvalues             GET        /xcom/api/readvalues
XComModelView.api_update                 PUT        /xcom/api/update/
XComModelView.delete                     GET, POST  /xcom/delete/
XComModelView.download                   GET        /xcom/download/
XComModelView.edit                       GET, POST  /xcom/edit/
XComModelView.list                       GET        /xcom/list/
XComModelView.show                       GET        /xcom/show/
api_experimental.create_pool             POST       /api/experimental/pools
api_experimental.dag_is_paused           GET        /api/experimental/dags//paused
api_experimental.dag_paused              GET        /api/experimental/dags//paused/
api_experimental.dag_run_status          GET        /api/experimental/dags//dag_runs/
api_experimental.dag_runs                GET        /api/experimental/dags//dag_runs
api_experimental.delete_dag              DELETE     /api/experimental/dags/
api_experimental.delete_pool             DELETE     /api/experimental/pools/
api_experimental.get_dag_code            GET        /api/experimental/dags//code
api_experimental.get_lineage             GET        /api/experimental/lineage//
api_experimental.get_pool                GET        /api/experimental/pools/
api_experimental.get_pools               GET        /api/experimental/pools
api_experimental.info                    GET        /api/experimental/info
api_experimental.latest_dag_runs         GET        /api/experimental/latest_runs
api_experimental.task_info               GET        /api/experimental/dags//tasks/
api_experimental.task_instance_info      GET        /api/experimental/dags//dag_runs//tasks/
api_experimental.test                    GET        /api/experimental/test
api_experimental.trigger_dag             POST       /api/experimental/dags//dag_runs
appbuilder.static                        GET        /static/appbuilder/
routes.index                             GET        /
static                                   GET        /static/
```

rcjsuen commented 4 years ago

@mik-laj I have created #8279 which addresses a single endpoint for a start. Please let me know what you think.

mik-laj commented 4 years ago

This is a user experience problem, but it is also a security problem. If we see messages like these, it means that we are not validating input data sufficiently. Data validation is the basic method of protecting against serious attacks from the "Injection" family, e.g. SQL injection. Input validation should happen as early as possible in the data flow, preferably as soon as the data is received from the client. However, we do not have any validation for many parameters.

More information: https://cheatsheetseries.owasp.org/cheatsheets/Input_Validation_Cheat_Sheet.html
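
As a rough illustration of validating early, the sketch below checks the dag_id and days query parameters before any lookup happens. The allowed character set, the 250-character limit, the default of 30 days, the 1 to 365 range, and the names parse_chart_query and InvalidParameter are all assumptions made for this example, not Airflow's actual rules.

```python
import re
from typing import Dict, Union
from urllib.parse import parse_qs

# Assumed allow-list for dag_id; the real constraints may differ.
DAG_ID_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,250}")
# Assumed upper bound for the "days" parameter.
MAX_DAYS = 365


class InvalidParameter(ValueError):
    """Raised when a request parameter fails validation (hypothetical)."""


def parse_chart_query(query: str) -> Dict[str, Union[str, int]]:
    """Validate a query string such as 'dag_id=foo&days=30' up front."""
    params = parse_qs(query)

    # Missing or blank dag_id falls back to "" and fails the pattern check.
    dag_id = params.get("dag_id", [""])[0]
    if not DAG_ID_PATTERN.fullmatch(dag_id):
        raise InvalidParameter("dag_id is missing or contains unsupported characters")

    # Assumed default of 30 days when the parameter is absent.
    raw_days = params.get("days", ["30"])[0]
    try:
        days = int(raw_days)
    except ValueError:
        raise InvalidParameter("days must be an integer") from None
    if not 1 <= days <= MAX_DAYS:
        raise InvalidParameter(f"days must be between 1 and {MAX_DAYS}")

    return {"dag_id": dag_id, "days": days}
```

A caller would catch InvalidParameter and turn it into a 400 response with the message, so the view body only ever sees values that passed the checks.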

2796gaurav commented 4 years ago

Hi @mik-laj, as discussed, let me take this on. Let me know if there is any additional info you would like to provide.

mik-laj commented 4 years ago

@2796gaurav We already have one draft. Please read it and its comments, and then you will be able to start working on this change.

2796gaurav commented 4 years ago

Yes sure!

mik-laj commented 4 years ago

I also invite you to read our guides:

https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst
https://github.com/apache/airflow/blob/master/BREEZE.rst
https://github.com/apache/airflow/blob/master/LOCAL_VIRTUALENV.rst
https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst
https://github.com/apache/airflow/blob/master/TESTING.rst

There is a lot of information there about our work environment and community.

mik-laj commented 3 years ago

@2796gaurav I unassigned you from this ticket so that the next contributor can start working. If you want to continue working, let me know and I will assign you again.

uranusjr commented 3 years ago

I wonder if type checks can be used to catch these kinds of issues. There are a lot of current_app.dag_bag, current_app.appbuilder, etc., which are not covered by Mypy due to the dynamic nature of flask.current_app (basically any attribute on it is Any). Since we “know” what attributes to expect on the current_app instance within Airflow, maybe we can introduce a typing shim around it? Something like:

# airflow/www/app.py

from typing import TYPE_CHECKING, Protocol, cast

if TYPE_CHECKING:
    from airflow.models.dagbag import DagBag
    from airflow.www.security import AirflowSecurityManager

    class _AirflowAppBuilder(Protocol):
        sm: AirflowSecurityManager
        ...

    class _CurrentApp(Protocol):
        dag_bag: DagBag
        appbuilder: _AirflowAppBuilder
        ...

from flask import current_app as _current_app

current_app = cast("_CurrentApp", _current_app)

And then all code can import this instead of directly from Flask to be type checked.

Alternatively, we can supply a type stub flask.pyi to “lie to” Mypy that flask.current_app is _CurrentApp.

StylusEater commented 4 months ago

This issue seems to be resolved in the latest version. The following links, given in the original issue, return a reasonable error and valid page without going "nuclear":

http://localhost:28080/tries?dag_id=example_automl_text_sentiment2&days=30
http://localhost:28080/landing_times?dag_id=example_automl_text_sentiment2&days=30
http://localhost:28080/gantt?dag_id=example_automl_text_sentiment2
http://localhost:28080/dag_details?dag_id=example_automl_text_sentiment2
http://localhost:28080/code?dag_id=example_automl_text_sentiment2