apache / airflow

Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
https://airflow.apache.org/
Apache License 2.0

Airflow 2.6.0 scheduler not starting #31353

Closed nitra-1 closed 1 year ago

nitra-1 commented 1 year ago

Apache Airflow version

Other Airflow 2 version (please specify below)

What happened

When I start the Airflow scheduler, it fails with an error. The error log is below.

$ airflow scheduler


[2023-05-17T18:15:39.249+0530] {executor_loader.py:114} INFO - Loaded executor: LocalExecutor
[2023-05-17 18:15:39 +0530] [131398] [INFO] Starting gunicorn 20.1.0
[2023-05-17 18:15:39 +0530] [131398] [INFO] Using worker: sync
[2023-05-17 18:15:39 +0530] [131399] [INFO] Booting worker with pid: 131399
[2023-05-17T18:15:39.271+0530] {scheduler_job_runner.py:823} INFO - Starting the scheduler
[2023-05-17T18:15:39.271+0530] {scheduler_job_runner.py:830} INFO - Processing each file at most -1 times
[2023-05-17 18:15:39 +0530] [131477] [INFO] Booting worker with pid: 131477
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/lib/python3.8/multiprocessing/managers.py", line 192, in accepter
    t.start()
  File "/usr/lib/python3.8/threading.py", line 852, in start
    _start_new_thread(self._bootstrap, ())
RuntimeError: can't start new thread
[2023-05-17T18:15:39.334+0530] {scheduler_job_runner.py:887} ERROR - Exception when executing SchedulerJob._run_scheduler_loop
Traceback (most recent call last):
  File "/home/webadmin/.local/lib/python3.8/site-packages/airflow/jobs/scheduler_job_runner.py", line 861, in _execute
    self.job.executor.start()
  File "/home/webadmin/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 366, in start
    self.impl.start()
  File "/home/webadmin/.local/lib/python3.8/site-packages/airflow/executors/local_executor.py", line 307, in start
    worker.start()
  File "/usr/lib/python3.8/multiprocessing/process.py", line 121, in start
    self._popen = self._Popen(self)
  File "/usr/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "/usr/lib/python3.8/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/usr/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
    self._launch(process_obj)
  File "/usr/lib/python3.8/multiprocessing/popen_fork.py", line 70, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
[2023-05-17T18:15:39.337+0530] {local_executor.py:399} INFO - Shutting down LocalExecutor; waiting for running tasks to finish. Signal again if you don't want to wait.

What you think should happen instead

The scheduler should start cleanly.

How to reproduce

Start the Airflow scheduler with the command: airflow scheduler

Operating System

NAME="Ubuntu" VERSION="Ubuntu 20.04.6 LTS"

Versions of Apache Airflow Providers

2.6.0

Deployment

Other

Deployment details

Package Version


adal 1.2.7 aiobotocore 2.5.0 aiofiles 23.1.0 aiohttp 3.8.4 aioitertools 0.11.0 aiosignal 1.3.1 airflow-code-editor 7.2.1 alembic 1.10.4 aliyun-python-sdk-core 2.13.36 aliyun-python-sdk-kms 2.16.0 amqp 5.1.1 analytics-python 1.4.post1 ansiwrap 0.8.4 anyio 3.6.2 apache-airflow 2.6.0 apache-airflow-providers-airbyte 3.2.1 apache-airflow-providers-alibaba 2.3.0 apache-airflow-providers-amazon 8.0.0 apache-airflow-providers-apache-beam 5.0.0 apache-airflow-providers-apache-cassandra 3.1.1 apache-airflow-providers-apache-drill 2.3.2 apache-airflow-providers-apache-druid 3.3.1 apache-airflow-providers-apache-flink 1.0.1 apache-airflow-providers-apache-hdfs 3.2.1 apache-airflow-providers-apache-hive 6.0.0 apache-airflow-providers-apache-impala 1.0.0 apache-airflow-providers-apache-kylin 3.1.0 apache-airflow-providers-apache-livy 3.4.0 apache-airflow-providers-apache-pig 4.0.0 apache-airflow-providers-apache-pinot 4.0.1 apache-airflow-providers-apache-spark 4.0.1 apache-airflow-providers-apache-sqoop 3.1.1 apache-airflow-providers-arangodb 2.1.1 apache-airflow-providers-asana 2.1.0 apache-airflow-providers-atlassian-jira 2.0.1 apache-airflow-providers-celery 3.1.0 apache-airflow-providers-cloudant 3.1.0 apache-airflow-providers-cncf-kubernetes 6.1.0 apache-airflow-providers-common-sql 1.4.0 apache-airflow-providers-databricks 4.1.0 apache-airflow-providers-datadog 3.2.0 apache-airflow-providers-dbt-cloud 3.1.1 apache-airflow-providers-dingding 3.1.0 apache-airflow-providers-discord 3.1.0 apache-airflow-providers-docker 3.6.0 apache-airflow-providers-elasticsearch 4.4.0 apache-airflow-providers-exasol 4.1.3 apache-airflow-providers-facebook 3.1.0 apache-airflow-providers-ftp 3.3.1 apache-airflow-providers-github 2.2.1 apache-airflow-providers-google 10.0.0 apache-airflow-providers-grpc 3.1.0 apache-airflow-providers-hashicorp 3.3.1 apache-airflow-providers-http 4.3.0 apache-airflow-providers-imap 3.1.1 apache-airflow-providers-influxdb 2.1.0 apache-airflow-providers-jdbc 
3.3.0 apache-airflow-providers-jenkins 3.2.1 apache-airflow-providers-microsoft-azure 6.0.0 apache-airflow-providers-microsoft-mssql 3.3.2 apache-airflow-providers-microsoft-psrp 2.2.0 apache-airflow-providers-microsoft-winrm 3.1.1 apache-airflow-providers-mongo 3.1.1 apache-airflow-providers-mysql 5.0.0 apache-airflow-providers-neo4j 3.2.1 apache-airflow-providers-odbc 3.2.1 apache-airflow-providers-openfaas 3.1.0 apache-airflow-providers-opsgenie 5.0.0 apache-airflow-providers-oracle 3.6.0 apache-airflow-providers-pagerduty 3.1.0 apache-airflow-providers-papermill 3.1.1 apache-airflow-providers-plexus 3.1.0 apache-airflow-providers-postgres 5.4.0 apache-airflow-providers-presto 5.0.0 apache-airflow-providers-qubole 3.3.1 apache-airflow-providers-redis 3.1.0 apache-airflow-providers-salesforce 5.3.0 apache-airflow-providers-samba 4.1.0 apache-airflow-providers-segment 3.1.0 apache-airflow-providers-sendgrid 3.1.0 apache-airflow-providers-sftp 4.2.4 apache-airflow-providers-singularity 3.1.0 apache-airflow-providers-slack 7.2.0 apache-airflow-providers-smtp 1.0.1 apache-airflow-providers-snowflake 4.0.5 apache-airflow-providers-sqlite 3.3.2 apache-airflow-providers-ssh 3.6.0 apache-airflow-providers-tableau 4.1.0 apache-airflow-providers-tabular 1.1.0 apache-airflow-providers-telegram 4.0.0 apache-airflow-providers-trino 5.0.0 apache-airflow-providers-vertica 3.3.1 apache-airflow-providers-yandex 3.3.0 apache-airflow-providers-zendesk 4.2.0 apache-beam 2.46.0 apispec 5.2.2 appdirs 1.4.4 argcomplete 3.0.8 arrow 1.2.3 asana 3.2.1 asgiref 3.6.0 asn1crypto 1.5.1 async-timeout 4.0.2 atlasclient 1.0.0 atlassian-python-api 3.36.0 attrs 23.1.0 Authlib 1.2.0 Automat 0.8.0 azure-batch 13.0.0 azure-common 1.1.28 azure-core 1.26.4 azure-cosmos 4.3.1 azure-datalake-store 0.0.52 azure-identity 1.12.0 azure-keyvault-secrets 4.7.0 azure-kusto-data 0.0.45 azure-mgmt-containerinstance 1.5.0 azure-mgmt-core 1.4.0 azure-mgmt-datafactory 1.1.0 azure-mgmt-datalake-nspkg 3.0.1 
azure-mgmt-datalake-store 0.5.0 azure-mgmt-nspkg 3.0.2 azure-mgmt-resource 23.0.0 azure-nspkg 3.0.2 azure-servicebus 7.9.0 azure-storage-blob 12.16.0 azure-storage-common 2.1.0 azure-storage-file 2.1.0 azure-storage-file-datalake 12.11.0 azure-synapse-spark 0.7.0 Babel 2.12.1 backcall 0.2.0 backoff 1.10.0 backports.zoneinfo 0.2.1 bcrypt 4.0.1 beautifulsoup4 4.12.2 billiard 3.6.4.0 bitarray 2.7.3 black 23.1a1 blinker 1.6.2 boto 2.49.0 boto3 1.26.76 botocore 1.29.76 cachelib 0.9.0 cachetools 5.3.0 cassandra-driver 3.26.0 cattrs 22.2.0 celery 5.2.7 certifi 2022.12.7 cffi 1.15.1 cgroupspy 0.2.2 chardet 5.1.0 charset-normalizer 2.1.1 ciso8601 2.3.0 click 8.1.3 click-didyoumean 0.3.0 click-plugins 1.1.1 click-repl 0.2.0 clickclick 20.10.2 cloud-init 23.1.2 cloudant 2.15.0 cloudpickle 2.2.1 colorama 0.4.6 colorlog 4.8.0 command-not-found 0.3 configobj 5.0.6 ConfigUpdater 3.1.1 connexion 2.14.2 constantly 15.1.0 crcmod 1.7 cron-descriptor 1.2.35 croniter 1.3.14 cryptography 40.0.2 curlify 2.2.1 dask 2022.2.0 databricks-sql-connector 2.5.1 datadog 0.45.0 db-dtypes 1.1.1 dbus-python 1.2.16 decorator 5.1.1 defusedxml 0.7.1 Deprecated 1.2.13 dill 0.3.1.1 distlib 0.3.6 distributed 2022.2.0 distro 1.4.0 distro-info 0.23ubuntu1 dnspython 2.3.0 docker 6.0.1 docopt 0.6.2 docutils 0.19 ec2-hibinit-agent 1.0.0 elasticsearch 7.13.4 elasticsearch-dbapi 0.2.10 elasticsearch-dsl 7.4.1 email-validator 1.3.1 entrypoints 0.4 et-xmlfile 1.1.0 eventlet 0.33.3 exceptiongroup 1.1.1 facebook-business 16.0.2 fastavro 1.7.3 fasteners 0.18 fastjsonschema 2.16.3 filelock 3.12.0 Flask 2.2.4 Flask-AppBuilder 4.3.0 Flask-Babel 2.0.0 Flask-Bcrypt 1.0.1 Flask-Caching 2.0.2 Flask-JWT-Extended 4.4.4 Flask-Limiter 3.3.0 Flask-Login 0.6.2 Flask-Session 0.4.0 Flask-SQLAlchemy 2.5.1 Flask-WTF 1.1.1 flower 1.2.0 frozenlist 1.3.3 fs 2.4.16 fsspec 2023.1.0 future 0.18.3 gcloud-aio-auth 4.2.1 gcloud-aio-bigquery 6.3.0 gcloud-aio-storage 8.2.0 gcsfs 2023.1.0 geomet 0.2.1.post1 gevent 22.10.2 google-api-core 2.8.2 
google-api-python-client 1.12.11 google-auth 2.17.3 google-auth-httplib2 0.1.0 google-auth-oauthlib 0.8.0 google-cloud-aiplatform 1.16.1 google-cloud-appengine-logging 1.1.3 google-cloud-audit-log 0.2.4 google-cloud-automl 2.8.0 google-cloud-bigquery 2.34.4 google-cloud-bigquery-datatransfer 3.7.0 google-cloud-bigquery-storage 2.14.1 google-cloud-bigtable 2.11.1 google-cloud-build 3.9.0 google-cloud-compute 0.7.0 google-cloud-container 2.11.1 google-cloud-core 2.3.2 google-cloud-datacatalog 3.9.0 google-cloud-dataflow-client 0.5.4 google-cloud-dataform 0.2.0 google-cloud-dataplex 1.1.0 google-cloud-dataproc 5.0.0 google-cloud-dataproc-metastore 1.6.0 google-cloud-dlp 3.8.0 google-cloud-kms 2.12.0 google-cloud-language 1.3.2 google-cloud-logging 3.2.1 google-cloud-memcache 1.4.1 google-cloud-monitoring 2.11.0 google-cloud-orchestration-airflow 1.4.1 google-cloud-os-login 2.7.1 google-cloud-pubsub 2.13.5 google-cloud-redis 2.9.0 google-cloud-resource-manager 1.6.0 google-cloud-secret-manager 1.0.2 google-cloud-spanner 1.19.3 google-cloud-speech 1.3.4 google-cloud-storage 2.8.0 google-cloud-tasks 2.10.1 google-cloud-texttospeech 1.0.3 google-cloud-translate 1.7.2 google-cloud-videointelligence 1.16.3 google-cloud-vision 1.0.2 google-cloud-workflows 1.7.1 google-crc32c 1.5.0 google-resumable-media 2.5.0 googleapis-common-protos 1.56.4 gpg 1.13.1-unknown graphviz 0.20.1 greenlet 2.0.2 grpc-google-iam-v1 0.12.4 grpcio 1.54.0 grpcio-gcp 0.2.2 grpcio-status 1.48.2 gssapi 1.8.2 gunicorn 20.1.0 h11 0.14.0 hdfs 2.7.0 HeapDict 1.0.1 hibagent 1.0.1 hmsclient 0.1.1 httpcore 0.16.3 httplib2 0.21.0 httpx 0.23.3 humanize 4.6.0 hvac 1.1.0 hyperlink 19.0.0 idna 3.4 ijson 3.2.0.post0 importlib-metadata 4.13.0 importlib-resources 5.12.0 impyla 0.18.0 incremental 16.10.1 inflection 0.5.1 influxdb-client 1.36.1 ipython 7.34.0 isodate 0.6.1 itsdangerous 2.1.2 JayDeBeApi 1.2.3 jedi 0.18.2 Jinja2 3.1.2 jmespath 0.10.0 JPype1 1.4.1 json-merge-patch 0.2 jsonpatch 1.22 jsonpath-ng 1.5.3 
jsonpointer 2.0 jsonschema 4.17.3 jupyter_client 7.4.9 jupyter_core 4.12.0 keyring 18.0.1 kombu 5.2.4 krb5 0.5.0 kubernetes 23.6.0 kubernetes-asyncio 24.2.2 kylinpy 2.8.4 language-selector 0.1 launchpadlib 1.10.13 lazr.restfulclient 0.14.2 lazr.uri 1.0.3 lazy-object-proxy 1.9.0 ldap3 2.9.1 limits 3.4.0 linkify-it-py 2.0.0 locket 1.0.0 lockfile 0.12.2 looker-sdk 23.6.0 lxml 4.9.2 lz4 4.3.2 Mako 1.2.4 Markdown 3.4.3 markdown-it-py 2.2.0 MarkupSafe 2.1.2 marshmallow 3.19.0 marshmallow-enum 1.5.1 marshmallow-oneofschema 3.0.1 marshmallow-sqlalchemy 0.26.1 matplotlib-inline 0.1.6 mdit-py-plugins 0.3.5 mdurl 0.1.2 monotonic 1.6 more-itertools 4.2.0 msal 1.22.0 msal-extensions 1.0.0 msgpack 1.0.5 msrest 0.7.1 msrestazure 0.6.4 multi-key-dict 2.0.3 multidict 6.0.4 mypy-boto3-appflow 1.26.123 mypy-boto3-rds 1.26.116 mypy-boto3-redshift-data 1.26.109 mypy-extensions 1.0.0 mysqlclient 2.1.1 nbclient 0.7.4 nbformat 5.8.0 neo4j 5.8.0 nest-asyncio 1.5.6 netifaces 0.10.4 numpy 1.21.6 oauthlib 3.2.2 objsize 0.6.1 openpyxl 3.1.2 opentelemetry-api 1.15.0 opentelemetry-exporter-otlp 1.15.0 opentelemetry-exporter-otlp-proto-grpc 1.15.0 opentelemetry-exporter-otlp-proto-http 1.15.0 opentelemetry-exporter-prometheus 1.12.0rc1 opentelemetry-proto 1.15.0 opentelemetry-sdk 1.15.0 opentelemetry-semantic-conventions 0.36b0 opsgenie-sdk 2.1.5 oracledb 1.3.1 ordered-set 4.1.0 orjson 3.8.11 oscrypto 1.3.0 oss2 2.17.0 packaging 21.3 pandas 1.3.5 pandas-gbq 0.17.9 papermill 2.4.0 paramiko 3.1.0 parso 0.8.3 partd 1.4.0 pathspec 0.9.0 pbr 5.11.1 pdpyras 4.5.2 pendulum 2.1.2 pexpect 4.8.0 pickleshare 0.7.5 pinotdb 0.4.14 pip 23.1.2 pkgutil_resolve_name 1.3.10 platformdirs 3.5.0 pluggy 1.0.0 ply 3.11 plyvel 1.5.0 portalocker 2.7.0 presto-python-client 0.8.3 prison 0.2.1 prometheus-client 0.16.0 prompt-toolkit 3.0.38 proto-plus 1.19.6 protobuf 3.20.0 psutil 5.9.5 psycopg2-binary 2.9.6 ptyprocess 0.7.0 pure-sasl 0.6.2 py4j 0.10.9.7 pyarrow 9.0.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pycountry 22.3.5 
pycparser 2.21 pycrypto 2.6.1 pycryptodome 3.17 pycryptodomex 3.17 pydantic 1.10.7 pydata-google-auth 1.7.0 pydot 1.4.2 pydruid 0.6.5 pyexasol 0.25.2 PyGithub 1.58.1 Pygments 2.15.1 PyGObject 3.36.0 PyHamcrest 1.9.0 pyhcl 0.4.4 PyHive 0.6.5 PyJWT 2.6.0 pykerberos 1.2.4 pymacaroons 0.13.0 pymongo 3.13.0 pymssql 2.2.7 PyNaCl 1.5.0 pyodbc 4.0.39 pyOpenSSL 23.1.1 pyparsing 3.0.9 pypsrp 0.8.1 pyrsistent 0.19.3 pyserial 3.4 pyspark 3.4.0 pyspnego 0.9.0 python-apt 2.0.1+ubuntu0.20.4.1 python-arango 7.5.6 python-daemon 3.0.1 python-dateutil 2.8.2 python-debian 0.1.36ubuntu1 python-dotenv 0.21.1 python-http-client 3.3.7 python-jenkins 1.7.0 python-ldap 3.4.3 python-nvd3 0.15.0 python-slugify 8.0.1 python-telegram-bot 20.2 pytz 2023.3 pytz-deprecation-shim 0.1.0.post0 pytzdata 2020.1 pywinrm 0.4.3 PyYAML 6.0 pyzmq 25.0.2 qds-sdk 1.16.1 reactivex 4.0.4 redis 3.5.3 redshift-connector 2.0.910 regex 2022.10.31 requests 2.29.0 requests-file 1.5.1 requests-kerberos 0.14.0 requests-ntlm 1.2.0 requests-oauthlib 1.3.1 requests-toolbelt 0.10.1 requests-unixsocket 0.2.0 rfc3339-validator 0.1.4 rfc3986 1.5.0 rich 13.3.5 rich_argparse 1.1.0 rsa 4.9 s3transfer 0.6.0 sasl 0.3.1 scramp 1.4.4 scrapbook 0.5.0 SecretStorage 2.3.1 sendgrid 6.10.0 sentry-sdk 1.21.1 service-identity 18.1.0 setproctitle 1.3.2 setuptools 66.1.1 simple-salesforce 1.12.3 simplejson 3.16.0 six 1.16.0 slack-sdk 3.21.3 smbprotocol 1.10.1 snakebite-py3 3.0.5 sniffio 1.3.0 snowflake-connector-python 3.0.3 snowflake-sqlalchemy 1.4.7 sortedcontainers 2.4.0 sos 4.4 soupsieve 2.4.1 spython 0.3.0 SQLAlchemy 1.4.47 sqlalchemy-bigquery 1.6.1 sqlalchemy-drill 1.1.2 SQLAlchemy-JSONField 1.0.1.post0 sqlalchemy-redshift 0.8.14 SQLAlchemy-Utils 0.41.1 sqlparse 0.4.4 ssh-import-id 5.10 sshtunnel 0.4.0 starkbank-ecdsa 2.2.0 statsd 4.0.1 systemd-python 234 tableauserverclient 0.24 tabulate 0.9.0 tblib 1.7.0 tenacity 8.2.2 termcolor 2.3.0 text-unidecode 1.3 textwrap3 0.9.2 thrift 0.16.0 thrift-sasl 0.4.3 tomli 2.0.1 toolz 0.12.0 tornado 
6.2 tqdm 4.65.0 traitlets 5.9.0 trino 0.322.0 Twisted 18.9.0 typing_extensions 4.5.0 tzdata 2023.3 tzlocal 4.3 uamqp 1.6.4 ubuntu-advantage-tools 8001 uc-micro-py 1.0.1 ufw 0.36 unattended-upgrades 0.1 unicodecsv 0.14.1 uritemplate 3.0.1 urllib3 1.26.15 vertica-python 1.3.2 vine 5.0.0 virtualenv 20.21.1 wadllib 1.3.3 watchtower 2.0.1 wcwidth 0.2.6 websocket-client 1.5.1 Werkzeug 2.2.3 wheel 0.40.0 wrapt 1.15.0 WTForms 3.0.1 xmltodict 0.13.0 yandexcloud 0.210.0 yarl 1.9.2 zeep 4.2.1 zenpy 2.0.25 zict 2.2.0 zipp 3.15.0 zope.event 4.6 zope.interface 6.0 zstandard 0.21.0

Anything else

No response

Are you willing to submit PR?

Code of Conduct

potiuk commented 1 year ago

Well, you do not have enough memory. Why are you creating an issue in Airflow for that? We can't get you more memory here; you need to fix it on your side.

nitra-1 commented 1 year ago

What is the recommended configuration? I have a 4-core, 8 GB RAM Ubuntu machine and intend to run with LocalExecutor. I checked with htop: memory usage does not exceed 1 GB when I start the scheduler.

potiuk commented 1 year ago

There is no "recommended". It depends on what your tasks are doing, what your Python code does, what else you have on the machine, and whether you limit the Airflow process in some other way (cgroups etc.). The error is straightforward: when you start a process by forking, there is not enough memory. Why, I have no idea; only you, as admin of the machine, can tell.
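
For the admin-side investigation suggested above, here is a minimal sketch (not from the original thread) that prints a few settings which may be worth checking when os.fork() or thread creation fails even though htop shows free memory: per-user resource limits, the kernel overcommit policy, and cgroup limits. The helper names (fmt_limit, read_first_line, cgroup_files, collect_limits) are hypothetical, the cgroup paths assume the usual /sys/fs/cgroup mount layout, and the output only narrows down where to look; it does not identify the cause.

```python
import resource
from pathlib import Path


def fmt_limit(value):
    """Render an rlimit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def read_first_line(path):
    """Return the first line of a file, or None if it is missing or unreadable."""
    try:
        with open(path) as fh:
            return fh.readline().strip()
    except OSError:
        return None


def cgroup_files():
    """Yield candidate cgroup limit files for this process (v1 and v2 layouts).

    Assumption: cgroups are mounted under /sys/fs/cgroup. Files that do not
    exist are simply reported as None by read_first_line().
    """
    try:
        lines = Path("/proc/self/cgroup").read_text().splitlines()
    except OSError:
        return
    for line in lines:
        parts = line.split(":", 2)  # hierarchy-ID:controller-list:cgroup-path
        if len(parts) != 3:
            continue
        controllers, rel = parts[1].split(","), parts[2].lstrip("/")
        if parts[1] == "":  # cgroup v2 unified hierarchy
            yield Path("/sys/fs/cgroup", rel, "memory.max")
            yield Path("/sys/fs/cgroup", rel, "pids.max")
        elif "memory" in controllers:  # cgroup v1 memory controller
            yield Path("/sys/fs/cgroup/memory", rel, "memory.limit_in_bytes")
        elif "pids" in controllers:  # cgroup v1 pids controller
            yield Path("/sys/fs/cgroup/pids", rel, "pids.max")


def collect_limits():
    """Collect limits that can restrict new processes, threads, or memory."""
    report = {}
    for name in ("RLIMIT_NPROC", "RLIMIT_AS", "RLIMIT_DATA"):
        if hasattr(resource, name):  # not every platform defines every limit
            soft, hard = resource.getrlimit(getattr(resource, name))
            report[name] = f"soft={fmt_limit(soft)} hard={fmt_limit(hard)}"
    for key, path in (
        ("vm.overcommit_memory", "/proc/sys/vm/overcommit_memory"),
        ("kernel.threads-max", "/proc/sys/kernel/threads-max"),
        ("kernel.pid_max", "/proc/sys/kernel/pid_max"),
    ):
        report[key] = read_first_line(path)
    for path in cgroup_files():
        report[str(path)] = read_first_line(path)
    return report


if __name__ == "__main__":
    for key, value in collect_limits().items():
        print(f"{key}: {value}")
```

Run it as the same user and in the same environment (for example, the same shell, service unit, or container) that starts the scheduler, since these limits can differ between login sessions and services.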