Closed iameugenejo closed 3 years ago
Thanks for opening your first issue here! Be sure to follow the issue template!
@iameugenejo can you share more details about the issue? How often does it happen? Is it affecting a specific DAG or all DAGs in the system?
Without reproduction steps or more information, it might be hard to understand the root cause.
@eladkal , it happened 5 times so far since 2/20.
It's happening to 1 specific dag.
The dag itself is static but the tasks the dag executes are generated dynamically.
The other dags that are not showing this symptom have their tasks statically coded.
This issue has been automatically marked as stale because it has been open for 30 days with no response from the author. It will be closed in next 7 days if no further activity occurs from the issue author.
@iameugenejo can you share the DAG code? we need more information here. If we can't reproduce it's almost impossible to find a fix.
this hasn't happened for the past month or so.
The following is the dag with some values redacted.
from datetime import timedelta  # needed for the timedelta(...) calls below
from airflow.models import DAG
from airflow.operators.bash import BashOperator
import sys
import pendulum
from airflow.utils import timezone

DAG_ID = '{REDACTED}'
now = pendulum.now(timezone.utc)
schedule_interval = '0 16 * * *'
start_date = now - timedelta(days=1)  # re-evaluated every time the file is parsed
max_active_runs = 1
num_of_tasks = 6
email_ids = '{REDACTED}'

with DAG(
    dag_id=DAG_ID,
    start_date=start_date,
    max_active_runs=max_active_runs,
    default_args={
        'owner': 'airflow',
        'start_date': start_date,
        'max_active_runs': max_active_runs,
        'email': email_ids,
        'email_on_failure': True,
        'email_on_retry': True
    },
    schedule_interval=schedule_interval,
    dagrun_timeout=timedelta(seconds=43200),  # 12 hours
    catchup=False
) as dag:
    tasks = []
    for i in range(0, num_of_tasks):
        tasks.append(BashOperator(
            task_id='redacted_'+str(i+1),
            retries=10,
            retry_delay=timedelta(seconds=60),  # 1 minute retry delay
            retry_exponential_backoff=True,
            max_retry_delay=timedelta(seconds=900),  # 15 minutes max retry delay
            do_xcom_push=True,  # return the last line from the stdout
            bash_command="REDACTED.sh {} {} ".format(int(i), int(num_of_tasks)),
            dag=dag))
        # chain each task after the previous one
        if i != 0:
            tasks[i-1] >> tasks[i]
@eladkal were you able to validate this? Just trying to get an idea of what the status is.
I wasn't able to reproduce it, but I think it's related to the dynamic start_date used in the DAG, which is a bad practice and can lead to all kinds of undesired behavior.
start_date = now - timedelta(days=1)
I'm inclined to close this issue.
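To illustrate the concern about the dynamic start_date: the scheduler re-parses the DAG file periodically, so a value derived from now() is different on every parse. The following is a minimal sketch in plain Python (no Airflow required). The names `dynamic_start_date` and `STATIC_START_DATE`, the parse times, and the chosen fixed date are all hypothetical and only serve the illustration; whether this drift actually caused the skipped run here is not confirmed.

```python
from datetime import datetime, timedelta, timezone


def dynamic_start_date(parse_time):
    """Mimic `start_date = now - timedelta(days=1)` evaluated at parse time."""
    return parse_time - timedelta(days=1)


# A fixed date, the pattern usually recommended instead (date chosen arbitrarily).
STATIC_START_DATE = datetime(2021, 2, 1, tzinfo=timezone.utc)

# Two hypothetical parses of the same DAG file, 30 seconds apart.
first_parse = datetime(2021, 2, 20, 15, 59, 45, tzinfo=timezone.utc)
second_parse = first_parse + timedelta(seconds=30)

start_a = dynamic_start_date(first_parse)
start_b = dynamic_start_date(second_parse)

# The dynamic value moves with every parse; the static one never does.
drift = start_b - start_a
print("dynamic start_date drift between parses:", drift)
print("static start_date:", STATIC_START_DATE.isoformat())
```

With a fixed start_date, every parse of the file sees the same value, which removes one moving part when debugging a missed run.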
Thanks @eladkal.
@iameugenejo are you able to replicate this bug even if you remove the dynamic start_date? If not, I agree with @eladkal that we can probably chalk it up to the dynamic start date.
@kaxil Is it possible/desirable to add a check for dynamic start dates and to throw an error or warning?
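As a rough idea of what such a check could look like, here is a heuristic sketch in plain Python. It is not an existing Airflow feature; `looks_dynamic`, its `window` parameter, and the two rules it applies are assumptions made for illustration, and the heuristic can produce false positives (for example, a brand-new DAG with a fixed start_date of today).

```python
import warnings
from datetime import datetime, timedelta, timezone


def looks_dynamic(start_date, now=None, window=timedelta(days=7)):
    """Guess whether a timezone-aware start_date was derived from now().

    Heuristic only: flags values with sub-second precision, or values
    that fall within `window` of the current time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    has_subsecond = start_date.microsecond != 0
    close_to_now = abs(now - start_date) <= window
    return has_subsecond or close_to_now


def warn_if_dynamic(dag_id, start_date, now=None):
    """Emit a warning (rather than an error) when the heuristic fires."""
    if looks_dynamic(start_date, now=now):
        warnings.warn(
            f"DAG {dag_id!r}: start_date {start_date.isoformat()} looks dynamic; "
            "consider a fixed date instead."
        )
        return True
    return False


# Example with a fixed 'current time' so the result is reproducible.
fake_now = datetime(2021, 3, 1, 12, 0, 0, 123456, tzinfo=timezone.utc)
dynamic_value = fake_now - timedelta(days=1)
static_value = datetime(2021, 1, 1, tzinfo=timezone.utc)

print(looks_dynamic(dynamic_value, now=fake_now))
print(looks_dynamic(static_value, now=fake_now))
```

A warning seems safer than an error for a heuristic like this, since a legitimately recent fixed date would otherwise block the DAG from loading.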
Dynamic start_date is still there and the issue hasn't happened for the past few months, so it might not be about the dynamic start_date.
But since I'm not seeing the issue anymore, I don't mind closing this issue and reopening it if it occurs again.
@iameugenejo Sounds good. I'll close it for now.
I'm having exactly the same issue as this user: https://www.reddit.com/r/dataengineering/comments/lri9fv/airflow_dag_is_skipping_a_day/
Apache Airflow version: 2.0.1
Environment:
uname -a: Linux {REDACTED} 5.4.0-64-generic #72~18.04.1-Ubuntu SMP Fri Jan 15 14:06:34 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
The scheduler log is there for the missing date without showing any errors.
And there were no manual runs at all for this dag.