Closed adamgorkaextbi closed 1 hour ago
Thanks for opening your first issue here! Be sure to follow the issue template! If you are willing to raise PR to address this issue please do so, no need to wait for approval.
Same here, occurs after upgrading from 2.8.2 to 2.10.1
Still the same issue on version 2.10.2.
nice video in duplicated issue: https://github.com/apache/airflow/issues/42243
Hi @adamgorkaextbi, are there any specific steps to reproduce this? I'm looking into this but can't seem to reproduce the issue on my end. Thanks!
@dannyl1u How to reproduce: use the official Airflow image and Helm chart with PostgreSQL as the database backend. Airflow uses UTC and the server runs in UTC, while users work in CEST (+2h versus UTC). DAGs are triggered externally. Migrate from Airflow 2.7 (which has its own Gantt chart issues) to 2.9.3 (where the flickering appears), then to 2.10.1 and 2.10.2. In my opinion this leads to incorrect "Queued at" values being generated, both old and new ones: since Airflow 2.7, "Queued at" has been 2 hours after the start date, although it should come before the start date. This may cause the front end to constantly and incorrectly recompute the positions (padding and width) of tasks on the Gantt chart.
I have checked, and queued_dttm (displayed as "Queued at") has the correct value stored in the database (datetime with time zone). I suspect the front end is processing or receiving queued_dttm without taking the time zone of this field into account. That would explain why we see +2h in the web UI and why the Gantt chart goes crazy.
The task details and the Gantt chart both use the useGridData() React hook to get their data.
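The suspected effect can be illustrated with a small sketch. It is not Airflow code: `parseAware` and `parseNaiveAs` are hypothetical helpers, and the timestamps are made up. It only shows how a timestamp without an offset lands two hours late when it is read as UTC although it was written as CEST wall-clock time.

```typescript
// Hypothetical helpers for illustration only; not part of the Airflow code base.

/** Parse a timestamp that carries an explicit offset (or "Z"). */
function parseAware(ts: string): number {
  return Date.parse(ts);
}

/**
 * Parse a timestamp WITHOUT an offset as wall-clock time in a zone that is
 * `offsetMinutes` ahead of UTC (120 for CEST, 0 for UTC).
 */
function parseNaiveAs(ts: string, offsetMinutes: number): number {
  // Appending "Z" forces UTC parsing, so the result does not depend on the
  // time zone of the machine running this code.
  const asUtc = Date.parse(`${ts}Z`);
  return asUtc - offsetMinutes * 60_000;
}

// Made-up example values: the run started at 10:00:05 UTC, and the queued
// timestamp was stored without an offset as 12:00:00 (CEST wall clock).
const startMs = parseAware("2024-09-16T10:00:05+00:00");
const queuedReadAsUtc = parseNaiveAs("2024-09-16T12:00:00", 0);
const queuedReadAsCest = parseNaiveAs("2024-09-16T12:00:00", 120);

// Read as UTC, "queued" appears almost two hours AFTER the start.
console.log((queuedReadAsUtc - startMs) / 1000, "seconds after start");
// Read as CEST, "queued" is a few seconds BEFORE the start, as expected.
console.log((startMs - queuedReadAsCest) / 1000, "seconds before start");
```

Whether the wrong interpretation happens in the database, the API layer, or the front end is exactly what this thread is trying to establish; the sketch only shows the arithmetic.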
Read these in order:
1. https://github.com/apache/airflow/blob/6fa33191bf13d0c2caa33bb7593f1136a5eead45/airflow/www/static/js/types/api-generated.ts#L1521
2. https://github.com/apache/airflow/blob/6fa33191bf13d0c2caa33bb7593f1136a5eead45/airflow/api_connexion/schemas/task_instance_schema.py#L23
3. https://github.com/apache/airflow/blob/6fa33191bf13d0c2caa33bb7593f1136a5eead45/airflow/api_connexion/schemas/task_instance_schema.py#L67
4. https://github.com/marshmallow-code/marshmallow-sqlalchemy/blob/bd7d9edaac005c3aa6ac0c5e0b58a6d42619c8b6/src/marshmallow_sqlalchemy/schema.py#L191
5. https://github.com/marshmallow-code/marshmallow-sqlalchemy/blob/bd7d9edaac005c3aa6ac0c5e0b58a6d42619c8b6/src/marshmallow_sqlalchemy/schema.py#L191-L213
6. https://github.com/marshmallow-code/marshmallow/blob/6a0389b6aca79e95c923adb55f497f20e75e0d86/src/marshmallow/fields.py#L90-L92
7. https://github.com/marshmallow-code/marshmallow/blob/6a0389b6aca79e95c923adb55f497f20e75e0d86/src/marshmallow/fields.py#L710-L729
I suspect the bug could be in the marshmallow repository. So I guess the solution would be to not rename queued_dttm to queued_when (string) with auto_field(data_key="queued_when"). start_date and end_date are correctly converted to strings while queued_dttm is not; the only difference I have found so far is the use of data_key.
@adamgorkaextbi @Shlomixg I've tried it on my DAGs (also on 2.10.0) and the Gantt charts are fine on my end. Could you please share your DAG file? If it contains sensitive information, a sanitized or similar version would be fine, as long as it still reproduces the Gantt chart problem
@dannyl1u I guess you need to apply the Airflow migration scripts to reproduce the issue.
I checked the Airflow schema again after the migration. Previously I had only checked the task_instance table, where queued_dttm has the correct type, timestamptz. Today I also checked the dag_run table schema: according to the Airflow DB model, queued_at should be timestamptz. @dannyl1u Can you double-check the schema on your side?
I guess we will try to change this column type manually in our DB and let you know whether this fixes the issue. I still think one of the migration scripts will need fixing.
This applies if you are running Airflow in a positive-offset time zone (Europe/Asia). US time zones are not affected by the flickering, but the data still has the incorrect type in the DB.
SOLUTION:
"""
ALTER TABLE .dag_run ALTER COLUMN queued_at TYPE timestamptz USING queued_at AT TIME ZONE '
TODO:
- Correct the Airflow DB migration script so that it sets the correct types for these columns during migration.
- Fix the UI React logic to deal with an incorrect timestamp order (start_date, queued_at, end_date), or check the timestamp order and report an exception instead of flickering or other unexpected behavior.
- Add unit tests for processing incorrect data on the front end in the case of the Gantt chart.
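The timestamp-order check mentioned in the TODO could look roughly like the sketch below. The `RunTimestamps` shape and the `checkTimestampOrder` function are hypothetical names chosen for illustration; they are not taken from the Airflow UI code. The function returns a list of problems instead of throwing, so a caller can decide whether to log, warn, or fall back to different chart bounds.

```typescript
// Hypothetical shape; field names are assumed for illustration.
interface RunTimestamps {
  queuedAt?: string | null;
  startDate?: string | null;
  endDate?: string | null;
}

/** Return human-readable problems if the timestamps are out of order. */
function checkTimestampOrder(run: RunTimestamps): string[] {
  const toMs = (value?: string | null): number =>
    value ? Date.parse(value) : NaN;

  const queued = toMs(run.queuedAt);
  const start = toMs(run.startDate);
  const end = toMs(run.endDate);
  const problems: string[] = [];

  // Only compare values that are present and parseable.
  if (!Number.isNaN(queued) && !Number.isNaN(start) && queued > start) {
    problems.push("queuedAt is later than startDate");
  }
  if (!Number.isNaN(start) && !Number.isNaN(end) && start > end) {
    problems.push("startDate is later than endDate");
  }
  return problems;
}

// Made-up example: "queued" two hours after the start, as described above.
console.log(
  checkTimestampOrder({
    queuedAt: "2024-09-16T12:00:00Z",
    startDate: "2024-09-16T10:00:05Z",
    endDate: "2024-09-16T10:10:00Z",
  })
);
```

A unit test for the Gantt chart could feed such out-of-order data into the component and assert that it renders once rather than rescaling repeatedly.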
Also getting this issue on 2.10.3. Happens on seemingly random DAGs with the gantt UI timescales constantly flickering.
To me, this happens when I have multiple retries for tasks.
This worked for me (in Paris): ALTER TABLE dag_run ALTER COLUMN queued_at TYPE timestamptz USING queued_at AT TIME ZONE 'Europe/Paris';
Airflow Version: v2.10.0 Git Version: .release:e001b88f5875cfd7e295891a0bbdbc75a3dccbfb Deployment: Official Apache Airflow Helm Chart
https://github.com/user-attachments/assets/a71da12d-a18d-454b-b8a2-8b16fd7db9d5
Same problem on 2.10.3, but as @leetdavid said, it occurs only when there is a retried task. We're using MySQL for the database, so it doesn't seem to be just a timezone issue.
For me, it seems more related to the fact that the Gantt graph attempts to show task executions present in task_instance for a run_id, but within the period between the queued_at and end_date of the dag_run. (If I manually change queued_at to encompass all task_instance, the flickering stops.)
The minimum date for the Gantt graph shouldn't be queued_at but rather the minimum start_date of task_instance
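A minimal sketch of that idea, assuming each task instance exposes `startDate` and `endDate` as ISO strings (the `TaskInstanceLike` shape and the `ganttBounds` name are hypothetical, not the Airflow UI's own types). The chart range is derived from the task instances alone, so a dag_run `queued_at` that does not cover them no longer matters.

```typescript
// Hypothetical shape; field names are assumed for illustration.
interface TaskInstanceLike {
  startDate?: string | null;
  endDate?: string | null;
}

interface Bounds {
  start: number; // epoch milliseconds
  end: number; // epoch milliseconds
}

/**
 * Compute chart bounds from the task instances themselves.
 * `nowMs` stands in for the end of tasks that are still running.
 */
function ganttBounds(
  instances: TaskInstanceLike[],
  nowMs: number
): Bounds | null {
  let start = Infinity;
  let end = -Infinity;
  for (const ti of instances) {
    if (!ti.startDate) continue; // not started yet, nothing to draw
    const s = Date.parse(ti.startDate);
    if (Number.isNaN(s)) continue; // skip unparseable values
    const e = ti.endDate ? Date.parse(ti.endDate) : nowMs;
    start = Math.min(start, s);
    end = Math.max(end, Number.isNaN(e) ? s : e);
  }
  return Number.isFinite(start) ? { start, end } : null;
}

// Made-up example with a retried task that started before the second try.
console.log(
  ganttBounds(
    [
      { startDate: "2024-09-16T10:00:00Z", endDate: "2024-09-16T10:05:00Z" },
      { startDate: "2024-09-16T09:50:00Z", endDate: "2024-09-16T09:55:00Z" },
    ],
    Date.parse("2024-09-16T11:00:00Z")
  )
);
```

Because the result depends only on its inputs, calling it again with the same task instances yields the same bounds, which is the property the flickering chart lacks.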
The closed pull request above should solve the problem, but I don't understand what to do about the validation error message...
basically this part of the airflow/www/static/js/dag/details/gantt/index.tsx file
// Reset state when the dagrun changes
useEffect(() => {
  if (startDate !== dagRun?.queuedAt && startDate !== dagRun?.startDate) {
    setStartDate(dagRun?.queuedAt || dagRun?.startDate);
  }
  if (!endDate || endDate !== dagRun?.endDate) {
    // @ts-ignore
    setEndDate(dagRun?.endDate ?? moment().add(1, "s").toString());
  }
}, [
  dagRun?.queuedAt,
  dagRun?.startDate,
  dagRun?.endDate,
  startDate,
  endDate,
]);
is triggered whenever startDate or endDate changes and sets dagRun?.queuedAt || dagRun?.startDate as the start date. Doing so seems to trigger a redraw of the graph, which calls setGanttDuration for each task instance. That in turn can change startDate/endDate again, resulting in an infinite loop of date changes.
My pull request tried to solve the issue by removing startDate/endDate from the dependency list of the "reset" effect. It works if I rebuild the JavaScript on my local Airflow instance, but it doesn't pass the GitHub validation process. It seems that just removing startDate/endDate makes the file inconsistent, and since my knowledge of React is almost zero, I will let someone with a better understanding fix this error.
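The loop described above can be reproduced with a toy model that needs no React. Everything here is an assumption made for illustration: times are plain numbers, `resetStart` mirrors the first `if` of the quoted effect, and `widenStart` stands in for what setGanttDuration is assumed to do (extend the range so that it covers a task instance). The real setGanttDuration is not reproduced here.

```typescript
// Toy model; all names are hypothetical and times are plain numbers.
interface Run {
  queuedAt: number;
  startDate: number;
}

// Mirrors the "reset" effect: snap the chart start back to the run's dates.
function resetStart(start: number, run: Run): number {
  return start !== run.queuedAt && start !== run.startDate
    ? run.queuedAt
    : start;
}

// ASSUMED behaviour of the per-task updater: widen the range to cover a task.
function widenStart(start: number, taskStart: number): number {
  return Math.min(start, taskStart);
}

/** Count how often the chart start changes over `rounds` render passes. */
function countChanges(run: Run, taskStarts: number[], rounds: number): number {
  let start = run.queuedAt;
  let changes = 0;
  for (let i = 0; i < rounds; i += 1) {
    const afterReset = resetStart(start, run);
    if (afterReset !== start) changes += 1;
    start = afterReset;
    for (const taskStart of taskStarts) {
      const widened = widenStart(start, taskStart);
      if (widened !== start) changes += 1;
      start = widened;
    }
  }
  return changes;
}

const run: Run = { queuedAt: 100, startDate: 105 };
// All tasks start after queuedAt: the start date settles immediately.
console.log(countChanges(run, [110, 120], 10));
// One task starts before queuedAt: the two updaters keep undoing each other.
console.log(countChanges(run, [90, 110], 10));
```

In this model the number of changes stays at zero when every task starts after `queuedAt` and grows with every pass when one task starts earlier, which matches the observation that the flickering stops once queued_at covers all task instances.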
Apache Airflow version
2.10.1
If "Other Airflow 2 version" selected, which one?
No response
What happened?
The Gantt chart is flickering due to constant rescaling. The "Queued at" time is computed incorrectly: +2h relative to the start and end time of the DAG.
What you think should happen instead?
I should see a correct Gantt chart, or at least one that is not flickering.
How to reproduce
We migrated from Airflow 2.7 to 2.9.3 (same Gantt issue) and 2.10.1.
Operating System
airflow docker release
Versions of Apache Airflow Providers
No response
Deployment
Official Apache Airflow Helm Chart
Deployment details
Helm chart
Anything else?
We migrated from Airflow 2.7 to 2.9.3 (same Gantt issue) and 2.10.1.
Are you willing to submit PR?
Code of Conduct