snjypl / airflow-provider-grafana-loki

Airflow Provider plugin for writing and reading task logs to and from Grafana Loki
Apache License 2.0

Date intervals in Loki queries too wide? #10

Open zarembat opened 1 year ago

zarembat commented 1 year ago

Our logs take a long time to load from Loki. I suspect the cause is the very wide time range the provider uses when querying Loki:

start = ti.start_date - timedelta(days=15)
# If the task is running or queued, it will not have an end_date; in that
# case, we use a reasonable interval of 5 days.
end_date = ti.end_date or ti.start_date + timedelta(days=5)
end = end_date + timedelta(hours=1)

Why do we need to search for logs up to 15 days before the task's start_date? Shouldn't the search begin at the start_date itself?

Also it would be nice to be able to parametrize the end_date for non-finished tasks (currently hard-coded to 5 days but that may be too much depending on the use case).

snjypl commented 1 year ago

@zarembat Every time the task is retried, the start_date is updated to the current time, so there is no way to know the time range of the older tries. That is why we query the last 15 days.

Even so, if we retry the task after, say, 30 days, it won't be possible to get the logs for the older tries, because they fall outside the 15-day window.

I agree that we can make the range offset configurable.

Please feel free to open a PR with the changes, or I will make them when I get a chance.
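For illustration, here is a minimal sketch of what a configurable window could look like. It is not the provider's actual code: the function name loki_query_window and the parameters lookback, running_task_window and end_padding are hypothetical, and their defaults simply mirror the hard-coded values quoted above.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def loki_query_window(
    start_date: datetime,
    end_date: Optional[datetime] = None,
    lookback: timedelta = timedelta(days=15),            # hypothetical setting
    running_task_window: timedelta = timedelta(days=5),  # hypothetical setting
    end_padding: timedelta = timedelta(hours=1),         # hypothetical setting
) -> Tuple[datetime, datetime]:
    """Return the (start, end) time range to use for a Loki log query."""
    # Look back before start_date so that logs of earlier tries are included.
    start = start_date - lookback
    # A running or queued task has no end_date yet, so fall back to a window.
    effective_end = end_date or start_date + running_task_window
    return start, effective_end + end_padding


# Defaults reproduce the current behaviour for a task that is still running.
print(loki_query_window(datetime(2024, 1, 16)))

# A deployment with short tasks and no retries could narrow the range.
print(
    loki_query_window(
        datetime(2024, 1, 16),
        lookback=timedelta(0),
        running_task_window=timedelta(hours=12),
    )
)
```

The trade-off stays the same as described above: a smaller lookback makes queries cheaper, but logs of tries older than the lookback will not be found.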

ckljohn commented 11 months ago

I found that you can change the Loki setting split_queries_by_interval. The default value is 15m, which means every query sent to Loki is split into 15 * 24 * 4 (1440) sub-queries. I changed it to 24h in my environment and it helps a lot.
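The arithmetic behind that number can be sketched as follows. This is a simplified estimate, not Loki's actual splitting logic (which may align sub-queries to interval boundaries and so differ slightly); the helper name estimated_subqueries is hypothetical.

```python
import math
from datetime import timedelta


def estimated_subqueries(query_range: timedelta, split_interval: timedelta) -> int:
    """Roughly estimate how many sub-queries a time range is split into."""
    # Dividing two timedeltas yields a float ratio; round up for a partial interval.
    return math.ceil(query_range / split_interval)


lookback = timedelta(days=15)

# 15 days in 15-minute slices: 15 * 24 * 4
print(estimated_subqueries(lookback, timedelta(minutes=15)))  # 1440

# The same range with a 24h split interval
print(estimated_subqueries(lookback, timedelta(hours=24)))  # 15
```

Note that this counts only the 15-day lookback, as in the comment above. For a running task the full range also includes the 5-day window and the 1-hour padding, so the real number of sub-queries would be higher.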