Druid requests go from <100ms latency to 5,10,15 second latency a few minutes after startup. Possible connection limit issue?

allegro / turnilo

Business intelligence, data exploration and visualization web application for Druid, formerly known as Swiv and Pivot

Apache License 2.0

We run Turnilo v1.40.2 in a Docker container, and it uses Plywood to talk to a Druid cluster. When we enable verbose logging in Plywood, we see requests go from

Requester rq06461 got result from query 269: (in 56ms)
[
  {
    "maxTime": "2024-03-13T23:55:00.000Z"
  }
]

to

Requester rq06461 got result from query 352: (in 10019ms)
[
  {
    "maxTime": "2024-03-14T15:25:00.000Z"
  }
]
TimeMonitor Got the latest time for 'REDACTED' (2024-03-14T15:25:00.000Z)
vvvvvvvvvvvvvvvvvvvvvvvvvv
Requester rq06461 got result from query 350: (in 15283ms)
[
  {
    "maxTime": "2024-03-14T15:25:00.000Z",
    "minTime": "2024-03-12T15:00:00.000Z",
    "timestamp": "2024-03-12T15:00:00.000Z"
  }
]

approximately 2-4 minutes after the container starts. I'm trying to track down the source of the slowdown and figured I'd raise an issue here to see if anyone has any input. I'm not sure whether this is a Docker (Podman), Turnilo, or Plywood problem.
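
To pin down when the slowdown starts, it may help to pull the per-query durations out of the verbose log and flag the slow ones. The following is a minimal sketch, assuming the log lines keep the `Requester ... got result from query N: (in Xms)` shape shown in the excerpts above. The regular expression is inferred from those excerpts rather than from Plywood's source, and the function names and the 1000 ms threshold are hypothetical choices for illustration.

```python
import re

# Matches lines such as:
#   Requester rq06461 got result from query 269: (in 56ms)
# The pattern is inferred from the log excerpt in this issue, not from Plywood's source.
RESULT_LINE = re.compile(
    r"Requester (?P<requester>\S+) got result from query (?P<query>\d+): "
    r"\(in (?P<ms>\d+)ms\)"
)


def parse_durations(log_text):
    """Return (query_id, duration_ms) tuples in the order they appear in the log."""
    return [
        (int(m.group("query")), int(m.group("ms")))
        for m in RESULT_LINE.finditer(log_text)
    ]


def slow_queries(durations, threshold_ms=1000):
    """Keep only queries slower than threshold_ms (an arbitrary cutoff)."""
    return [(query, ms) for query, ms in durations if ms > threshold_ms]


if __name__ == "__main__":
    # Header lines copied from the log excerpts in this issue.
    sample = (
        "Requester rq06461 got result from query 269: (in 56ms)\n"
        "Requester rq06461 got result from query 352: (in 10019ms)\n"
        "Requester rq06461 got result from query 350: (in 15283ms)\n"
    )
    durations = parse_durations(sample)
    print(durations)
    print(slow_queries(durations))
```

Running this over the full container log, alongside the container start time, would show whether the latency jumps at a single point or degrades gradually.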

Druid requests go from <100ms latency to 5,10,15 second latency a few minutes after startup. Possible connection limit issue? #1098