crate / cratedb-prometheus-adapter

CrateDB Prometheus Adapter.
Apache License 2.0

Failed to POST/GET data from CrateDB: Croaks with err="context deadline exceeded" #33

Closed RyanW8 closed 9 months ago

RyanW8 commented 4 years ago

CrateDB-adapter logs in kube are being spammed with the below:

time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:50Z" level=error msg="Failed to POST inserts to Crate." err="context deadline exceeded" source="server.go:332"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"
time="2020-04-16T09:42:07Z" level=error msg="Failed to run select against Crate." err="context deadline exceeded" source="server.go:248"

CrateDB performance is good and is not the issue. After restarting the cratedb-adapter, it works perfectly for 30 seconds or so before the errors return.

amotl commented 3 years ago

Dear Ryan,

thanks for your report here and at [0], and apologies for the very late reply.

"context deadline exceeded" is a very generic error raised by Go. It usually indicates that the connection timed out, or that some other networking issue is present, such as one communication partner trying to negotiate a TLS connection while the other one isn't prepared for that.

Can you share some more details about your version of CrateDB and the load situation?

With kind regards, Andreas.

[0] https://github.com/crate/cratedb-prometheus-adapter/issues/33
[1] https://github.com/prometheus/prometheus/issues/1438
[2] https://github.com/sensu/sensu-go/issues/3792
[3] https://stackoverflow.com/questions/49817558/context-deadline-exceeded-prometheus

amotl commented 3 years ago

Hi again,

"It seems that after restarting the cratedb-adapter it works perfect for 30 seconds or so."

On this, https://github.com/crate/crate/issues/10779 also comes to mind. In this context, may I ask whether you are running CrateDB and Prometheus within a typical cloud environment or, otherwise, how specifically the cratedb-prometheus-adapter is connected to CrateDB, network-wise?

With kind regards, Andreas.

amotl commented 3 years ago

Dear Ryan,

#44 improves the network behaviour slightly by adjusting the TCP timeout and keepalive settings. These default values are now used:

The new -tcp.connect.timeout command line option can be used to adjust the latter parameter.

With kind regards, Andreas.

P.S.: We just released version 0.4.0, which is available in form of release archives [1] and a Docker image [2].

[1] https://cdn.crate.io/downloads/dist/prometheus/ [2] https://ghcr.io/crate/cratedb-prometheus-adapter

amotl commented 1 year ago

Dear Ryan,

did you have a chance to validate whether the behavior has improved on your end with a more recent version? Otherwise, do you mind if I close this issue? Please let me know if you need further assistance, or if this problem persists even with more recent versions.

With kind regards, Andreas.

amotl commented 1 year ago

Dear Ryan,

we are just adding a patch which aims to improve the situation.

With kind regards, Andreas.

amotl commented 9 months ago

Hi again,

the most recent release, version 0.5.0, fixed this flaw. Please let us know whether you still observe problems or see improved behavior. Please also ask to re-open the issue if you believe the problem has not been fixed yet.

With kind regards, Andreas.