influxdata / influxdb

Scalable datastore for metrics, events, and real-time analytics
https://influxdata.com
Apache License 2.0

Concurrent read and write requests lead to "query interrupted" error on 1.4 #9180

Closed lsitzmann closed 6 years ago

lsitzmann commented 6 years ago

System info: InfluxDB 1.4.2 and 1.4.1 on Debian Linux

Steps to reproduce:

Run many concurrent query and write requests against the HTTP endpoint. I was able to reproduce the problem with 3 parallel threads doing SELECT queries and 2 parallel threads doing writes against the same set of series.

Unfortunately I can't easily provide a shell script to reproduce, because I'm running a Windows .NET application as the HTTP client. But I will provide a Wireshark file with the observed behaviour.

Expected behavior:

All queries should get a proper result. If server errors occur, they should be logged in the journal and counted in the _internal http measurement.

Actual behavior:

Some of the queries get an HTTP 200 result, but the response contains the following JSON string: {"results":[{"statement_id":0, "error":"query interrupted"}]}

See the following packet numbers in the attached Wireshark file:

Additional info:

If I run the same (or even a much higher) load against a 1.3.7 InfluxDB on the same machine, all queries return the expected result. I also couldn't see any impact from changing the max-concurrent-queries parameter in the coordinator section of influxdb.conf. While I recorded the Wireshark file, this parameter was set to 0.
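
For reference, the setting in question lives in the coordinator section of influxdb.conf (0 disables the limit):

```toml
[coordinator]
  # maximum number of concurrently running queries; 0 disables the limit
  max-concurrent-queries = 0
```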

You should easily be able to reproduce this by creating load similar to what you can see in the Wireshark file. If you can't, let me know and I'll provide additional information. Influxdb-quer_interrupted.pcapng.gz

stuartcarnie commented 6 years ago

Thanks for the information @lsitzmann, in particular the pcap data. The issue is not write load but a change that enabled HTTP pipelining, or more specifically HTTP connection reuse, per https://github.com/influxdata/influxdb/commit/39591e53758649c3d576a8ccf488005b5f11146c#diff-719dcaf997cf7e555e32df7f85e7cd60

Examining the pcap file and following the HTTP requests on a single TCP connection, we can see that the third GET request (req-3) is sent before the reply to req-2 has been received.

req-1:

GET /query?u=%20&p=%20&db=TSM&q=SELECT%20%20*%20FROM%20PT2H.tsm_data%20WHERE%20f_l0%20=%20'Archiv_1'%20AND%20f_l1%20=%20'dataNode1'%20AND%20f_l2%20=%20''%20AND%20time%20%3E=%201511971482906429300%20AND%20time%20%3C=%201511971482906529300%20ORDER%20BY%20time%20ASC%20LIMIT%200&epoch=ns HTTP/1.1
User-Agent: InfluxData.Net.InfluxDb
Accept: application/json
Host: 192.168.91.128:8086

res-1:

HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: 03040a8e-d51f-11e7-81ba-000000000000
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.4.2
X-Request-Id: 03040a8e-d51f-11e7-81ba-000000000000
Date: Wed, 29 Nov 2017 16:04:42 GMT
Transfer-Encoding: chunked

{"results":[{"statement_id":0,"series":[{"name":"tsm_data","columns":["time","f_changed","f_l0","f_l1","t_old_data","vl_0_int","vl_1_int"],"values":[[1511971482906528300,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528400,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528500,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528600,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528700,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528800,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528900,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529000,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529100,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529200,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529300,1511971482907528300,"Archiv_1","dataNode1","False",999,999]]}]}]}

req-2:

GET /query?u=%20&p=%20&db=TSM&q=SELECT%20%20*%20FROM%20PT2H.tsm_data%20WHERE%20f_l0%20=%20'Archiv_1'%20AND%20f_l1%20=%20'dataNode2'%20AND%20f_l2%20=%20''%20AND%20time%20%3E=%201511971482931429300%20AND%20time%20%3C=%201511971482931529300%20ORDER%20BY%20time%20ASC%20LIMIT%200&epoch=ns HTTP/1.1
User-Agent: InfluxData.Net.InfluxDb
Accept: application/json
Host: 192.168.91.128:8086

req-3: before req-2 reply

GET /query?u=%20&p=%20&db=TSM&q=SELECT%20%20*%20FROM%20PT2H.tsm_data%20WHERE%20f_l0%20=%20'Archiv_1'%20AND%20f_l1%20=%20'dataNode1'%20AND%20f_l2%20=%20''%20AND%20time%20%3E=%201511971482906429300%20AND%20time%20%3C=%201511971482906529300%20ORDER%20BY%20time%20ASC%20LIMIT%200&epoch=ns HTTP/1.1
User-Agent: InfluxData.Net.InfluxDb
Accept: application/json
Host: 192.168.91.128:8086

res-2: canceled by req-3

HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: 030640a4-d51f-11e7-81bf-000000000000
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.4.2
X-Request-Id: 030640a4-d51f-11e7-81bf-000000000000
Date: Wed, 29 Nov 2017 16:04:42 GMT
Transfer-Encoding: chunked

{"results":[{"statement_id":0,"error":"query interrupted"}]}

res-3:

HTTP/1.1 200 OK
Content-Type: application/json
Request-Id: 03065f15-d51f-11e7-81c1-000000000000
X-Influxdb-Build: OSS
X-Influxdb-Version: 1.4.2
X-Request-Id: 03065f15-d51f-11e7-81c1-000000000000
Date: Wed, 29 Nov 2017 16:04:42 GMT
Transfer-Encoding: chunked

{"results":[{"statement_id":0,"series":[{"name":"tsm_data","columns":["time","f_changed","f_l0","f_l1","t_old_data","vl_0_int","vl_1_int"],"values":[[1511971482906528300,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528400,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528500,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528600,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528700,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528800,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906528900,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529000,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529100,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529200,1511971482907528300,"Archiv_1","dataNode1","False",999,999],[1511971482906529300,1511971482907528300,"Archiv_1","dataNode1","False",999,999]]}]}]}

We will do some further investigation; however, you may be able to work around the issue by either forcing your client to close the connection after each response or, if possible, making sure you do not issue a subsequent request until the in-flight request has completed.
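
To illustrate the second option, here is a minimal sketch in Go (the endpoint and query are placeholders; the same principle applies to any HTTP client, including .NET):

```go
package main

import (
	"io"
	"io/ioutil"
	"log"
	"net/http"
	"net/url"
)

func main() {
	q := url.QueryEscape("SELECT * FROM PT2H.tsm_data LIMIT 10") // placeholder query
	for i := 0; i < 3; i++ {
		resp, err := http.Get("http://localhost:8086/query?db=TSM&q=" + q)
		if err != nil {
			log.Fatal(err)
		}
		// Drain and close the body so the keep-alive connection is only
		// reused after the in-flight request has fully completed.
		if _, err := io.Copy(ioutil.Discard, resp.Body); err != nil {
			log.Fatal(err)
		}
		resp.Body.Close()
	}
}
```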

jsternberg commented 6 years ago

@lsitzmann is there any chance you can give us a simplified version of the code you are using? HTTP pipelining in InfluxDB requires that you read the response before sending the next request; otherwise it cancels the previous request.

jsternberg commented 6 years ago

While looking through the documentation for .NET's HttpClient, I found this: https://msdn.microsoft.com/en-us/library/system.net.http.httpcompletionoption(v=vs.118).aspx

Be sure you use ResponseContentRead and that you do not use ResponseHeadersRead. While HTTP may state that pipelining can work with the latter, the Go HTTP server doesn't support responding to multiple in-flight requests concurrently on the same connection, and I don't recommend doing that anyway since, even if it did work, it would result in the second request being blocked by the first.

lsitzmann commented 6 years ago

Thanks a lot to @stuartcarnie and @jsternberg! With your help I was able to locate the origin of the problem in my code. We didn't realize that we indirectly shared one instance of System.Net.Http.HttpClient across multiple threads, which is not allowed according to the docs. I'm sure we can fix it now in our code.

Anyway, I'd still appreciate it if InfluxDB would increment the client-error count in the _internal monitoring and/or return an HTTP error, so we have a chance to notice such things in our server-side monitoring.

lsitzmann commented 6 years ago

After reading and learning a bit more about HTTP/1.1 and pipelining, I'm not so sure anymore whether it is good practice to simply cancel the first request when a second request arrives before the first one is finished. As far as I understand the concept of HTTP pipelining, it is actually about sending multiple requests over the same connection without waiting for the responses, as shown here. So shouldn't InfluxDB be able to handle this?

stuartcarnie commented 6 years ago

@lsitzmann your understanding of HTTP pipelining is correct. A client should be able to issue multiple idempotent (GET) requests without waiting for a response. A compliant server should send replies to those requests in the same order.

Unfortunately, Go's HTTP/1.1 implementation of pipelining triggers the CloseNotifier channel for the previous request, per the documentation. The CloseNotify channel normally indicates that the HTTP connection was closed, and we use this to cancel the query. In the case of pipelining, the channel is also triggered when a subsequent request is received, which is the behavior you have witnessed.
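
A stripped-down sketch of the pattern (illustrative only, not our actual handler code) shows how the two events become indistinguishable to the server:

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"time"
)

func queryHandler(w http.ResponseWriter, r *http.Request) {
	ctx, cancel := context.WithCancel(r.Context())
	defer cancel()

	if cn, ok := w.(http.CloseNotifier); ok {
		go func() {
			select {
			case <-cn.CloseNotify():
				// Fires when the client disconnects, but with HTTP/1.1
				// pipelining it can also fire when the next request
				// arrives on the same connection.
				cancel()
			case <-ctx.Done():
			}
		}()
	}

	select {
	case <-time.After(2 * time.Second): // stand-in for executing the query
		fmt.Fprintln(w, `{"results":[{"statement_id":0,"series":[]}]}`)
	case <-ctx.Done():
		fmt.Fprintln(w, `{"results":[{"statement_id":0,"error":"query interrupted"}]}`)
	}
}

func main() {
	http.HandleFunc("/query", queryHandler)
	http.ListenAndServe(":8086", nil)
}
```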

We are currently evaluating an acceptable solution.

jsternberg commented 6 years ago

In addition to what @stuartcarnie said, I would also argue that reusing the same socket for subsequent requests is likely a bad idea anyway since all of the subsequent requests would have to wait for the first request to finish. But, that's a different argument.

Can you give us a sample of the code you are using that is causing the problem?

stuartcarnie commented 6 years ago

@jsternberg agree – Go's implementation of pipelining for HTTP/1.1 provides no benefit, as all requests are serialized. However, HTTP/2.0 is a different story.

As for replicating the issue, it is pretty easy with nc, per this asciinema. I issue the first HTTP GET request and then immediately issue a second, which you will see returns the query interrupted error.
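
The same thing in a few lines of Go over a raw TCP connection (address and query are placeholders):

```go
package main

import (
	"io"
	"log"
	"net"
	"os"
	"time"
)

func main() {
	conn, err := net.Dial("tcp", "localhost:8086")
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	req := "GET /query?db=TSM&q=SELECT+*+FROM+PT2H.tsm_data+LIMIT+10 HTTP/1.1\r\n" +
		"Host: localhost:8086\r\n\r\n"

	// Write both requests back to back, without reading the first reply;
	// the server cancels the first query, which returns "query interrupted".
	io.WriteString(conn, req)
	io.WriteString(conn, req)

	conn.SetReadDeadline(time.Now().Add(5 * time.Second)) // don't block on the keep-alive socket
	io.Copy(os.Stdout, conn) // dump both responses
}
```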

favadi commented 6 years ago

I also have the same problem with InfluxDB 1.4. It doesn't always happen, but it can be reliably reproduced by running the same set of queries dozens of times. I did a git bisect on the InfluxDB source code and the offending commit is: https://github.com/influxdata/influxdb/commit/39591e53758649c3d576a8ccf488005b5f11146c.

Is there something I can do on the client side to mitigate this problem?

jsternberg commented 6 years ago

@favadi which language are you querying from and can you post a code snippet?

favadi commented 6 years ago

which language are you querying from

Golang, I'm using the v2 client in this repository.

and can you post a code snippet?

I will try to create a simple example to reproduce the problem. Basically, we are doing downsampling: we SELECT the SUM of all data points, write it to a temporary database, delete all data points, and write the result from the temporary database back.
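
Roughly, the flow looks like this with the v2 client (the statements and names below are simplified placeholders, not our real code):

```go
package main

import (
	"log"

	client "github.com/influxdata/influxdb/client/v2"
)

func main() {
	c, err := client.NewHTTPClient(client.HTTPConfig{Addr: "http://localhost:8086"})
	if err != nil {
		log.Fatal(err)
	}
	defer c.Close()

	// Simplified placeholder statements for the downsampling flow.
	steps := []string{
		`SELECT SUM("value") INTO "tmp"."autogen"."m" FROM "m" GROUP BY time(1h), *`,
		`DELETE FROM "m"`,
		`SELECT * INTO "mydb"."autogen"."m" FROM "tmp"."autogen"."m" GROUP BY *`,
	}
	for _, q := range steps {
		resp, err := c.Query(client.NewQuery(q, "mydb", "ns"))
		if err != nil {
			log.Fatal(err)
		}
		if resp.Error() != nil {
			// With 1.4.x this is where "query interrupted" showed up.
			log.Fatal(resp.Error())
		}
	}
}
```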

jsternberg commented 6 years ago

I am going to be investigating how to get the clients to work properly. If I either can't figure it out or find the conclusion unsatisfying, we're going to back this out so that it works properly in 1.4.3.

The main reason to keep this is that, without connection reuse, a socket is opened and closed for every request and you end up with a pile of sockets stuck in TIME_WAIT.

jsternberg commented 6 years ago

@favadi do you have a code example that I could look at?

jsternberg commented 6 years ago

@lsitzmann I could also use a code sample from you so I can see if there are any problems with the code itself.

favadi commented 6 years ago

@jsternberg it doesn't happen anymore with the v1.5.0rc0 release.

jsternberg commented 6 years ago

@favadi thanks for the feedback.

I am going to leave this issue open for now, but we have no plans to remove the pipelining functionality since it helps specific use cases. If you are running into this issue and wish to restore the old behavior, either modify your code so the client sends a Connection: close header, which forces the server to close the socket after it responds to the request, or modify your client to only reuse the connection after reading the full response. The benefits of this outweigh the negatives, especially with regard to server scalability and TCP TIME_WAIT. A more detailed article on the subject is here: http://www.serverframework.com/asynchronousevents/2011/01/time-wait-and-its-design-implications-for-protocols-and-scalable-servers.html
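
For example, with Go's net/http either approach looks roughly like this (a sketch; /ping stands in for any request):

```go
package main

import (
	"io"
	"io/ioutil"
	"log"
	"net/http"
)

func main() {
	// Option 1: disable keep-alives for the whole client; Go then sends
	// "Connection: close" and uses a fresh socket per request.
	c := &http.Client{Transport: &http.Transport{DisableKeepAlives: true}}

	// Option 2: mark an individual request; Request.Close adds the
	// "Connection: close" header for that request only.
	req, err := http.NewRequest("GET", "http://localhost:8086/ping", nil)
	if err != nil {
		log.Fatal(err)
	}
	req.Close = true

	resp, err := c.Do(req)
	if err != nil {
		log.Fatal(err)
	}
	// In all cases, read the full response before issuing the next request.
	io.Copy(ioutil.Discard, resp.Body)
	resp.Body.Close()
}
```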

jsternberg commented 6 years ago

I'm going to close this since I don't think there's anything to do on this issue at the current moment.