vespa-engine / vespa

AI + Data, online. https://vespa.ai

Slow feed rate using the vespa-feed-client towards the tail end of the feed #19487

Closed · karowan closed this issue 2 years ago

karowan commented 2 years ago

Describe the bug
Throughput was high at the beginning of the feed session, but declined sharply toward the tail of the feed. For reference, here is what the stats looked like up until ~97% of the documents had been fed, after which the feed slowed to a crawl:

{
  "feeder.seconds" : 810.032,
  "feeder.ok.count" : 4216884,
  "feeder.ok.rate" : 5205.822,
  "feeder.error.count" : 0,
  "feeder.inflight.count" : 301,
  "http.request.count" : 4228400,
  "http.request.bytes" : 245548039,
  "http.exception.count" : 11215,
  "http.response.count" : 4216884,
  "http.response.bytes" : 498139966,
  "http.response.error.count" : 0,
  "http.response.latency.millis.min" : 34.000,
  "http.response.latency.millis.avg" : 67.000,
  "http.response.latency.millis.max" : 1332.000,
  "http.response.code.counts" : {
    "200" : 4216884
  }
}

Here are a set of logs towards the end of the feed:

{
  "feeder.seconds" : 1890.056,
  "feeder.ok.count" : 4323047,
  "feeder.ok.rate" : 2287.258,
  "feeder.error.count" : 0,
  "feeder.inflight.count" : 2,
  "http.request.count" : 4334415,
  "http.request.bytes" : 251694900,
  "http.exception.count" : 11366,
  "http.response.count" : 4323047,
  "http.response.bytes" : 510243102,
  "http.response.error.count" : 0,
  "http.response.latency.millis.min" : 34.000,
  "http.response.latency.millis.avg" : 66.000,
  "http.response.latency.millis.max" : 5231.000,
  "http.response.code.counts" : {
    "200" : 4323047
  }
}
{
  "feeder.seconds" : 1900.057,
  "feeder.ok.count" : 4323436,
  "feeder.ok.rate" : 2275.425,
  "feeder.error.count" : 0,
  "feeder.inflight.count" : 2,
  "http.request.count" : 4334804,
  "http.request.bytes" : 251717558,
  "http.exception.count" : 11366,
  "http.response.count" : 4323436,
  "http.response.bytes" : 510287448,
  "http.response.error.count" : 0,
  "http.response.latency.millis.min" : 34.000,
  "http.response.latency.millis.avg" : 66.000,
  "http.response.latency.millis.max" : 5231.000,
  "http.response.code.counts" : {
    "200" : 4323436
  }
}

From what I can tell, after a certain point the number of in-flight requests dropped abruptly, from the hundreds in one log entry to only 18 in the next, and then declined steadily to 1 until the feed completed.
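For context, the cumulative feeder.ok.rate still reads ~2275/s in the tail logs, which masks how slow the feed has actually become. The rate over just the 10-second window between the two snapshots quoted above can be derived from the counters they report (a quick illustration, using the numbers copied from the logs):

```java
// Illustration only: derive the instantaneous feed rate from the two
// consecutive stats snapshots quoted above (values copied from the logs).
public class TailRate {
    public static void main(String[] args) {
        double seconds1 = 1890.056, seconds2 = 1900.057; // feeder.seconds
        long ok1 = 4323047, ok2 = 4323436;               // feeder.ok.count

        // Cumulative rate (ok.count / seconds) is still ~2275/s, but the
        // rate over just this 10-second window is two orders lower:
        double windowRate = (ok2 - ok1) / (seconds2 - seconds1);
        System.out.printf("tail window rate: %.1f docs/s%n", windowRate); // ~38.9
    }
}
```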

To Reproduce
Unfortunately I cannot give exact steps to reproduce, as I noticed this issue in a production environment and do not have a sample app to share.

This was done using the zip from https://search.maven.org/artifact/com.yahoo.vespa/vespa-feed-client-cli/7.479.3/jar with the following parameters:

java -jar vespa-feed-client-cli-jar-with-dependencies.jar  --verbose --certificate <CERT> --private-key <PRIVATE_KEY> --endpoint https://<ENDPOINT>:443/ --connections 4 --file <FILE>

The file consisted of 4332651 lines with the format as follows:

{"update": "id:content:doc_type::1234", "fields": {"name": {"assign": "this is the name"}}}

Expected behavior
I would expect the feed client to sustain a consistent stream of data until the feed is complete.

Environment (please complete the following information): Vespa Cloud

Vespa version: 7.478.19

jonmv commented 2 years ago

This behaviour was due to multiple operations (thousands) being performed against common documents. These operations are serialised by the client, i.e., each operation against a document with a certain document id is put on hold until the previous operation against that same document is complete.
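The serialisation behaviour described above can be sketched as follows. This is a hypothetical illustration of the idea, not the actual vespa-feed-client implementation: operations are chained per document id, so a later operation against the same id only starts once the previous one has completed.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical sketch of the behaviour jonmv describes, NOT the actual
// vespa-feed-client code: operations against the same document id are
// chained, so each waits for the previous one against that id to finish.
public class PerDocumentSerializer {
    private final Map<String, CompletableFuture<Void>> tails = new ConcurrentHashMap<>();

    /** Enqueue an async operation, serialised against others with the same document id. */
    public CompletableFuture<Void> enqueue(String documentId,
                                           Supplier<CompletableFuture<Void>> operation) {
        // Atomically append this operation to the tail of the chain for documentId.
        return tails.compute(documentId, (id, previous) ->
                previous == null
                        ? operation.get()
                        : previous.thenCompose(ignored -> operation.get()));
    }
}
```

With thousands of operations targeting the same few document ids, each id's chain allows only one operation in flight at a time, which matches the observed collapse of feeder.inflight.count toward 1 at the tail of the feed.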