elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.6k stars 8.21k forks source link

[Fleet] Agent checkin is relatively long #89412

Open mtojek opened 3 years ago

mtojek commented 3 years ago

Hi,

I've been investigating some issues around the Fleet and policies and discovered that requests to Fleet API /checkin take relatively long. Here are some logs:

kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:20:10+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"645","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":1141,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 1141ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:20:23+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"2848","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":66040,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 66040ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:21:34+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"1484","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":5645,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 5645ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:21:45+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"2333","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":20665,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 20665ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:22:10+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"1484","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":5775,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 5775ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:22:20+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"2333","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":20846,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 20846ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:22:45+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"1481","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":6048,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 6048ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:22:55+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"2328","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":40824,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 40824ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:23:40+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"13888","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":281693,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 281693ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:28:23+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"782773","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":280970,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 280970ms - 9.0B"}
kibana_1                     | {"type":"response","@timestamp":"2021-01-27T11:33:06+00:00","tags":[],"pid":8,"method":"post","statusCode":200,"req":{"url":"/api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin","method":"post","headers":{"host":"kibana:5601","user-agent":"Elastic Agent v7.11.0","content-length":"779884","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.6","userAgent":"Elastic Agent v7.11.0"},"res":{"statusCode":200,"responseTime":41035,"contentLength":9},"message":"POST /api/fleet/agents/9cfef0d0-6091-11eb-968e-51292a4c9ecb/checkin 200 41035ms - 9.0B"

6s, 20s, 28s, 40s,....

For reference: https://beats-ci.elastic.co/job/Ingest-manager/job/elastic-package/job/PR-227/29/artifact/build/elastic-stack-dump/check/logs/kibana.log

I wonder if it's not something that should be improved.

elasticmachine commented 3 years ago

Pinging @elastic/ingest-management (Team:Ingest Management)

elasticmachine commented 3 years ago

Pinging @elastic/fleet (Feature:Fleet)

blakerouse commented 3 years ago

@mtojek Checkin is a long polling API. Agent sends a request and then the HTTP connection blocks until the Agent needs to perform an action or it times out.

I think what you are seeing here is the long pooling. Let me know if I am miss understanding what you are reporting.

mtojek commented 3 years ago

The correlation between timestamps and time duration for each checkin makes sense. Thank you for the explanation.

Reopening for documenting purposes: do you think it's something that should be documented?