elastic / beats

:tropical_fish: Beats - Lightweight shippers for Elasticsearch & Logstash
https://www.elastic.co/products/beats

Enrolling an agent immediately terminates fleet server #27321

Closed ppf2 closed 3 years ago

ppf2 commented 3 years ago

Install fleet server:

sudo ./elastic-agent install --insecure --url https://localhost:8220 -e -f --fleet-server-es=https://localhost:9200 --fleet-server-es-ca=/Users/<user>/Elastic/ElasticStack_7_0/7.14.0/elasticsearch-7.14.0/config/ca/ca.crt --fleet-server-service-token=AAEAAWVsYXN0aWMvZmxlZXQtc2VydmVyL3Rva2VuLTE2Mjg3MDg0ODY0MTE6ajRaZEFlRnlSaS0weFVla3lhRzV1Zw --fleet-server-cert /Users/<user>/Elastic/ElasticStack_7_0/7.14.0/elasticsearch-7.14.0/config/node1/node1.crt --fleet-server-cert-key /Users/<user>/Elastic/ElasticStack_7_0/7.14.0/elasticsearch-7.14.0/config/node1/node1.key

Fleet server installs and starts up fine:

2021-08-11T12:15:53.510-0700    INFO    cmd/enroll_cmd.go:701   Fleet Server - Starting
2021-08-11T12:15:54.512-0700    INFO    cmd/enroll_cmd.go:682   Fleet Server - Running on default policy with Fleet Server integration; missing config fleet.agent.id (expected during bootstrap process)
2021-08-11T12:15:54.512-0700    WARN    [tls]   tlscommon/tls_config.go:98  SSL/TLS verifications disabled.
2021-08-11T12:15:54.702-0700    INFO    cmd/enroll_cmd.go:414   Starting enrollment to URL: https://localhost:8220/
2021-08-11T12:15:54.805-0700    WARN    [tls]   tlscommon/tls_config.go:98  SSL/TLS verifications disabled.
2021-08-11T12:15:55.776-0700    INFO    cmd/enroll_cmd.go:252   Successfully triggered restart on running Elastic Agent.
Successfully enrolled the Elastic Agent.
Elastic Agent has been successfully installed.
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","ctx":"policy agent monitor","policyId":"dd4cf0f0-fad4-11eb-af6b-5fc91fbedd64","rev":3,"coord":1,"@timestamp":"2021-08-11T19:16:06.987Z","message":"new policy"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","ctx":"policy agent monitor","policyId":"dd4d3f10-fad4-11eb-af6b-5fc91fbedd64","orev":0,"nrev":3,"ocoord":0,"ncoord":1,"@timestamp":"2021-08-11T19:16:06.987Z","message":"policy revised"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","ctx":"policy agent monitor","policyId":"dd4d3f10-fad4-11eb-af6b-5fc91fbedd64","revisionIdx":3,"coordinatorIdx":1,"@timestamp":"2021-08-11T19:16:06.987Z","message":"no pending subscriptions to revised policy"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","status":"HEALTHY","@timestamp":"2021-08-11T19:16:08.542Z","message":"Running on default policy with Fleet Server integration"}

Enroll Elastic Agent (where elastic-agent.yml is the policy copied from the Kibana instructions):

sudo elastic-agent -c elastic-agent.yml enroll --insecure -e -f --url https://localhost:8220 \
  --enrollment-token=eUtLTU5uc0JIVGFKVDJBZmFTX2E6SlhUNkJqeThTaC1nYXlVM0pLeDFqdw== \
  --certificate-authorities /Users/<user>/Elastic/ElasticStack_7_0/7.14.0/elasticsearch-7.14.0/config/ca/ca.crt

As soon as the above is run, the Elastic Agent enrolls successfully:

2021-08-11T12:17:44.828-0700    WARN    [tls]   tlscommon/tls_config.go:98  SSL/TLS verifications disabled.
2021-08-11T12:17:45.672-0700    INFO    cmd/enroll_cmd.go:414   Starting enrollment to URL: https://localhost:8220/
2021-08-11T12:17:45.779-0700    WARN    [tls]   tlscommon/tls_config.go:98  SSL/TLS verifications disabled.
2021-08-11T12:17:46.666-0700    INFO    cmd/enroll_cmd.go:252   Successfully triggered restart on running Elastic Agent.
Successfully enrolled the Elastic Agent.

But Fleet Server exits:

{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","mod":"enroll","agentId":"22c7da3c-6074-4c35-8f08-25629fc1d751","policyId":"dd4cf0f0-fad4-11eb-af6b-5fc91fbedd64","apiKeyId":"JqKmNnsBHTaJT2AfMDEf","http.request.id":"","http.response.body.bytes":1357,"event.duration":721749593,"@timestamp":"2021-08-11T19:17:46.509Z","message":"success enroll"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","ctx":"policy agent monitor","@timestamp":"2021-08-11T19:17:47.282Z","message":"Exit policy monitor local"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","status":"STOPPING","@timestamp":"2021-08-11T19:17:47.282Z","message":"Stopping"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-policies","ctx":"index monitor","@timestamp":"2021-08-11T19:17:47.282Z","message":"context closed waiting for global checkpoints advance"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-policies","ctx":"index monitor","@timestamp":"2021-08-11T19:17:47.282Z","message":"exited"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-actions","ctx":"index monitor","@timestamp":"2021-08-11T19:17:47.282Z","message":"context closed waiting for global checkpoints advance"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-actions","ctx":"index monitor","@timestamp":"2021-08-11T19:17:47.282Z","message":"exited"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T19:17:47.861Z","message":"Fleet Server exited"}
{"service.name":"fleet-server","service.name":"fleet-server","log.logger":"fleet-metrics.api","message":"Stats endpoint (/Library/Elastic/Agent/data/tmp/default/fleet-server/fleet-server.sock) finished: accept unix /Library/Elastic/Agent/data/tmp/default/fleet-server/fleet-server.sock: use of closed network connection","log.level":"info","@timestamp":"2021-08-11T19:17:47.861Z"}  

elasticmachine commented 3 years ago

Pinging @elastic/fleet (Team:Fleet)

ppf2 commented 3 years ago

With debug logging enabled for Fleet Server, this is the output when it terminates after receiving the agent enrollment request:

{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","mod":"enroll","agentId":"8129fd31-d998-49c0-b39a-b5934886b6ac","policyId":"dd4cf0f0-fad4-11eb-af6b-5fc91fbedd64","apiKeyId":"6MmDN3sBdnSvIpDNfsz_","http.request.id":"","http.response.body.bytes":1358,"event.duration":763551196,"@timestamp":"2021-08-11T23:19:30.172Z","message":"success enroll"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","url.full":"/api/fleet/agents/enroll?","http.version":"1.1","http.request.method":"POST","http.response.status_code":200,"http.request.body.bytes":1021,"http.response.body.bytes":1358,"client.address":"[::1]:62228","client.ip":"::1","client.port":62228,"tls.established":true,"event.duration":763617257,"@timestamp":"2021-08-11T23:19:30.172Z","message":"HTTP handler"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","error.message":"context canceled","id":"03a119cd-be98-4069-9f42-1d6bb2d87350","http.response.status_code":503,"http.request.id":"","event.duration":31189251332,"@timestamp":"2021-08-11T23:19:30.312Z","message":"fail checkin"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","url.full":"/api/fleet/agents/03a119cd-be98-4069-9f42-1d6bb2d87350/checkin?","http.version":"1.1","http.request.method":"POST","http.response.status_code":503,"http.request.body.bytes":3559,"http.response.body.bytes":78,"client.address":"127.0.0.1:62215","client.ip":"127.0.0.1","client.port":62215,"tls.established":true,"event.duration":31189375755,"@timestamp":"2021-08-11T23:19:30.312Z","message":"HTTP handler"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.312Z","message":"Revision dispatcher exited"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","ctx":"policy agent monitor","@timestamp":"2021-08-11T23:19:30.312Z","message":"Exit policy monitor local"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.312Z","message":"Policy monitor exited"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","status":"STOPPING","@timestamp":"2021-08-11T23:19:30.312Z","message":"Stopping"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-actions","ctx":"index monitor","@timestamp":"2021-08-11T23:19:30.312Z","message":"context closed waiting for global checkpoints advance"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-actions","ctx":"index monitor","@timestamp":"2021-08-11T23:19:30.312Z","message":"exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.312Z","message":"Revision monitor exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.312Z","message":"Bulk checkin exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.312Z","message":"force server close on ctx.Done()"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-policies","ctx":"index monitor","@timestamp":"2021-08-11T23:19:30.313Z","message":"context closed waiting for global checkpoints advance"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","index":".fleet-policies","ctx":"index monitor","@timestamp":"2021-08-11T23:19:30.313Z","message":"exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.313Z","message":"Http server exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.313Z","message":"Policy index monitor exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.891Z","message":"Coordinator policy monitor exited"}
{"log.level":"debug","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.892Z","message":"Bulker exited"}
{"log.level":"info","service.name":"fleet-server","service.name":"fleet-server","@timestamp":"2021-08-11T23:19:30.892Z","message":"Fleet Server exited"}
{"service.name":"fleet-server","service.name":"fleet-server","log.level":"info","log.logger":"fleet-metrics.api","message":"Stats endpoint (/Library/Elastic/Agent/data/tmp/default/fleet-server/fleet-server.sock) finished: accept unix /Library/Elastic/Agent/data/tmp/default/fleet-server/fleet-server.sock: use of closed network connection","@timestamp":"2021-08-11T23:19:30.892Z"}

elasticmachine commented 3 years ago

Pinging @elastic/agent (Team:Agent)

ruflin commented 3 years ago

Can you share a bit more about your setup? I assume all stack components are on 7.14. Are any of the components running in Elastic Cloud or ESS? What have you set as the Fleet Server URL in the Fleet UI?

nchaulet commented 3 years ago

@ppf2 From what I saw, you are running everything locally, correct? I do not think it's possible to run two agents on the same host (Agent and Fleet Server, for example).

scunningham commented 3 years ago

Can you add the agent logs for the agent that is running the fleet server around the time the fleet server is terminated? It appears from the above fleet-server logs that it is being told to shut down.
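To make the "being told to shut down" sequence easier to spot, the fleet-server output can be reduced to the lines that carry a status or an error message. This is only a minimal sketch, assuming the logs were saved one JSON object per line; the function name `shutdown_trail` and the file name in the usage comment are hypothetical.

```shell
# Keep only log lines that contain a "status" or "error.message" field.
# The leading quote in the pattern keeps fields such as
# "http.response.status_code" from matching.
# Usage (file name is hypothetical): shutdown_trail < fleet-server.ndjson
shutdown_trail() {
  grep -E '"(status|error\.message)":"'
}
```

Run against the debug output above, this should leave the "fail checkin" line with `context canceled` and the `STOPPING` status line, which are the ones that point to an external shutdown request.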

ppf2 commented 3 years ago

> @ppf2 From what I saw, you are running everything locally, correct? I do not think it's possible to run two agents on the same host (Agent and Fleet Server, for example).

Ah, that must be it then. I am trying to start Fleet Server locally on my laptop and then enroll an agent (also tried doing it from a separate unpacked tar.gz folder) to this Fleet Server.

Can we:

thx!

blakerouse commented 3 years ago

@ppf2 If you remove the -f (force) flag from the enroll command, it will prompt you before overwriting the other Elastic Agent. Because you are using -f, you are forcing the overwrite.
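As a rough illustration of avoiding the forced overwrite, a small wrapper could refuse to run an enroll command when an agent install already exists on the host. This is a sketch, not part of Elastic Agent: `agent_installed` and `safe_enroll` are hypothetical helpers, and the default directory `/Library/Elastic/Agent` is an assumption taken from the socket path in the macOS logs earlier in this issue.

```shell
# Hypothetical guard around enrollment.
# ASSUMPTION: /Library/Elastic/Agent is the install directory (taken from the
# socket path in the logs in this issue); pass a different path if needed.
agent_installed() {
  [ -d "${1:-/Library/Elastic/Agent}" ]
}

# safe_enroll <install-dir> <command...>
# Runs the command only when no existing install directory is found.
safe_enroll() {
  install_dir="$1"
  shift
  if agent_installed "$install_dir"; then
    echo "Elastic Agent already installed at $install_dir; not enrolling" >&2
    return 1
  fi
  "$@"
}

# Example call, using the enroll command from this issue without -f
# (placeholders in angle brackets are not real values):
#   safe_enroll /Library/Elastic/Agent \
#     sudo elastic-agent -c elastic-agent.yml enroll --insecure -e \
#       --url https://localhost:8220 \
#       --enrollment-token=<token> \
#       --certificate-authorities <path-to-ca.crt>
```

Dropping -f alone is enough to get the confirmation prompt; the wrapper only adds an earlier, non-interactive check for scripted setups.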

I filed a bug about documentation for this at https://github.com/elastic/observability-docs/issues/982