elastic / kibana

Your window into the Elastic Stack
https://www.elastic.co/products/kibana
Other
19.62k stars 8.22k forks source link

[Fleet] setup failed: An internal server error occurred #95896

Open mtojek opened 3 years ago

mtojek commented 3 years ago

Blocker: https://github.com/elastic/elastic-package/issues/278

I spotted this while integrating with Fleet Server. Fun fact - all our integration tests passed! I mean, Kibana/Fleet/Agent behave correctly in terms of HTTP responses, but logs are full or errors. I suppose that someone would like to check it.

Fleet Server:

Attaching to elastic-package-stack_fleet-server_1
fleet-server_1               | Performing setup of Fleet in Kibana
fleet-server_1               | Kibana Fleet setup failed: http POST request to http://kibana:5601/api/fleet/setup fails: <nil>. Response: {"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."}
fleet-server_1               | The Elastic Agent is currently in BETA and should not be used in production
fleet-server_1               | 
fleet-server_1               | 2021-03-31T08:33:22.440Z    INFO    cmd/enroll_cmd.go:421   Spawning Elastic Agent daemon as a subprocess to complete bootstrap process.
fleet-server_1               | 2021-03-31T08:33:24.444Z    INFO    cmd/enroll_cmd.go:517   Fleet Server - Starting
fleet-server_1               | 2021-03-31T08:33:27.259Z    INFO    cmd/enroll_cmd.go:190   Elastic Agent has been enrolled; start Elastic Agent
fleet-server_1               | Successfully enrolled the Elastic Agent.

Kibana:

kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:53+00:00","tags":["info","plugins","watcher"],"pid":919,"message":"Enabling Watcher plugin."}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:53+00:00","tags":["info","plugins","monitoring","monitoring","kibana-monitoring"],"pid":919,"message":"Starting monitoring stats collection"}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:53+00:00","tags":["error","plugins","data","data","indexPatterns"],"pid":919,"message":"ResponseError: security_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:337:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:264:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 401 },\n    statusCode: 401,\n    headers: {\n      'x-opaque-id': '6e3b6c46-717c-43e8-af75-f6ffe7d3327b',\n      'www-authenticate': 'Basic realm=\"security\" charset=\"UTF-8\", ApiKey',\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '463'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  },\n  isBoom: true,\n  isServer: true,\n  data: null,\n  output: {\n    statusCode: 500,\n    payload: {\n      statusCode: 500,\n      error: 'Internal Server Error',\n      message: 'An internal server error occurred'\n    },\n    headers: {}\n  },\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/generalError'\n}"}
kibana_1                     | {"type":"response","@timestamp":"2021-03-31T08:32:53+00:00","tags":[],"pid":919,"method":"get","statusCode":200,"req":{"url":"/login","method":"get","headers":{"host":"127.0.0.1:5601","user-agent":"curl/7.61.1","accept":"*/*"},"remoteAddress":"127.0.0.1","userAgent":"curl/7.61.1"},"res":{"statusCode":200,"responseTime":141,"contentLength":129396},"message":"GET /login 200 141ms - 126.4KB"}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:54+00:00","tags":["warning","plugins","reporting"],"pid":919,"message":"Enabling the Chromium sandbox provides an additional layer of protection."}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:54+00:00","tags":["error","plugins","data","data","indexPatterns"],"pid":919,"message":"ResponseError: security_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:337:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:264:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 401 },\n    statusCode: 401,\n    headers: {\n      'x-opaque-id': 'f9de0156-5c77-4ed6-806f-99647cf4b775',\n      'www-authenticate': 'Basic realm=\"security\" charset=\"UTF-8\", ApiKey',\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '463'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  },\n  isBoom: true,\n  isServer: true,\n  data: null,\n  output: {\n    statusCode: 500,\n    payload: {\n      statusCode: 500,\n      error: 'Internal Server Error',\n      message: 'An internal server error occurred'\n    },\n    headers: {}\n  },\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/generalError'\n}"}
kibana_1                     | {"type":"response","@timestamp":"2021-03-31T08:32:54+00:00","tags":[],"pid":919,"method":"get","statusCode":200,"req":{"url":"/login","method":"get","headers":{"host":"127.0.0.1:5601","user-agent":"curl/7.61.1","accept":"*/*"},"remoteAddress":"127.0.0.1","userAgent":"curl/7.61.1"},"res":{"statusCode":200,"responseTime":39,"contentLength":129396},"message":"GET /login 200 39ms - 126.4KB"}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:54+00:00","tags":["error","http","server","Kibana"],"pid":919,"message":"ResponseError: unavailable_shards_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:337:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:264:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 503 },\n    statusCode: 503,\n    headers: {\n      'x-opaque-id': '5cd5237b-aaf1-461f-89c5-05a2d8daf34b',\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '279'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 3,\n      aborted: false\n    }\n  }\n}"}
kibana_1                     | {"type":"error","@timestamp":"2021-03-31T08:32:54+00:00","tags":[],"pid":919,"level":"error","error":{"message":"Internal Server Error","name":"Error","stack":"Error: Internal Server Error\n    at HapiResponseAdapter.toInternalError (/usr/share/kibana/src/core/server/http/router/response_adapter.js:61:19)\n    at interceptRequest (/usr/share/kibana/src/core/server/http/lifecycle/on_post_auth.js:68:34)\n    at runMicrotasks (<anonymous>)\n    at processTicksAndRejections (internal/process/task_queues.js:93:5)\n    at exports.Manager.execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/toolkit.js:60:28)\n    at Request._invoke (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:397:30)\n    at Request._lifecycle (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:370:32)\n    at Request._execute (/usr/share/kibana/node_modules/@hapi/hapi/lib/request.js:279:9)"},"url":"http://kibana:5601/api/fleet/setup","message":"Internal Server Error"}
kibana_1                     | {"type":"response","@timestamp":"2021-03-31T08:32:54+00:00","tags":["access:fleet-read"],"pid":919,"method":"post","statusCode":500,"req":{"url":"/api/fleet/setup","method":"post","headers":{"host":"kibana:5601","user-agent":"Go-http-client/1.1","content-length":"0","accept":"application/json","content-type":"application/json","kbn-xsrf":"1","accept-encoding":"gzip"},"remoteAddress":"172.19.0.5","userAgent":"Go-http-client/1.1"},"res":{"statusCode":500,"responseTime":168,"contentLength":97},"message":"POST /api/fleet/setup 500 168ms - 97.0B"}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:55+00:00","tags":["error","plugins","data","data","indexPatterns"],"pid":919,"message":"ResponseError: security_exception\n    at onBody (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:337:23)\n    at IncomingMessage.onEnd (/usr/share/kibana/node_modules/@elastic/elasticsearch/lib/Transport.js:264:11)\n    at IncomingMessage.emit (events.js:327:22)\n    at endReadableNT (internal/streams/readable.js:1327:12)\n    at processTicksAndRejections (internal/process/task_queues.js:80:21) {\n  meta: {\n    body: { error: [Object], status: 401 },\n    statusCode: 401,\n    headers: {\n      'x-opaque-id': 'e7a47c03-7063-4b50-8efb-2752a83e9218',\n      'www-authenticate': 'Basic realm=\"security\" charset=\"UTF-8\", ApiKey',\n      'content-type': 'application/json; charset=UTF-8',\n      'content-length': '463'\n    },\n    meta: {\n      context: null,\n      request: [Object],\n      name: 'elasticsearch-js',\n      connection: [Object],\n      attempts: 0,\n      aborted: false\n    }\n  },\n  isBoom: true,\n  isServer: true,\n  data: null,\n  output: {\n    statusCode: 500,\n    payload: {\n      statusCode: 500,\n      error: 'Internal Server Error',\n      message: 'An internal server error occurred'\n    },\n    headers: {}\n  },\n  [Symbol(SavedObjectsClientErrorCode)]: 'SavedObjectsClient/generalError'\n}"}
kibana_1                     | {"type":"response","@timestamp":"2021-03-31T08:32:55+00:00","tags":[],"pid":919,"method":"get","statusCode":200,"req":{"url":"/login","method":"get","headers":{"host":"127.0.0.1:5601","user-agent":"curl/7.61.1","accept":"*/*"},"remoteAddress":"127.0.0.1","userAgent":"curl/7.61.1"},"res":{"statusCode":200,"responseTime":70,"contentLength":129396},"message":"GET /login 200 70ms - 126.4KB"}
kibana_1                     | {"type":"log","@timestamp":"2021-03-31T08:32:56+00:00","tags":["info","plugins","securitySolution"],"pid":919,"message":"Skipping Artifacts migration to fleet. [fleetServerEnabled] flag is off"}

(source: https://beats-ci.elastic.co/job/Ingest-manager/job/elastic-package/job/PR-279/33/artifact/build/elastic-stack-dump/check/logs/kibana.log)

Elasticsearch:

elasticsearch_1              | "stacktrace": ["org.elasticsearch.action.UnavailableShardsException: at least one primary shard for the index [.security-7] is unavailable",
elasticsearch_1              | "at org.elasticsearch.xpack.security.support.SecurityIndexManager.getUnavailableReason(SecurityIndexManager.java:148) ~[x-pack-security-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",
elasticsearch_1              | "at org.elasticsearch.xpack.security.authz.store.NativePrivilegeStore.innerGetPrivileges(NativePrivilegeStore.java:187) [x-pack-security-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",
elasticsearch_1              | "at org.elasticsearch.xpack.security.authz.store.NativePrivilegeStore.getPrivileges(NativePrivilegeStore.java:161) [x-pack-security-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",
elasticsearch_1              | "at org.elasticsearch.xpack.security.action.user.TransportHasPrivilegesAction.resolveApplicationPrivileges(TransportHasPrivilegesAction.java:89) [x-pack-security-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",
elasticsearch_1              | "at org.elasticsearch.xpack.security.action.user.TransportHasPrivilegesAction.doExecute(TransportHasPrivilegesAction.java:81) [x-pack-security-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",
elasticsearch_1              | "at org.elasticsearch.xpack.security.action.user.TransportHasPrivilegesAction.doExecute(TransportHasPrivilegesAction.java:38) [x-pack-security-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",
elasticsearch_1              | "at org.elasticsearch.action.support.TransportAction$RequestFilterChain.proceed(TransportAction.java:173) [elasticsearch-7.13.0-SNAPSHOT.jar:7.13.0-SNAPSHOT]",

(source: https://beats-ci.elastic.co/job/Ingest-manager/job/elastic-package/job/PR-279/33/artifact/build/elastic-stack-dump/check/logs/elasticsearch.log) - not sure if this one is relevant.

elasticmachine commented 3 years ago

Pinging @elastic/fleet (Team:Fleet)

mtojek commented 3 years ago

Another stack trace:

Performing setup of Fleet in Kibana
Kibana Fleet setup failed: http POST request to http://kibana:5601/api/fleet/setup fails: <nil>. Response: {"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."}
Kibana Fleet setup failed: http POST request to http://kibana:5601/api/fleet/setup fails: <nil>. Response: {"statusCode":500,"error":"Internal Server Error","message":"An internal server error occurred."}
The Elastic Agent is currently in BETA and should not be used in production

2021-03-31T10:53:41.842Z    INFO    cmd/enroll_cmd.go:185   Elastic Agent might not be running; unable to trigger restart
2021-03-31T10:53:41.842Z    INFO    cmd/enroll_cmd.go:187   Successfully triggered restart on running Elastic Agent.
Successfully enrolled the Elastic Agent.
ph commented 3 years ago

Not sure here what could be the cause here, @nchaulet or @urso have you seen this?

mtojek commented 3 years ago

Another occurrence:

https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/171/artifact/build/elastic-stack-dump/latest/proofpoint/logs/elastic-agent.log https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/171/artifact/build/elastic-stack-dump/latest/proofpoint/logs/elasticsearch.log https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/171/artifact/build/elastic-stack-dump/latest/proofpoint/logs/kibana.log https://beats-ci.elastic.co/job/Ingest-manager/job/integrations/job/master/171/artifact/build/elastic-stack-dump/latest/proofpoint/logs/package-registry.log

and again in pair with .security-7 index issue. Let me know if you need to more data.

ruflin commented 3 years ago

My suspicion is that the .security-7 error is somehow related to the docker setup. We should figure out how to get rid of this issue and see if the other problems persist.

mtojek commented 3 years ago

We tried here: https://github.com/elastic/kibana/issues/90148

mtojek commented 3 years ago

Maybe we can improve the healthcheck by checking the status of http://localhost:9200/_cat/shards:

.security-7                                                       0 p STARTED      65 202.8kb 127.0.0.1 532fa6814a6a

We can check if that shard has been started, WDYT?

ruflin commented 3 years ago

@ycombinator Any idea?

ycombinator commented 3 years ago

Looking at the errors in the various logs, this looks like an internal Kibana error to me. When the Fleet Server (or is it Agent? or either?) is calling POST /api/fleet/setup, the Kibana code behind those APIs is trying to make some Elasticsearch API call, which fails because the .security index isn't green yet.

A similar thing seems to be happening when GET /login is being called. The user agent there is curl so I'm not sure exactly what is calling that or if it was a human doing some testing.

In either case, I think it would be great if Kibana could expose a Fleet health check API that will allow Fleet Server or Agent to ask the question, "Is Fleet ready or not?" before calling POST /api/fleet/setup or any other Fleet APIs. Maybe there is or there can be something in the GET /api/status Kibana API response or maybe there should be a dedicated GET /api/fleet/status endpoint for this. Internally within Fleet UI code, this status API would make sure the Elasticsearch Security index is ready (i.e. not red) before it tries to use it; until then the status API would respond as "not ready". To check if the index is not ready, this Elasticsearch API call could be used: http://localhost:9200/_cat/indices/.security-*?h=health&format=json.

Thoughts?

mtojek commented 3 years ago

I think I partially mimicked it here: https://github.com/elastic/elastic-package/pull/311 . Could you please take a look?