fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Validation Failed /api/fleet/orbit/config #20781

Closed duzvik closed 3 weeks ago

duzvik commented 1 month ago

Fleet version: fleetdm/fleet:v4.52.0, MySQL 8.0.2

Web browser and operating system: Chrome on macOS

Fleet (tag: tf-mod-root-v1.10.0) installed on AWS using Terraform


💥  Actual behavior

TODO

🧑‍💻  Steps to reproduce

  1. In the orbit logs I can see: Jul 26 13:43:12 XXX orbit[3993848]: 2024-07-26T13:43:12+03:00 ERR running config receivers error="RunConfigReceivers get config: POST /api/fleet/orbit/config received status 422 Validation Failed: Error 1243 (HY000): Unknown prepared statement handler (331) given to mysql_stmt_precheck"
  2. A POST request to /api/fleet/orbit/config receives this response body:
    {
    "message": "Validation Failed",
    "errors": [
    {
      "name": "base",
      "reason": "Error 1243 (HY000): Unknown prepared statement handler (331) given to mysql_stmt_precheck"
    }
    ]
    }
  3. Running ./fleetctl debug errors returns:
    {
    "count": 390,
    "chain": [
      {
        "message": "Error 1243 (HY000): Unknown prepared statement handler (157) given to mysql_stmt_precheck"
      },
      {
        "message": "find host",
        "data": {
          "timestamp": "2024-07-26T11:56:10Z"
        },
        "stack": [
          "github.com/fleetdm/fleet/v4/server/datastore/mysql.(*Datastore).LoadHostByOrbitNodeKey (hosts.go:2325)",
          "github.com/fleetdm/fleet/v4/server/service.(*Service).AuthenticateOrbitHost (orbit.go:104)",
          "github.com/fleetdm/fleet/v4/server/service.newOrbitAuthenticatedEndpointer.func1.authenticatedOrbitHost.1 (endpoint_middleware.go:132)",
          "github.com/fleetdm/fleet/v4/server/service.newOrbitAuthenticatedEndpointer.func1.authenticatedOrbitHost.logged.2 (endpoint_middleware.go:223)",
          "github.com/fleetdm/fleet/v4/server/service.newServer.newServer.(*Middleware).AuthzCheck.func1.func2 (authzcheck.go:31)",
          "github.com/go-kit/kit/transport/http.Server.ServeHTTP (server.go:121)",
          "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerRequestSize.func2 (instrument_server.go:255)",
          "net/http.HandlerFunc.ServeHTTP (server.go:2166)",
          "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1 (instrument_server.go:296)",
          "net/http.HandlerFunc.ServeHTTP (server.go:2166)",
          "github.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1 (instrument_server.go:147)"
        ]
      },
      {
        "message": "authentication error orbit",
        "data": null,
        "stack": [
          "github.com/fleetdm/fleet/v4/server/service.(*Service).AuthenticateOrbitHost (orbit.go:111)"
        ]
      }
    ]
    }
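The response body in step 2 can be checked programmatically for the underlying MySQL error code. The following is a minimal sketch in Go, assuming the body always has the `message`/`errors[].name`/`errors[].reason` shape shown above; the type `fleetError` and the helper `mysqlErrorCodes` are hypothetical names for illustration and are not part of Fleet.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strconv"
)

// fleetError mirrors the JSON body shown in step 2 (hypothetical type,
// modeled only on that one response).
type fleetError struct {
	Message string `json:"message"`
	Errors  []struct {
		Name   string `json:"name"`
		Reason string `json:"reason"`
	} `json:"errors"`
}

// mysqlCodeRe matches the "Error 1243 (" prefix of a MySQL error string.
var mysqlCodeRe = regexp.MustCompile(`Error (\d+) \(`)

// mysqlErrorCodes returns every MySQL error code found in the reasons.
func mysqlErrorCodes(body []byte) ([]int, error) {
	var fe fleetError
	if err := json.Unmarshal(body, &fe); err != nil {
		return nil, err
	}
	var codes []int
	for _, e := range fe.Errors {
		m := mysqlCodeRe.FindStringSubmatch(e.Reason)
		if m == nil {
			continue // reason is not a MySQL error string
		}
		code, err := strconv.Atoi(m[1])
		if err != nil {
			return nil, err
		}
		codes = append(codes, code)
	}
	return codes, nil
}

func main() {
	body := []byte(`{
	  "message": "Validation Failed",
	  "errors": [
	    {
	      "name": "base",
	      "reason": "Error 1243 (HY000): Unknown prepared statement handler (331) given to mysql_stmt_precheck"
	    }
	  ]
	}`)
	codes, err := mysqlErrorCodes(body)
	fmt.Println(codes, err) // prints "[1243] <nil>"
}
```

Code 1243 is MySQL's ER_UNKNOWN_STMT_HANDLER, which points at a server-side prepared statement problem and not at the request payload, despite the "Validation Failed" wrapper.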

How to fix this?

sharon-fdm commented 1 month ago

Putting this on the MDM board since @roperzh self-assigned.

duzvik commented 1 month ago

Agent logs attached. Logs from AWS ECS attached.

Attachment: orbit-osquery_redacted.log

[Screenshot 2024-07-27 at 3 02 48 PM] [Screenshot 2024-07-27 at 3 00 00 PM]

lukeheath commented 1 month ago

@georgekarrv @PezHub Have y'all been able to reproduce this?

georgekarrv commented 1 month ago

I don't think we have. During standup yesterday we decided to look into this after finishing the release. (It currently looks like we might need to get some SQL configs to be able to narrow it down.)

roperzh commented 1 month ago

Hey @duzvik, thanks so much for all the logs, and sorry for the delay in getting back to you.

I have been investigating and trying to reproduce. My current theory is that there's likely a bug in the code that caches the prepared statement.

Would it be possible to check whether a server restart helps? That would make a strong case for the cache hypothesis.
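To illustrate the cache hypothesis, here is a minimal sketch in Go of how a cached prepared statement can go stale and how a cache could recover from it. This is not Fleet's code and not necessarily what the eventual fix does: `stmtCache`, `preparer`, `stmt`, `fakeConn`, and `errUnknownStmt` are all hypothetical names, and the fake connection only simulates the server forgetting its statement handles (MySQL error 1243).

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errUnknownStmt stands in for MySQL error 1243 (assumption: a real
// implementation would inspect the driver's error code instead).
var errUnknownStmt = errors.New("error 1243: unknown prepared statement handler")

// stmt and preparer are hypothetical stand-ins for a driver's
// prepared statement and connection types.
type stmt interface{ Exec(args ...any) error }
type preparer interface{ Prepare(query string) (stmt, error) }

// stmtCache caches prepared statements by query text.
type stmtCache struct {
	mu    sync.Mutex
	conn  preparer
	stmts map[string]stmt
}

func newStmtCache(conn preparer) *stmtCache {
	return &stmtCache{conn: conn, stmts: make(map[string]stmt)}
}

// get returns the cached statement, preparing it on first use.
func (c *stmtCache) get(query string) (stmt, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if s, ok := c.stmts[query]; ok {
		return s, nil
	}
	s, err := c.conn.Prepare(query)
	if err != nil {
		return nil, err
	}
	c.stmts[query] = s
	return s, nil
}

// Exec runs the query. If the server no longer knows the cached handle,
// it evicts the stale entry, prepares again, and retries once.
func (c *stmtCache) Exec(query string, args ...any) error {
	s, err := c.get(query)
	if err != nil {
		return err
	}
	if err = s.Exec(args...); !errors.Is(err, errUnknownStmt) {
		return err
	}
	c.mu.Lock()
	delete(c.stmts, query)
	c.mu.Unlock()
	if s, err = c.get(query); err != nil {
		return err
	}
	return s.Exec(args...)
}

// fakeConn simulates a server that forgets all handles on reset.
type fakeConn struct{ generation, prepares int }

func (f *fakeConn) reset() { f.generation++ }

func (f *fakeConn) Prepare(query string) (stmt, error) {
	f.prepares++
	return &fakeStmt{conn: f, generation: f.generation}, nil
}

type fakeStmt struct {
	conn       *fakeConn
	generation int
}

func (s *fakeStmt) Exec(args ...any) error {
	if s.generation != s.conn.generation {
		return errUnknownStmt // handle was prepared before the reset
	}
	return nil
}

func main() {
	conn := &fakeConn{}
	cache := newStmtCache(conn)
	fmt.Println(cache.Exec("SELECT 1")) // prints "<nil>"
	conn.reset()                        // server drops its statement handles
	fmt.Println(cache.Exec("SELECT 1")) // prints "<nil>" after re-preparing
	fmt.Println(conn.prepares)          // prints "2"
}
```

Without the evict-and-retry step, every call after the reset would keep returning the 1243 error until the process restarts and the cache is rebuilt, which is why a restart is a useful test of the hypothesis.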

roperzh commented 4 weeks ago

I have a fix for this bug: https://github.com/fleetdm/fleet/pull/21219

duzvik commented 4 weeks ago

Hi @roperzh, thanks. I will update the deployment once the fix is available in a new Terraform image.

fleet-release commented 4 weeks ago

Orbit's path now clear, Validation's strength brings cheer, Errors disappear.

fleet-release commented 3 weeks ago

MySQL errors in the cloud, Fleet's orbit code improved, Secure devices found.