teslamotors / fleet-telemetry

Apache License 2.0
HTTP Panic When Trying to Connect to Kafka #178

Closed by rileymd88 3 weeks ago

rileymd88 commented 3 weeks ago

I am getting the following error when trying to connect to Kafka:

2024/06/10 20:16:07 http: panic serving 176.199.247.105:42518: runtime error: invalid memory address or nil pointer dereference
goroutine 34 [running]:
net/http.(*conn).serve.func1()
    /usr/local/go/src/net/http/server.go:1854 +0xbf
panic({0x12f8640, 0x1f21ec0})
    /usr/local/go/src/runtime/panic.go:890 +0x263
github.com/sirupsen/logrus.(*Logger).Logf(0x8?, 0xffffffff?, {0x141c9d9?, 0x1?}, {0xc0002c3150?, 0x9?, 0xc000320028?})
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/logger.go:152 +0x22
github.com/sirupsen/logrus.(*Logger).Errorf(...)
    /go/pkg/mod/github.com/sirupsen/logrus@v1.9.0/logger.go:186
github.com/teslamotors/fleet-telemetry/datastore/kafka.(*Producer).logError(0xc0000929a0?, {0x1722de0?, 0xc00041c400?})
    /go/src/fleet-telemetry/datastore/kafka/kafka.go:113 +0x65
github.com/teslamotors/fleet-telemetry/datastore/kafka.(*Producer).Produce(0xc000038040, 0xc0001ec6c0)
    /go/src/fleet-telemetry/datastore/kafka/kafka.go:78 +0x5e9
github.com/teslamotors/fleet-telemetry/telemetry.(*BinarySerializer).Dispatch(...)
    /go/src/fleet-telemetry/telemetry/serializer.go:93
github.com/teslamotors/fleet-telemetry/telemetry.(*Record).Dispatch(0xc0001ec6c0)
    /go/src/fleet-telemetry/telemetry/record.go:100 +0x13a
github.com/teslamotors/fleet-telemetry/server/streaming.(*SocketManager).processRecord(0xc0000eb760?, 0xc0001ec6c0)
    /go/src/fleet-telemetry/server/streaming/socket.go:263 +0x25
github.com/teslamotors/fleet-telemetry/server/streaming.(*SocketManager).ParseAndProcessRecord(0xc0002f7dd0, 0xc00041c360, {0xc00009c600?, 0x18?, 0xc0002c3770?})
    /go/src/fleet-telemetry/server/streaming/socket.go:254 +0x673
github.com/teslamotors/fleet-telemetry/server/streaming.(*SocketManager).ProcessTelemetry(0xc0002f7dd0, 0xc0002f7dd0?)
    /go/src/fleet-telemetry/server/streaming/socket.go:218 +0x605
github.com/teslamotors/fleet-telemetry/server/streaming.(*Server).ServeBinaryWs.func1({0x1727b20?, 0xc0002e8540?}, 0xc000170e00)
    /go/src/fleet-telemetry/server/streaming/server.go:99 +0x2d3
net/http.HandlerFunc.ServeHTTP(0xc00037d5e0?, {0x1727b20?, 0xc0002e8540?}, 0xc0002c3920?)
    /usr/local/go/src/net/http/server.go:2122 +0x2f
net/http.(*ServeMux).ServeHTTP(0xc00009e500?, {0x1727b20, 0xc0002e8540}, 0xc000170e00)
    /usr/local/go/src/net/http/server.go:2500 +0x149
github.com/teslamotors/fleet-telemetry/server/streaming.serveHTTPWithLogs.func1({0x1727b20, 0xc0002e8540}, 0xc000170e00)
    /go/src/fleet-telemetry/server/streaming/server.go:70 +0x236
net/http.HandlerFunc.ServeHTTP(0x0?, {0x1727b20?, 0xc0002e8540?}, 0x50794e?)
    /usr/local/go/src/net/http/server.go:2122 +0x2f
net/http.serverHandler.ServeHTTP({0xc0002adb90?}, {0x1727b20, 0xc0002e8540}, 0xc000170e00)
    /usr/local/go/src/net/http/server.go:2936 +0x316
net/http.(*conn).serve(0xc000279cb0, {0x1728668, 0xc0004d1590})
    /usr/local/go/src/net/http/server.go:1995 +0x612
created by net/http.(*Server).Serve
    /usr/local/go/src/net/http/server.go:3089 +0x5ed

My config looks like this:

{
  "host": "0.0.0.0",
  "hostname": "mydomain.com",
  "port": 1000,
  "log_level": "debug",
  "json_log_enable": true,
  "namespace": "telemetry",
  "reliable_ack": true,
  "reliable_ack_sources": {
    "V": "kafka"
  },
  "kafka": {
    "bootstrap.servers": "kafka-server:9092",
    "sasl.mechanism": "SCRAM-SHA-256",
    "security.protocol": "SASL_SSL",
    "sasl.username": "username",
    "sasl.password": "password"
  },
  "records": {
    "alerts": ["logger"],
    "errors": ["logger"],
    "V": ["kafka"]
  },
  "tls": {
    "server_cert": "cert.pem",
    "server_key": "private.pem"
  },
  "ca": "-----BEGIN CERTIFICATE-----<content of full cert chain file>-----END CERTIFICATE-----\n"
}

The error only happens after adding my Kafka config. If I change the records.V config entry from ["kafka"] to ["logger"], the error goes away and the messages come in correctly.
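A structural check can help rule out the config itself before suspecting the server version. The Go sketch below is a hypothetical helper, not part of fleet-telemetry: missingDispatcherSections is an invented name, and the rule it applies (every dispatcher named under "records" except "logger" needs a top-level section of the same name) is an assumption drawn only from the config posted above. It does not test Kafka connectivity or credentials.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// missingDispatcherSections returns the dispatchers referenced under
// "records" that have no top-level section of the same name.
// Assumption: "logger" needs no section, as in the config posted above.
func missingDispatcherSections(raw []byte) ([]string, error) {
	var top map[string]json.RawMessage
	if err := json.Unmarshal(raw, &top); err != nil {
		return nil, fmt.Errorf("parse config: %w", err)
	}

	var records map[string][]string
	if rec, ok := top["records"]; ok {
		if err := json.Unmarshal(rec, &records); err != nil {
			return nil, fmt.Errorf("parse records: %w", err)
		}
	}

	seen := map[string]bool{}
	var missing []string
	for _, dispatchers := range records {
		for _, name := range dispatchers {
			if name == "logger" || seen[name] {
				continue
			}
			seen[name] = true
			if _, ok := top[name]; !ok {
				missing = append(missing, name)
			}
		}
	}
	sort.Strings(missing)
	return missing, nil
}

func main() {
	// Trimmed copy of the posted config: only the fields the check reads.
	cfg := []byte(`{
	  "kafka": {"bootstrap.servers": "kafka-server:9092"},
	  "records": {"alerts": ["logger"], "errors": ["logger"], "V": ["kafka"]}
	}`)

	missing, err := missingDispatcherSections(cfg)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Println("dispatchers without a config section:", missing)
}
```

For the posted config the list comes back empty, which is consistent with the problem lying somewhere other than the records-to-dispatcher wiring.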

The result of the api/1/partner_accounts/fleet_telemetry_errors call shows the following, although I believe this error is just from a period when my fleet-telemetry container was not running:

{
  "created_at": "2024-06-10T20:09:02.175200376Z",
  "error": "\"webconnection error: dial tcp myserverip:1000: connect: connection refused\" cm_type=stream",
  "error_name": "cloud_manager_error",
  "hostname": "mydomain.com",
  "name": "1ed6ad0b3a7d-43e3-9803-33b794e0bd15",
  "port": "1000",
  "txID": "6d388b75-2e23-4c8b-ac57-42aa46e69ab5",
  "vin": "myvin"
}

agbpatro commented 3 weeks ago

Looks like you are using a version from before this was implemented. Please upgrade to the latest version of fleet-telemetry.

rileymd88 commented 3 weeks ago

Using the latest Docker container resolved this issue. Thanks!