cockroachdb / cockroach

CockroachDB - the open source, cloud-native distributed SQL database.
https://www.cockroachlabs.com

logging: implement character limit on emitted logs #126236

Open sudomateo opened 1 week ago

sudomateo commented 1 week ago

Describe the problem

CockroachDB does not impose a character limit on emitted log lines. A single line can therefore grow larger than a logging sink such as Fluent Bit is able to buffer, which prevents the sink from receiving the entire log line as a single event.

Here is one such long log event. The values that account for the majority of the characters have been redacted.

{
  "tag": "cockroach.telemetry",
  "channel_numeric": 12,
  "channel": "TELEMETRY",
  "timestamp": "1719264767.684544406",
  "cluster_id": "0aca882a-4a60-4b86-8b19-dae7a612d826",
  "tenant_id": 10,
  "tenant_name": "cluster-10",
  "instance_id": 1,
  "version": "v24.1.1",
  "severity_numeric": 1,
  "severity": "INFO",
  "goroutine": 261006378,
  "file": "util/log/event_log.go",
  "line": 32,
  "entry_counter": 4517475,
  "redactable": 1,
  "tags": {
    "n": "sql1",
    "peer": "‹10.0.5.245:51142›",
    "client": "34.173.198.153:35120",
    "hostssl": "",
    "user": "‹qualification-workload›"
  },
  "event": {
    "Timestamp": 1719264767676765008,
    "EventType": "sampled_query",
    "Statement": "UPSERT INTO \"\".\"\".kv(k, v) VALUES ($1, $2), ($3, $4), ($5, $6), ($7, $8), ($9, $10), ($11, $12), ($13, $14), ($15, $16), ($17, $18), ($19, $20)",
    "Tag": "INSERT",
    "User": "‹qualification-workload›",
    "ApplicationName": "kv",
    "PlaceholderValues": [
      "‹-1869857217101111026›",
      "REDACTED LONG STRING",
      "‹4777978783825699802›",
      "REDACTED LONG STRING",
      "‹603098788185010468›",
      "REDACTED LONG STRING",
      "‹1326509987868430009›",
      "REDACTED LONG STRING",
      "‹8234914939766267954›",
      "REDACTED LONG STRING",
      "‹433961672278954175›",
      "REDACTED LONG STRING",
      "‹462940093285206272›",
      "REDACTED LONG STRING",
      "‹8779261853781541214›",
      "REDACTED LONG STRING",
      "‹-2391050638372379906›",
      "REDACTED LONG STRING",
      "‹248356239080483531›",
      "REDACTED LONG STRING"
    ],
    "ExecMode": "exec",
    "NumRows": 10,
    "Age": 6.832211,
    "StmtPosInTxn": 1,
    "SkippedQueries": 6,
    "CostEstimate": 0.12,
    "Distribution": "local",
    "PlanGist": "AgIUBAUGIuYBAQ==",
    "SessionID": "17dc0e2864b9cad50000000000000001",
    "Database": "kv",
    "StatementID": "17dc0e32783991f80000000000000001",
    "TransactionID": "7d765250-befb-448e-871d-1efb45c19857",
    "StatementFingerprintID": 11236207118300501284,
    "RowsWritten": 10,
    "ServiceLatencyNanos": 6620499,
    "OverheadLatencyNanos": 2040,
    "RunLatencyNanos": 5937285,
    "PlanLatencyNanos": 681174,
    "SchemaChangerMode": "none"
  }
}

This log event was 206584 bytes in size, mostly because of the large values in event.PlaceholderValues.

> wc -c /tmp/large-log-line.log
  206584 /tmp/large-log-line.log
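
To see which field accounts for the bytes in an event like this, the encoded size of each field can be measured. The following is a minimal Go sketch, not CockroachDB code: `fieldSizes` is a hypothetical helper, and the miniature event in `main` is made up for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// fieldSizes returns the encoded size, in bytes, of each top-level field
// of a JSON object. Hypothetical helper, not part of CockroachDB.
func fieldSizes(obj []byte) (map[string]int, error) {
	var fields map[string]json.RawMessage
	if err := json.Unmarshal(obj, &fields); err != nil {
		return nil, err
	}
	sizes := make(map[string]int, len(fields))
	for name, raw := range fields {
		sizes[name] = len(raw)
	}
	return sizes, nil
}

func main() {
	// Made-up miniature of the "event" object shown in the issue.
	event := []byte(`{"EventType":"sampled_query","PlaceholderValues":["1","aaaaaaaaaaaaaaaaaaaaaaaa"],"NumRows":10}`)

	sizes, err := fieldSizes(event)
	if err != nil {
		fmt.Println("not a JSON object:", err)
		return
	}

	// Print the fields from largest to smallest.
	names := make([]string, 0, len(sizes))
	for name := range sizes {
		names = append(names, name)
	}
	sort.Slice(names, func(i, j int) bool { return sizes[names[i]] > sizes[names[j]] })
	for _, name := range names {
		fmt.Printf("%-20s %d bytes\n", name, sizes[name])
	}
}
```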

To Reproduce

This was observed on a staging CockroachDB Cloud cluster (0aca882a-4a60-4b86-8b19-dae7a612d826).

Expected behavior

CockroachDB should impose a limit on the number of characters in a given log line, and should split an oversized log event across multiple events rather than emit it as a single line.
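
As a rough illustration of the requested behavior, the Go sketch below splits a long payload into chunks of at most a fixed number of bytes without cutting a UTF-8 character in half. That matters here because the redaction markers ‹ and › are multi-byte characters. `splitByBytes` is a hypothetical name and this is not how CockroachDB's logging package works. A real implementation would also need to wrap each chunk in its own valid JSON event and keep redaction markers balanced, neither of which is attempted here.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// splitByBytes splits s into chunks of at most limit bytes each, cutting
// only at UTF-8 character boundaries. Hypothetical helper for illustration.
func splitByBytes(s string, limit int) []string {
	if limit < utf8.UTFMax {
		// A chunk must be able to hold the largest possible character,
		// otherwise the loop below could not make progress.
		limit = utf8.UTFMax
	}
	var chunks []string
	for len(s) > limit {
		cut := limit
		// Back up until s[cut] is the first byte of a character, so that
		// s[:cut] ends on a character boundary.
		for cut > 0 && !utf8.RuneStart(s[cut]) {
			cut--
		}
		if cut == 0 {
			// Invalid UTF-8 with no boundary in range: cut at the limit.
			cut = limit
		}
		chunks = append(chunks, s[:cut])
		s = s[cut:]
	}
	if len(s) > 0 {
		chunks = append(chunks, s)
	}
	return chunks
}

func main() {
	// Made-up payload containing multi-byte redaction markers.
	payload := "‹qualification-workload› wrote a very long placeholder value"
	for i, chunk := range splitByBytes(payload, 16) {
		fmt.Printf("part %d (%d bytes): %q\n", i+1, len(chunk), chunk)
	}
}
```

The limit is expressed in bytes rather than characters because the receiving buffer in the reported failure is sized in bytes.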

Additional data / screenshots

N/A

Environment:

Additional context

What was the impact?

The Fluent Bit sink that CockroachDB was configured to send logs to could not receive the oversized log lines over TCP, which prevented logs from being sent to a centralized system.

[2024/06/26 01:46:11] [ warn] [input:tcp:tcp-cockroach-unredacted] invalid JSON message, skipping
[2024/06/26 01:46:12] [ warn] [input:tcp:tcp-cockroach-unredacted] fd=63 incoming data exceeds 'Buffer_Size' (32 KB)
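
Until a limit exists on the CockroachDB side, one possible workaround is to raise the receive buffer of the Fluent Bit TCP input above the largest expected line. The fragment below is a sketch in Fluent Bit's classic configuration format, not the configuration used in this deployment. The alias is taken from the warnings above; the listen address, port, and buffer value are assumptions.

```
[INPUT]
    Name         tcp
    Alias        tcp-cockroach-unredacted
    # Listen address and port are assumptions, not taken from the issue.
    Listen       0.0.0.0
    Port         5170
    Format       json
    # Sizes are in KB. 32 matches the limit reported in the warning above.
    Chunk_Size   32
    # Assumed value, chosen to exceed the 206584-byte event shown earlier.
    Buffer_Size  256
```

This only moves the ceiling: any event larger than the new Buffer_Size would still be rejected, which is why a limit on the emitting side is being requested.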

Jira issue: CRDB-39806

blathers-crl[bot] commented 1 week ago

Hi @sudomateo, please add branch-* labels to identify which branch(es) this C-bug affects.

:owl: Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.