elastic / elastic-transport-js

Transport classes and utilities shared among Node.js Elastic client libraries
Apache License 2.0
4 stars 24 forks source link

Invalid handling of array in headers.warning #95

Closed shai-orca closed 2 weeks ago

shai-orca commented 2 months ago

🐛 Bug Report

In some situations, Elasticsearch version 8.13 returns multiple warning headers. When there are indeed multiple warning headers, they become an array, and so the code processing them here , crashes, because it expects this.headers.warning to be a string, not an array of strings.

Example of such Error:

To Reproduce

Simple setup with the ES SDK to reproduce the behavior:

import { Client } from "@elastic/elasticsearch";

async function crashElasticSDK() {
  const payload = {
    index: "non_existent_index",
    body: {
      query: {
        bool: {
          filter: [
            {
              exists: {
                field: "route",
              },
            },
          ],
        },
      },
    },
    size: 20,
    _source: {
      include: ["*"],
      exclude: ["*description_embedding*"],
    },
  };
  const client = new Client({
    node: "https://localhost:9200",
    maxRetries: 10,
    requestTimeout: 3 * 60 * 1000,
  });

  /*
  Below line will throw an error:
  TypeError: this.headers.warning.split is not a function
    at Object.get warnings (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/transport/src/Transport.ts:363:12)
    at Function.entries (<anonymous>)
    at doRedact (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/transport/src/security.ts:42:12)
    at redactObject (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/transport/src/security.ts:38:10)
    at redactDiagnostic (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/transport/src/security.ts:89:14)
    at ResponseError (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/transport/src/errors.ts:154:33)
    at SniffingTransport.request (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/transport/src/Transport.ts:553:17)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at Client.SearchApi (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/node_modules/@elastic/elasticsearch/src/api/api/search.ts:89:10)
    at crashElasticSDK (/Users/shaialon/github/general/sonar-ai/services/sonar-ai-schema-serving/src/services/elasticsearch_client.ts:33:17)
  */
  try {
    const value = await client.search(payload);
    // This line will never be reached
    console.log("this line will never be reached");
    return value;
  } catch (err) {
    console.error(err);
  }
}

crashElasticSDK();

The code above creates the warnings from the elasticsearch headers:

Warning: 299 Elasticsearch-8.13.2-16cc90cd2d08a3147ce02b07e50894bc060a4cbf "Deprecated field [include] used, expected [includes] instead"
Warning: 299 Elasticsearch-8.13.2-16cc90cd2d08a3147ce02b07e50894bc060a4cbf "Deprecated field [exclude] used, expected [excludes] instead"

You can see the exact ES response that crashes the transport library with this request to a cluster:

curl -vvv https://elastic:YOUR_PASS@localhost:9200/non_existent_index/_search -H 'connection: keep-alive' -H 'user-agent: elasticsearch-js/8.13.1 (darwin 23.3.0-arm64; Node.js 20.10.0; Transport 8.4.1)' -H 'x-elastic-client-meta: es=8.13.1,js=20.10.0,t=8.4.1,hc=20.10.0' -H 'content-type: application/vnd.elasticsearch+json; compatible-with=8' -H 'accept: application/vnd.elasticsearch+json; compatible-with=8' -d '{"query":{"bool":{"filter":[{"exists":{"field":"route"}}]}},"size":20,"_source":{"include":["*"],"exclude":["*description_embedding*"]}}' -vvv -k  | jq

Which returns:

* Connected to localhost (::1) port 9200
* ALPN: curl offers h2,http/1.1
* (304) (OUT), TLS handshake, Client hello (1):
} [314 bytes data]
* (304) (IN), TLS handshake, Server hello (2):
{ [122 bytes data]
* (304) (IN), TLS handshake, Unknown (8):
{ [32 bytes data]
* (304) (IN), TLS handshake, Certificate (11):
{ [2792 bytes data]
* (304) (IN), TLS handshake, CERT verify (15):
{ [520 bytes data]
* (304) (IN), TLS handshake, Finished (20):
{ [52 bytes data]
* (304) (OUT), TLS handshake, Finished (20):
} [52 bytes data]
* SSL connection using TLSv1.3 / AEAD-AES256-GCM-SHA384
* ALPN: server did not agree on a protocol. Uses default.
* Server certificate:
*  subject: CN=6f984185b670
*  start date: Apr 17 14:45:59 2024 GMT
*  expire date: Apr 17 14:45:59 2026 GMT
*  issuer: CN=Elasticsearch security auto-configuration HTTP CA
*  SSL certificate verify result: self signed certificate in certificate chain (19), continuing anyway.
* using HTTP/1.x
* Server auth using Basic with user 'elastic'
> POST /non_existent_index/_search HTTP/1.1
> Host: localhost:9200
> Authorization: Basic ZWxhc3RpYzo2X3pQQ1VJbzZjPVFjWm5TSVNCZw==
> connection: keep-alive
> user-agent: elasticsearch-js/8.13.1 (darwin 23.3.0-arm64; Node.js 20.10.0; Transport 8.4.1)
> x-elastic-client-meta: es=8.13.1,js=20.10.0,t=8.4.1,hc=20.10.0
> content-type: application/vnd.elasticsearch+json; compatible-with=8
> accept: application/vnd.elasticsearch+json; compatible-with=8
> Content-Length: 136
>
} [136 bytes data]
< HTTP/1.1 404 Not Found
< X-elastic-product: Elasticsearch
< Warning: 299 Elasticsearch-8.13.2-16cc90cd2d08a3147ce02b07e50894bc060a4cbf "Deprecated field [include] used, expected [includes] instead"
< Warning: 299 Elasticsearch-8.13.2-16cc90cd2d08a3147ce02b07e50894bc060a4cbf "Deprecated field [exclude] used, expected [excludes] instead"
< content-type: application/vnd.elasticsearch+json;compatible-with=8
< content-length: 437
<
{ [437 bytes data]
100   573  100   437  100   136   3455   1075 --:--:-- --:--:-- --:--:--  4547
* Connection #0 to host localhost left intact
{
  "error": {
    "root_cause": [
      {
        "type": "index_not_found_exception",
        "reason": "no such index [non_existent_index]",
        "resource.type": "index_or_alias",
        "resource.id": "non_existent_index",
        "index_uuid": "_na_",
        "index": "non_existent_index"
      }
    ],
    "type": "index_not_found_exception",
    "reason": "no such index [non_existent_index]",
    "resource.type": "index_or_alias",
    "resource.id": "non_existent_index",
    "index_uuid": "_na_",
    "index": "non_existent_index"
  },
  "status": 404
}

Expected behavior

This line should expect an array in some cases as this.headers.warning, in which case different logic is needed so the code does not crash.

The code below should be resilient to getting an array under this.headers.warning

 get warnings () {
        if (this.headers?.warning == null) {
          return null
        }
        return this.headers.warning
          .split(/(?!\B"[^"]*),(?![^"]*"\B)/)
          .filter((warning) => warning.match(/^\d\d\d Elasticsearch-/))
      }

Your Environment

Suggested solution:

get warnings() {
  // Check if warning header is absent
  if (this.headers?.warning == null) {
    return null;
  }

  // Ensure handling both single string and array of strings
  const warnings = Array.isArray(this.headers.warning) ? this.headers.warning : [this.headers.warning];

  // Split only if necessary, then flatten and filter
  return warnings
    .flatMap(warning => warning.split(/(?!\B"[^"]*),(?![^"]*"\B)/)) // Split each warning entry on commas that aren't enclosed in quotes
    .filter(warning => warning.match(/^\d\d\d Elasticsearch-/)); // Filter messages with specific format
}

Updates the get warnings() function to robustly handle both scenarios: when this.headers.warning is a string and when it is an array of strings. The key changes are:

  1. Unified Handling of String and Array: The function first checks if this.headers.warning is an array. If not, it wraps the string into an array. This ensures that the subsequent operations, which are designed for arrays, can proceed uniformly regardless of the input type.

  2. Safe Splitting and Filtering: Using .flatMap(), the function splits each warning entry on commas not enclosed in quotes and flattens the results. It then filters these entries to include only those that match the expected Elasticsearch warning format (three digits followed by the keyword Elasticsearch-).

shai-orca commented 1 month ago

Hi @JoshMock , is there an ETA for a fix here? Thanks!

JoshMock commented 1 month ago

No ETA yet, as I have a few other bugs ahead of it in priority. PRs are always welcome, though!

JoshMock commented 2 weeks ago

v8.6.1 has been released with this fix.