elastic / elasticsearch

Free and Open Source, Distributed, RESTful Search Engine
https://www.elastic.co/products/elasticsearch
Other
1.19k stars 24.85k forks source link

Legitimate warning header leads to StackOverflowError #95972

Closed danielmitterdorfer closed 1 year ago

danielmitterdorfer commented 1 year ago

Elasticsearch Version

main

Installed Plugins

No response

Java Version

bundled

OS Version

Linux kore 6.1.25-1-MANJARO #1 SMP PREEMPT_DYNAMIC Thu Apr 20 13:48:36 UTC 2023 x86_64 GNU/Linux

Problem Description

While working on #95666, I encountered a test failure because the following warning message was emitted by MetadataIndexTemplateService:

legacy template [global] has index patterns [*] matching patterns from existing composable templates [.deprecation-indexing-template,.fleet-file-data,.fleet-files,.ml-anomalies-,.ml-notifications-000002,.ml-state,.ml-stats,.monitoring-beats-mb,.monitoring-ent-search-mb,.monitoring-es-mb,.monitoring-kibana-mb,.monitoring-logstash-mb,.profiling-ilm-lock,.slm-history,.watch-history-16,behavioral_analytics-events-default,ilm-history,logs,metrics,profiling-events,profiling-executables,profiling-metrics,profiling-returnpads-private,profiling-sq-executables,profiling-sq-leafframes,profiling-stackframes,profiling-stacktraces,profiling-symbols,synthetics] with patterns (.deprecation-indexing-template => [.logs-deprecation.*],.fleet-file-data => [.fleet-file-data-*-*],.fleet-files => [.fleet-files-*-*],.ml-anomalies- => [.ml-anomalies-*],.ml-notifications-000002 => [.ml-notifications-000002],.ml-state => [.ml-state*],.ml-stats => [.ml-stats-*],.monitoring-beats-mb => [.monitoring-beats-8-*],.monitoring-ent-search-mb => [.monitoring-ent-search-8-*],.monitoring-es-mb => [.monitoring-es-8-*],.monitoring-kibana-mb => [.monitoring-kibana-8-*],.monitoring-logstash-mb => [.monitoring-logstash-8-*],.profiling-ilm-lock => [.profiling-ilm-lock*],.slm-history => [.slm-history-5*],.watch-history-16 => [.watcher-history-16*],behavioral_analytics-events-default => [behavioral_analytics-events-*],ilm-history => [ilm-history-5*],logs => [logs-*-*],metrics => [metrics-*-*],profiling-events => [profiling-events*],profiling-executables => [profiling-executables*],profiling-metrics => [profiling-metrics*],profiling-returnpads-private => [.profiling-returnpads-private*],profiling-sq-executables => [.profiling-sq-executables*],profiling-sq-leafframes => [.profiling-sq-leafframes*],profiling-stackframes => [profiling-stackframes*],profiling-stacktraces => [profiling-stacktraces*],profiling-symbols => [.profiling-symbols*],synthetics => [synthetics-*-*]); this template [global] may be ignored in favor of a composable template at index creation time

This happened here:

https://github.com/elastic/elasticsearch/blob/86f37160ae3c24c4dcaaf0b41637136275a9fdca/server/src/main/java/org/elasticsearch/cluster/metadata/MetadataIndexTemplateService.java#L1059-L1060

and attempting to emit a warning header lead to:

» [2023-05-10T07:36:15,606][ERROR][o.e.b.ElasticsearchUncaughtExceptionHandler] [v8.9.0-remote-0] fatal error in thread [elasticsearch[v8.9.0-remote-0][masterService#updateTask][T#1]], exiting java.lang.StackOverflowError
»       at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4828)
»       at java.base/java.util.regex.Pattern$Loop.match(Pattern.java:4937)
»       at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4859)
»       at java.base/java.util.regex.Pattern$BranchConn.match(Pattern.java:4737)
»       at java.base/java.util.regex.Pattern$BmpCharProperty.match(Pattern.java:3993)
»       at java.base/java.util.regex.Pattern$Branch.match(Pattern.java:4773)
»       at java.base/java.util.regex.Pattern$GroupHead.match(Pattern.java:4828)
»       at java.base/java.util.regex.Pattern$Loop.match(Pattern.java:4937)
»       at java.base/java.util.regex.Pattern$GroupTail.match(Pattern.java:4859)
[...]

The StackOverflowError is caused by the following assertion:

https://github.com/elastic/elasticsearch/blob/86f37160ae3c24c4dcaaf0b41637136275a9fdca/server/src/main/java/org/elasticsearch/common/logging/HeaderWarning.java#L333

(the following line is probably also problematic)

Steps to Reproduce

This is a test case that reproduces the issue:

public class HeaderWarningTests extends ESTestCase {
    public void testAddComplexWarning() {
        final int maxWarningHeaderCount = 2;
        Settings settings = Settings.builder().put("http.max_warning_header_count", maxWarningHeaderCount).build();
        ThreadContext threadContext = new ThreadContext(settings);
        final Set<ThreadContext> threadContexts = Collections.singleton(threadContext);
        HeaderWarning.addWarning(
            threadContexts,
            "legacy template [global] has index patterns [*] matching patterns from existing composable templates " +
                "[.deprecation-indexing-template,.fleet-file-data,.fleet-files,.ml-anomalies-,.ml-notifications-000002,.ml-state," +
                ".ml-stats,.monitoring-beats-mb,.monitoring-ent-search-mb,.monitoring-es-mb,.monitoring-kibana-mb," +
                ".monitoring-logstash-mb,.profiling-ilm-lock,.slm-history,.watch-history-16,behavioral_analytics-events-default," +
                "ilm-history,logs,metrics,profiling-events,profiling-executables,profiling-metrics,profiling-returnpads-private," +
                "profiling-sq-executables,profiling-sq-leafframes,profiling-stackframes,profiling-stacktraces," +
                "profiling-symbols,synthetics] with patterns (.deprecation-indexing-template => [.logs-deprecation.*]," +
                ".fleet-file-data => [.fleet-file-data-*-*],.fleet-files => [.fleet-files-*-*],.ml-anomalies- => [.ml-anomalies-*]," +
                ".ml-notifications-000002 => [.ml-notifications-000002],.ml-state => [.ml-state*],.ml-stats => [.ml-stats-*]," +
                ".monitoring-beats-mb => [.monitoring-beats-8-*],.monitoring-ent-search-mb => [.monitoring-ent-search-8-*]," +
                ".monitoring-es-mb => [.monitoring-es-8-*],.monitoring-kibana-mb => [.monitoring-kibana-8-*]," +
                ".monitoring-logstash-mb => [.monitoring-logstash-8-*],.profiling-ilm-lock => [.profiling-ilm-lock*]," +
                ".slm-history => [.slm-history-5*],.watch-history-16 => [.watcher-history-16*]," +
                "behavioral_analytics-events-default => [behavioral_analytics-events-*],ilm-history => [ilm-history-5*]," +
                "logs => [logs-*-*],metrics => [metrics-*-*],profiling-events => [profiling-events*],profiling-executables => " +
                "[profiling-executables*],profiling-metrics => [profiling-metrics*],profiling-returnpads-private => " +
                "[.profiling-returnpads-private*],profiling-sq-executables => [.profiling-sq-executables*]," +
                "profiling-sq-leafframes => [.profiling-sq-leafframes*],profiling-stackframes => [profiling-stackframes*]," +
                "profiling-stacktraces => [profiling-stacktraces*],profiling-symbols => [.profiling-symbols*],synthetics => " +
                "[synthetics-*-*]); this template [global] may be ignored in favor of a composable template at index creation time"
        );
        final Map<String, List<String>> responseHeaders = threadContext.getResponseHeaders();

        assertThat(responseHeaders.size(), equalTo(1));
        final List<String> responses = responseHeaders.get("Warning");
        assertThat(responses, hasSize(1));
        assertThat(responses.get(0), warningValueMatcher);
        assertThat(responses.get(0), containsString("\"legacy template [global] has index patterns"));
        assertThat(responses.get(0), containsString(Integer.toString(299)));
    }
}

Logs (if relevant)

No response

elasticsearchmachine commented 1 year ago

Pinging @elastic/es-core-infra (Team:Core/Infra)

pgomulka commented 1 year ago

turns out this is operating system/jvm related (not to mention the regex being very complex) On macos with jdk17/20 I am not getting a failure The stack sizes are:

java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep ThreadStackSize
     intx CompilerThreadStackSize                  = 2048                                   {pd product} {default}
     intx ThreadStackSize                          = 2048                                   {pd product} {default}
     intx VMThreadStackSize                        = 2048                                   {pd product} {default}
openjdk version “20” 2023-03-21
OpenJDK Runtime Environment (build 20+36-2344)
OpenJDK 64-Bit Server VM (build 20+36-2344, mixed mode, sharing)

on ubuntu Linux przemek-gomulka 4.15.0-55-generic #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

java -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal -version | grep ThreadStackSize
   intx CompilerThreadStackSize         = 1024                  {pd product} {default}
   intx ThreadStackSize             = 1024                  {pd product} {default}
   intx VMThreadStackSize            = 1024                  {pd product} {default}
java version “17.0.5” 2022-10-18 LTS
Java(TM) SE Runtime Environment (build 17.0.5+9-LTS-191)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.5+9-LTS-191, mixed mode, sharing)

I was able to reproduce the issue on Macos when setting ThreadStackSize in the intellij runner. Somehow editing the gradle.properties file did not reproduce/affect

org.gradle.jvmargs=-XX:+HeapDumpOnOutOfMemoryError -Xss1m  -XX:ThreadStackSize=1024 -XX:VMThreadStackSize=1024 --add-exports jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED --add-exports jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED
pgomulka commented 1 year ago

From what I see when debugging this the pattern is making the stack size relative to the parsed message length. The relations is pretty much linear - N message length results in N stack size. It is the value part that is causing this excesive recursion

        "\"((?:\t| |!|[\\x23-\\x5B]|[\\x5D-\\x7E]|[\\x80-\\xFF]|\\\\|\\\\\")*)\"( " + // quoted warning value, captured
public static final Pattern WARNING_HEADER_PATTERN = Pattern.compile("299 " + // log level code
        "Elasticsearch-" + // warn agent
        "\\d+\\.\\d+\\.\\d+(?:-(?:alpha|beta|rc)\\d+)?(?:-SNAPSHOT)?-" + // warn agent
        "(?:[a-f0-9]{7}(?:[a-f0-9]{33})?|unknown) " + // warn agent
        "\"((?:\t| |!|[\\x23-\\x5B]|[\\x5D-\\x7E]|[\\x80-\\xFF]|\\\\|\\\\\")*)\"( " + // quoted warning value, captured
        // quoted RFC 1123 date format
        "\"" + // opening quote
        "(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun), " + // weekday
        "\\d{2} " + // 2-digit day
        "(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) " + // month
        "\\d{4} " + // 4-digit year
        "\\d{2}:\\d{2}:\\d{2} " + // (two-digit hour):(two-digit minute):(two-digit second)
        "GMT" + // GMT
        "\")?"); // closing quote (optional, since an older version can still send a warn-date)