opensearch-project / OpenSearch

[BUG] Deeply nested aggregations are not terminable by any mechanism and cause Out of Memory errors in data nodes. #15413

Open Pigueiras2 opened 2 weeks ago

Pigueiras2 commented 2 weeks ago

Describe the bug

We have a cluster with 12 data nodes and 31 GB reserved for the JVM. We were experiencing sporadic Out of Memory errors and managed to isolate the issue to some dashboards that were using nested aggregations with arbitrarily large sizes. We tried different approaches to terminate these client searches before they could crash some of the nodes in the cluster, but none of them worked (as described below).

The query running behind the scenes in Grafana/Dashboards was something similar to:

POST /<index>/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "metadata.timestamp": {
              "gte": 1723737975837,
              "lte": 1724342775837,
              "format": "epoch_millis"
            }
          }
        },
        {
          "query_string": {
            "analyze_wildcard": true,
            "query": "..."
          }
        }
      ]
    }
  },
  "aggs": {
    "3": {
      "terms": {
        "field": "data.dst_experiment_site",
        "size": 500000000,  # arbitrarily large size
        "order": {
          "_key": "desc"
        },
        "min_doc_count": 1
      },
      "aggs": {
        "4": {
          "terms": {
            "field": "data.dst_hostname",
            "size": 500000000,  # arbitrarily large size
            "order": {
              "_key": "desc"
            },
            "min_doc_count": 1
          },
          "aggs": {
            "5": {
              "terms": {
                "field": "data.metric_name",
                "size": 500000000,  # arbitrarily large size
                "order": {
                  "_key": "desc"
                },
                "min_doc_count": 1
              },
              "aggs": {
                "2": {
                  "date_histogram": {
                    "interval": "5m",  # this one is also very small and would create a lot of buckets
                    "field": "metadata.timestamp",
                    "min_doc_count": 1,
                    "extended_bounds": {
                      "min": 1723737975837,
                      "max": 1724342775837
                    },
                    "format": "epoch_millis"
                  },
                  "aggs": {
                    "1": {
                      "max": {
                        "field": "data.status_code"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
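
As a rough sanity check on why this query is so heavy, the worst-case bucket count can be sketched in a few lines of Python. The time range and the 5m interval are taken from the query above; the three term cardinalities in the last call are made-up placeholders, since the real cardinalities of these fields are not stated in this report.

```python
import math

# Values copied from the query above.
RANGE_START_MS = 1723737975837
RANGE_END_MS = 1724342775837
INTERVAL_MS = 5 * 60 * 1000  # "interval": "5m"


def histogram_buckets(start_ms: int, end_ms: int, interval_ms: int) -> int:
    """Upper bound on date_histogram buckets touched by [start_ms, end_ms]."""
    span_ms = end_ms - start_ms
    # Ceiling division, plus one because the range is usually not aligned
    # to bucket boundaries and so spills into one extra bucket.
    return -(-span_ms // interval_ms) + 1


def worst_case_buckets(term_cardinalities, buckets_per_series: int) -> int:
    """Worst case: every term combination exists and fills every time bucket."""
    return math.prod(term_cardinalities) * buckets_per_series


per_series = histogram_buckets(RANGE_START_MS, RANGE_END_MS, INTERVAL_MS)
print(per_series)  # 2017 (a 7-day range in 5-minute steps)

# Hypothetical cardinalities for site, hostname and metric_name.
print(worst_case_buckets([100, 1000, 50], per_series))  # 10085000000
```

Real data will usually be sparser than this worst case, but the product grows quickly with every extra nesting level.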

We tried the following settings in our cluster:

GET /_cluster/settings
...
    "indices.breaker.request.limit": "40%",
    "indices.breaker.request.overhead": "1.5",
    "indices.breaker.total.limit": "70%",
    "search.default_search_timeout": "3s",
    "search.cancel_after_time_interval": "3s",
    "search.max_buckets": 65535,
    "search.low_level_cancellation": true,
    "search_backpressure": {
      "node_duress": {
        "heap_threshold": 0.5,
        "num_successive_breaches": 1
      }
    }
...

GET /_tasks?actions=*search&detailed
...
"action": "indices:data/read/search",
"start_time_in_millis": 1724420308855,
"running_time_in_nanos": 12680749959,
"cancellable": true,
"cancelled": true,
"cancellation_time_millis": 1724420311870 <--- this is 3s after the start_time_in_millis and it never gets killed
...
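
The timing in this task output can be checked with a little arithmetic. This is only a sketch using the three values shown above; it assumes `running_time_in_nanos` was sampled at the moment the task list was requested.

```python
# Values copied from the _tasks output above.
start_time_ms = 1724420308855
cancellation_time_ms = 1724420311870
running_time_ns = 12680749959

cancel_delay_s = (cancellation_time_ms - start_time_ms) / 1000
running_time_s = running_time_ns / 1e9
overrun_s = running_time_s - cancel_delay_s

print(f"cancelled after {cancel_delay_s:.3f}s")        # cancelled after 3.015s
print(f"still running {overrun_s:.3f}s after cancel")  # still running 9.666s after cancel
```

So the task was marked as cancelled right at the 3s limit, but it was still executing almost ten seconds later.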

For example, it runs for 2-3 minutes before crashing the data nodes:

GET /_cat/tasks?v
...
indices:data/read/search                     RldgtOhvQU69uOSumtdnRA:48608   -                            transport 1724421475326 13:57:55  50.6s    XXX.XXXX.129.208 XXXX-monit-backup1_client5
indices:data/read/search[phase/query]        le60EDYsQB-tyYYjXC8nYw:2830    RldgtOhvQU69uOSumtdnRA:48608 transport 1724421475349 13:57:55  50.6s    XXX.XXXX.128.25  XXXX-monit-backup1_data4
indices:data/read/search[phase/query]        AQ3e1uc9S1W-Hv6fx1NYLA:3607    RldgtOhvQU69uOSumtdnRA:48608 transport 1724421475350 13:57:55  50.6s    XXXX.XXX.129.208 XXXX-monit-backup1_data3 

If you try to cancel the tasks manually with POST _tasks/<node_id>:<task_id>/_cancel, the cluster simply ignores the request.
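
For reference, a minimal sketch of that manual cancellation call in Python. The task ID is one of the shard tasks from the _cat/tasks output above; the base URL is a placeholder, and a cluster with the security plugin enabled would also need TLS and credentials, which are left out here.

```python
import urllib.request


def build_cancel_request(base_url: str, task_id: str) -> urllib.request.Request:
    """Build (but do not send) POST /_tasks/<task_id>/_cancel."""
    url = f"{base_url.rstrip('/')}/_tasks/{task_id}/_cancel"
    return urllib.request.Request(url, method="POST")


# Task ID taken from the _cat/tasks output above; the base URL is a placeholder.
req = build_cancel_request("http://localhost:9200", "le60EDYsQB-tyYYjXC8nYw:2830")
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```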

...
[2024-08-22T15:56:12,212][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 64%
...
[2024-08-22T15:56:16,416][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 76%
...
[2024-08-22T15:56:18,418][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 82%
[2024-08-22T15:56:18,690][DEBUG][o.o.s.b.t.HeapUsageTracker] [osabackup101-monit-backup1_data2] heap usage not dominated by search requests [0/4992899481]

-----> backpressure cancelled the tasks, but it didn't make a difference here
[2024-08-22T15:56:18,692][WARN ][o.o.s.b.SearchBackpressureService] [osabackup101-monit-backup1_data2] [enforced mode] cancelling task [2269] due to high resource consumption [cpu usage exceeded [1.6m >= 15s], elapsed time exceeded [1.7m >= 30s]]
[2024-08-22T15:56:18,693][WARN ][o.o.s.b.SearchBackpressureService] [osabackup101-monit-backup1_data2] [enforced mode] cancelling task [2270] due to high resource consumption [elapsed time exceeded [1.7m >= 30s]]

[2024-08-22T15:56:18,996][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 83%
...
[2024-08-22T15:56:22,699][DEBUG][o.o.s.b.t.HeapUsageTracker] [osabackup101-monit-backup1_data2] heap usage not dominated by search requests [0/4992899481]
[2024-08-22T15:56:22,998][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 94%
[2024-08-22T15:56:23,399][INFO ][o.o.i.b.HierarchyCircuitBreakerService] [osabackup101-monit-backup1_data2] attempting to trigger G1GC due to high heap usage [31819582928]
[2024-08-22T15:56:23,512][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 95%
...
[2024-08-22T15:56:24,513][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osabackup101-monit-backup1_data2] Recording memory usage: 99%
...
java.lang.OutOfMemoryError: Java heap space
java.lang.OutOfMemoryError: Java heap space

We've run out of ideas, so please let us know if there's something really missing from OpenSearch or if you have any other suggestions to try. We would appreciate it! 😄

Related component

Search:Aggregations

To Reproduce

  1. Run a query with several nested levels of terms aggregations on high-cardinality fields, each with a very large size
  2. Wait for the data nodes to run out of memory (OOM)

Expected behavior

Additional Details

Plugins: opensearch-alerting, opensearch-anomaly-detection, opensearch-asynchronous-search, opensearch-cross-cluster-replication, opensearch-custom-codecs, opensearch-flow-framework, opensearch-geospatial, opensearch-index-management, opensearch-job-scheduler, opensearch-knn, opensearch-ml, opensearch-neural-search, opensearch-notifications, opensearch-notifications-core, opensearch-observability, opensearch-performance-analyzer, opensearch-reports-scheduler, opensearch-security, opensearch-security-analytics, opensearch-skills, opensearch-sql, repository-s3

Host/Environment:

sandeshkr419 commented 2 weeks ago

Search Meetup Triage: @jainankitk / @sgup432 Do you have some context on this?

@Pigueiras2 Did you also try specifying a total bucket count (lower than the default) as well?

Pigueiras commented 2 weeks ago

Did you also try specifying a total bucket count (lower than the default) as well?

Do you mean changing search.max_buckets? I tried setting it to 10k, but I didn’t notice any difference. According to this comment, that limit might not be reached because it is only taken into account in the reduce phase. If the aggregation is small enough and OpenSearch can compute it, I see an error about my query failing because it hit the maximum number of buckets. I also found this issue, which made me think there is a breaker to protect against such queries, but I haven’t seen it being triggered in my cluster.

kkhatua commented 2 weeks ago

@Pigueiras / @Pigueiras2 Can you capture and provide a couple of histograms of the heap? Ideally, Search Backpressure should have caught it, unless there is an allocation being made elsewhere.

Pigueiras commented 1 week ago

@kkhatua

I sent the query to my cluster at Sat Aug 31 11:45:57 PM CEST 2024; one of the nodes crashed at 23:48:55,614 and the other one at 23:49:00,778.

This is what the heap reported by _nodes/stats looked like (in addition to the reported CPU and backpressure stats). The last panel shows the memory consumed by the search tasks, as reported by the tasks API:

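[image: dashboard panels with heap usage, CPU, search backpressure stats, and memory consumed by search tasks]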

Logs of one of the datanodes before crashing (I see zero entries about "o.o.s.b.SearchBackpressureService" in the cluster logs):

…
[2024-08-31T23:48:50,959][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osbbackup101-monit-backup1_data2] Recording memory usage: 99%
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33031019058/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [602/602b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/nodes/info[n]] would be [33031028310/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [9854/9.6kb], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,960][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33031019082/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [626/626b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,959][DEBUG][o.o.n.r.t.AverageCpuUsageTracker] [osbbackup101-monit-backup1_data2] Recording cpu usage: 38%
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/nodes/stats[n]] would be [33031025000/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [6544/6.3kb], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/nodes/info[n]] would be [33031028310/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [9854/9.6kb], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,959][WARN ][o.o.m.j.JvmGcMonitorService] [osbbackup101-monit-backup1_data2] [gc][667] overhead, spent [891ms] collecting in the last [1.1s]
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33031019042/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [586/586b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/tasks/lists[n]] would be [33031018534/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [78/78b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,959][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33031019082/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [626/626b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=31500/30.7kb]
[2024-08-31T23:48:50,960][DEBUG][o.o.t.TransportService   ] [osbbackup101-monit-backup1_data2] Action: internal:coordination/fault_detection/leader_check
[2024-08-31T23:48:50,961][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/nodes/stats[n]] would be [33031028832/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [10376/10.1kb], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=12546/12.2kb]
[2024-08-31T23:48:50,961][DEBUG][o.o.t.TaskManager        ] [osbbackup101-monit-backup1_data2] Refreshing resource stats for Task: 6169
[2024-08-31T23:48:50,962][DEBUG][o.o.c.t.r.ResourceUsageInfo] [osbbackup101-monit-backup1_data2] updated resource usage info [resource_stats=[memory_in_bytes], old_end_value=30732949552, new_end_value=30924956520]
[2024-08-31T23:48:50,962][DEBUG][o.o.c.t.r.ResourceUsageInfo] [osbbackup101-monit-backup1_data2] updated resource usage info [resource_stats=[cpu_time_in_nanos], old_end_value=166516903330, new_end_value=166730650043]
[2024-08-31T23:48:50,962][DEBUG][o.o.s.b.t.HeapUsageTracker] [osbbackup101-monit-backup1_data2] heap usage not dominated by search requests [0/4992899481]
[2024-08-31T23:48:50,970][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:monitor/stats[n]] would be [33031061228/30.7gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33031018456/30.7gb], new bytes reserved: [42772/41.7kb], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=44942/43.8kb]
[2024-08-31T23:48:51,056][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33081350690/30.8gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33081350104/30.8gb], new bytes reserved: [586/586b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=2756/2.6kb]
[2024-08-31T23:48:51,141][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33165236810/30.8gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33165236184/30.8gb], new bytes reserved: [626/626b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=2796/2.7kb]
[2024-08-31T23:48:51,953][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osbbackup101-monit-backup1_data2] Recording memory usage: 99%
[2024-08-31T23:48:51,954][DEBUG][o.o.n.r.t.AverageCpuUsageTracker] [osbbackup101-monit-backup1_data2] Recording cpu usage: 44%
[2024-08-31T23:48:51,954][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990626/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [626/626b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=15560/15.1kb]
[2024-08-31T23:48:51,954][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990598/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [598/598b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=15560/15.1kb]
[2024-08-31T23:48:51,954][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990590/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [590/590b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=15560/15.1kb]
[2024-08-31T23:48:51,954][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990602/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [602/602b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=15560/15.1kb]
[2024-08-31T23:48:51,954][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990598/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [598/598b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=15560/15.1kb]
[2024-08-31T23:48:51,954][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/nodes/stats[n]] would be [33198000376/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [10376/10.1kb], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=15560/15.1kb]
[2024-08-31T23:48:51,956][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990586/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [586/586b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=3382/3.3kb]
[2024-08-31T23:48:51,956][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [indices:admin/seq_no/retention_lease_background_sync[r]] would be [33197990626/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [626/626b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=3382/3.3kb]
[2024-08-31T23:48:51,960][WARN ][o.o.m.j.JvmGcMonitorService] [osbbackup101-monit-backup1_data2] [gc][668] overhead, spent [803ms] collecting in the last [1s]
[2024-08-31T23:48:51,961][DEBUG][o.o.t.TransportService   ] [osbbackup101-monit-backup1_data2] Action: internal:coordination/fault_detection/leader_check
[2024-08-31T23:48:51,963][DEBUG][o.o.t.TaskManager        ] [osbbackup101-monit-backup1_data2] Refreshing resource stats for Task: 6169
[2024-08-31T23:48:51,963][DEBUG][o.o.c.t.r.ResourceUsageInfo] [osbbackup101-monit-backup1_data2] updated resource usage info [resource_stats=[memory_in_bytes], old_end_value=30924956520, new_end_value=31084986240]
[2024-08-31T23:48:51,963][DEBUG][o.o.c.t.r.ResourceUsageInfo] [osbbackup101-monit-backup1_data2] updated resource usage info [resource_stats=[cpu_time_in_nanos], old_end_value=166730650043, new_end_value=166928829066]
[2024-08-31T23:48:51,963][DEBUG][o.o.s.b.t.HeapUsageTracker] [osbbackup101-monit-backup1_data2] heap usage not dominated by search requests [0/4992899481]
[2024-08-31T23:48:51,967][DEBUG][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] [parent] Data too large, data for [cluster:monitor/tasks/lists[n]] would be [33197990096/30.9gb], which is larger than the limit of [31621696716/29.4gb], real usage: [33197990000/30.9gb], new bytes reserved: [96/96b], usages [request=90146464/85.9mb, fielddata=103263/100.8kb, in_flight_requests=2266/2.2kb]
[2024-08-31T23:48:55,517][INFO ][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] attempting to trigger G1GC due to high heap usage [31729041960]
[2024-08-31T23:48:55,517][DEBUG][o.o.n.r.t.AverageMemoryUsageTracker] [osbbackup101-monit-backup1_data2] Recording memory usage: 95%
[2024-08-31T23:48:55,609][DEBUG][o.o.n.r.t.AverageCpuUsageTracker] [osbbackup101-monit-backup1_data2] Recording cpu usage: 59%
[2024-08-31T23:48:55,609][WARN ][o.o.m.j.JvmGcMonitorService] [osbbackup101-monit-backup1_data2] [gc][669] overhead, spent [3.5s] collecting in the last [3.5s]
[2024-08-31T23:48:55,610][DEBUG][o.o.t.TransportService   ] [osbbackup101-monit-backup1_data2] Action: internal:coordination/fault_detection/leader_check
[2024-08-31T23:48:55,610][INFO ][o.o.i.b.HierarchyCircuitBreakerService] [osbbackup101-monit-backup1_data2] GC did bring memory usage down, before [31729041960], after [832549984], allocations [1], duration [93]
[2024-08-31T23:48:55,612][DEBUG][o.o.t.TaskManager        ] [osbbackup101-monit-backup1_data2] Task execution finished on thread. Task: 6169, Thread: 301
[2024-08-31T23:48:55,612][DEBUG][o.o.c.t.r.ResourceUsageInfo] [osbbackup101-monit-backup1_data2] updated resource usage info [resource_stats=[memory_in_bytes], old_end_value=31084986240, new_end_value=31093069072]
[2024-08-31T23:48:55,612][DEBUG][o.o.c.t.r.ResourceUsageInfo] [osbbackup101-monit-backup1_data2] updated resource usage info [resource_stats=[cpu_time_in_nanos], old_end_value=166928829066, new_end_value=166944765687]
[2024-08-31T23:48:55,614][ERROR][o.o.b.OpenSearchUncaughtExceptionHandler] [osbbackup101-monit-backup1_data2] fatal error in thread [opensearch[osbbackup101-monit-backup1_data2][search][T#3]], exiting
java.lang.OutOfMemoryError: Java heap space
    at java.base/java.util.ArrayList.<init>(ArrayList.java:156) ~[?:?]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildAggregationsForVariableBuckets(BucketsAggregator.java:411) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.histogram.DateHistogramAggregator.buildAggregations(DateHistogramAggregator.java:208) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:220) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:286) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$400(GlobalOrdinalsStringTermsAggregator.java:90) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:900) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:847) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:762) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:316) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:220) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:286) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$400(GlobalOrdinalsStringTermsAggregator.java:90) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:900) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:847) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:762) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:316) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:220) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:286) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.access$400(GlobalOrdinalsStringTermsAggregator.java:90) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:900) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$StandardTermsResults.buildSubAggs(GlobalOrdinalsStringTermsAggregator.java:847) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator$ResultStrategy.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:762) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.bucket.terms.GlobalOrdinalsStringTermsAggregator.buildAggregations(GlobalOrdinalsStringTermsAggregator.java:316) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.Aggregator.buildTopLevel(Aggregator.java:205) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.aggregations.BucketCollectorProcessor.processPostCollection(BucketCollectorProcessor.java:78) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:286) ~[opensearch-2.15.0.jar:2.15.0]
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:552) ~[lucene-core-9.10.0.jar:9.10.0 695c0ac84508438302cd346a812cfa2fdc5a10df - 2024-02-14 16:48:06]
    at org.opensearch.search.query.QueryPhase.searchWithCollector(QueryPhase.java:355) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWithCollector(QueryPhase.java:462) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWithCollector(QueryPhase.java:450) ~[opensearch-2.15.0.jar:2.15.0]
    at org.opensearch.search.query.QueryPhase$DefaultQueryPhaseSearcher.searchWith(QueryPhase.java:432) ~[opensearch-2.15.0.jar:2.15.0]

These are the current settings of the cluster (in case you see anything wrong with them):

GET /_cluster/settings
{
  "persistent": {
    "plugins": {
      "index_state_management": {
        "metadata_migration": {
          "status": "1"
        },
        "template_migration": {
          "control": "-1"
        }
      }
    },
    "search": {
      "default_search_timeout": "5s",
      "max_buckets": "10000",
      "cancel_after_time_interval": "5s"
    },
    "search_backpressure": {
      "mode": "enforced",
      "node_duress": {
        "cpu_threshold": "0.9",
        "heap_threshold": "0.75",
        "num_successive_breaches": "3"
      },
      "search_shard_task": {
        "elapsed_time_millis_threshold": "30000",
        "heap_variance": "2.0",
        "heap_percent_threshold": "0.10",
        "cancellation_burst": "10.0",
        "cpu_time_millis_threshold": "15000",
        "cancellation_ratio": "0.1",
        "cancellation_rate": "0.003",
        "total_heap_percent_threshold": "0.15",
        "heap_moving_average_window_size": "100"
      },
      "search_task": {
        "elapsed_time_millis_threshold": "45000",
        "heap_variance": "2.0",
        "heap_percent_threshold": "0.10",
        "cancellation_burst": "5.0",
        "cpu_time_millis_threshold": "30000",
        "cancellation_ratio": "0.1",
        "cancellation_rate": "0.003",
        "total_heap_percent_threshold": "0.15",
        "heap_moving_average_window_size": "100"
      }
    }
  },
  "transient": {
    "search": {
      "default_search_timeout": "3s",
      "max_buckets": "65535",
      "low_level_cancellation": "true",
      "cancel_after_time_interval": "3s"
    },
    "search_backpressure": {
      "mode": "enforced",
      "node_duress": {
        "heap_threshold": "0.7",
        "num_successive_breaches": "1"
      }
    }
  }
}
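
Two numbers in the logs above can be reproduced from these settings, which helps in reading them. This is a sketch under two assumptions that are not stated explicitly in this thread: that "31 GB" means a heap of exactly 31 GiB, and that the parent circuit breaker sits at 95% of the heap (the 70% `indices.breaker.total.limit` from the first settings listing does not appear in this dump).

```python
HEAP_BYTES = 31 * 1024**3  # assumption: a 31 GiB heap

# search_backpressure.*.total_heap_percent_threshold = 0.15
backpressure_threshold = HEAP_BYTES * 15 // 100
print(backpressure_threshold)  # 4992899481, the right-hand side of "[0/4992899481]"

# Assumed parent breaker limit of 95% of the heap.
parent_breaker_limit = HEAP_BYTES * 95 // 100
print(parent_breaker_limit)  # 31621696716, the "limit of [31621696716/29.4gb]"

# Heap usage the search tasks are credited with in the HeapUsageTracker line.
tracked_search_heap = 0
print(tracked_search_heap >= backpressure_threshold)  # False
```

If this reading is right, "heap usage not dominated by search requests [0/4992899481]" compares the heap attributed to search tasks (0) against the 15% threshold, so the heap-based cancellation condition does not fire even while the node reports 99% memory usage.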

This is what _cat/tasks reported during the query execution (both data nodes crashed):

indices:data/read/search                     7QMJAF3gTqmuFEXk4advYA:3878353  -                              transport 1725140757616 21:45:57 2.7m         oscbackup101-monit-backup1_client5
indices:data/read/search[phase/query]        krMR7qVATei7PSDxMVHm1Q:18970    7QMJAF3gTqmuFEXk4advYA:3878353 transport 1725140757631 21:45:57 2.7m         osabackup101-monit-backup1_data1
indices:data/read/search[phase/query]        QQ-NoPkXSgOhN2CRJcA_IQ:6169     7QMJAF3gTqmuFEXk4advYA:3878353 transport 1725140757639 21:45:57 2.7m         osbbackup101-monit-backup1_data2

If you want me to add any extra information or test anything else, let me know.

kkhatua commented 1 week ago

This is odd. There might be nodestats for search backpressure that you can also share

curl -X GET "localhost:9200/_nodes/stats/search_backpressure?pretty&human"

The output would be something like this for both search (coordinator) and shard tasks:

     "search_backpressure" : {
        "search_task" : {
          "resource_tracker_stats" : {
            "elapsed_time_tracker" : {
              "cancellation_count" : 0,
              "current_max" : "0s",
              "current_max_millis" : 0,
              "current_avg" : "0s",
              "current_avg_millis" : 0
            },
            "heap_usage_tracker" : {
              "cancellation_count" : 0,
              "current_max" : "0b",
              "current_max_bytes" : 0,
              "current_avg" : "0b",
              "current_avg_bytes" : 0,
              "rolling_avg" : "728.8kb",
              "rolling_avg_bytes" : 746360
            },
            "cpu_usage_tracker" : {
              "cancellation_count" : 0,
              "current_max" : "0s",
              "current_max_millis" : 0,
              "current_avg" : "0s",
              "current_avg_millis" : 0
            }
          },
          "cancellation_stats" : {
            "cancellation_count" : 0,
            "cancelled_task_percentage" : 0.0,
            "cancellation_limit_reached_count" : 0,
            "current_cancellation_eligible_tasks_count" : 0
          }
        },
        "search_shard_task" : {
...
        }
     }
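
When comparing several nodes, it can help to reduce this output to the cancellation counters only. A minimal sketch, assuming the full response nests each node under `nodes.<node_id>.search_backpressure` as node stats responses normally do; the node id `node-1` and the helper name `cancellation_summary` are made up for illustration.

```python
# Trimmed sample shaped like the output above; "node-1" is a placeholder id.
SAMPLE = {
    "nodes": {
        "node-1": {
            "search_backpressure": {
                "search_task": {
                    "resource_tracker_stats": {
                        "elapsed_time_tracker": {"cancellation_count": 0},
                        "heap_usage_tracker": {"cancellation_count": 0},
                        "cpu_usage_tracker": {"cancellation_count": 0},
                    },
                    "cancellation_stats": {
                        "cancellation_count": 0,
                        "cancellation_limit_reached_count": 0,
                    },
                }
            }
        }
    }
}


def cancellation_summary(nodes_stats):
    """Map (node_id, task_type) to total and per-tracker cancellation counts."""
    summary = {}
    for node_id, node in nodes_stats.get("nodes", {}).items():
        backpressure = node.get("search_backpressure", {})
        for task_type in ("search_task", "search_shard_task"):
            task_stats = backpressure.get(task_type)
            if task_stats is None:  # section missing or elided
                continue
            trackers = task_stats.get("resource_tracker_stats", {})
            cancel = task_stats.get("cancellation_stats", {})
            summary[(node_id, task_type)] = {
                "total": cancel.get("cancellation_count", 0),
                "limit_reached": cancel.get("cancellation_limit_reached_count", 0),
                "by_tracker": {
                    name: stats.get("cancellation_count", 0)
                    for name, stats in trackers.items()
                },
            }
    return summary


if __name__ == "__main__":
    for key, value in cancellation_summary(SAMPLE).items():
        print(key, value)
```

A non-zero `limit_reached` alongside a low `total` would point towards the self-throttling mentioned below; all zeros would suggest the service never tried to cancel anything.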

One possibility is that the task cancellation itself is self-throttling, and you will need to tinker with those values to avoid throttling. (Ref: https://opensearch.org/docs/2.15/tuning-your-cluster/availability-and-recovery/search-backpressure/ ) In the meantime, if there is an allocation being made outside of the tasks or something that the resource tracking framework isn't able to measure through the tasks, we might need to inspect some dumps of the histogram.

Could you capture and share the histogram dumps??

Taking multiple samples will reveal which objects are rapidly growing in count and hogging the memory. The failed allocations at the time of the OOME are more a symptom of the available heap already being exhausted than the cause.
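
A minimal sketch of how two such samples could be compared, assuming they are class histograms in the usual JVM text layout (rank, instance count, bytes, class name), as printed by tools like `jmap -histo` or `jcmd <pid> GC.class_histogram`. The two samples below use invented numbers purely to exercise the code; `parse_histogram` and `fastest_growing` are hypothetical helper names.

```python
import re

# Matches rows such as "   1:   1000   8000000  [Ljava.lang.Object; (java.base@21.0.3)".
_ROW = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")

# Invented numbers, for illustration only.
EARLIER = """\
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:          1000        8000000  [Ljava.lang.Object; (java.base@21.0.3)
   2:          5000         400000  [B (java.base@21.0.3)
   3:           200           8000  org.example.Steady
"""

LATER = """\
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:          1500       64000000  [Ljava.lang.Object; (java.base@21.0.3)
   2:          5200         416000  [B (java.base@21.0.3)
   3:           200           8000  org.example.Steady
"""


def parse_histogram(text):
    """Return {class_name: (instances, bytes)}; header lines are skipped."""
    rows = {}
    for line in text.splitlines():
        match = _ROW.match(line)
        if match:
            rows[match.group(3)] = (int(match.group(1)), int(match.group(2)))
    return rows


def fastest_growing(before, after, top=5):
    """Return (byte_growth, instance_growth, class_name), largest growth first."""
    growth = []
    for cls, (count, size) in after.items():
        old_count, old_size = before.get(cls, (0, 0))
        growth.append((size - old_size, count - old_count, cls))
    growth.sort(reverse=True)
    return growth[:top]


if __name__ == "__main__":
    before, after = parse_histogram(EARLIER), parse_histogram(LATER)
    for byte_delta, count_delta, cls in fastest_growing(before, after, top=2):
        print(f"{cls}: +{byte_delta} bytes, +{count_delta} instances")
```

Sorting by byte growth rather than instance growth is a deliberate choice here: a few very large arrays can exhaust the heap while barely changing the instance count.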

Also, I'm assuming you are not running any painless scripts.

Pigueiras commented 1 week ago

First of all, thanks a lot for taking the time to answer. It's really appreciated 😄

This is odd. There might be nodestats for search backpressure that you can also share curl -X GET "localhost:9200/_nodes/stats/search_backpressure?pretty&human"

Yes, I’m plotting search_task.heap_usage.search(_shard)_task.current_avg_bytes here. I believe these are the relevant metrics in this case (if you want another metric let me know).

[Image: plot of the heap usage metrics described above]

One possibility is that the task cancellation itself is self-throttling, and you will need to tinker with those values to avoid throttling. (Ref: https://opensearch.org/docs/2.15/tuning-your-cluster/availability-and-recovery/search-backpressure/ )

Does the way search backpressure cancels a task differ from me calling _tasks/<id>/cancel directly or using search.cancel_after_time_interval? I'm trying to cancel it a couple of seconds after sending it, with no effect. Can throttling really have such an impact that the cancellation gets "ignored" for the whole ~3 minutes the request takes to drive a node into an OOME?
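
One way a cancel request can look "ignored" is when cancellation is cooperative: the request only sets a flag, and the running work stops only where it checks that flag. The toy model below illustrates that idea in plain Python. It is not OpenSearch code, every name in it is hypothetical, and whether something like this is what happens in this issue is exactly the open question.

```python
class TaskCancelled(Exception):
    """Raised when a task notices that it has been cancelled."""


class ToyTask:
    def __init__(self):
        self.cancelled = False

    def cancel(self):
        # A cancel request only flips a flag; it does not interrupt anything.
        self.cancelled = True

    def check(self):
        if self.cancelled:
            raise TaskCancelled()


def run(task, docs, cancel_at, check_in_build):
    """Two-phase toy search: collect documents, then build result buckets.

    The collect phase checks for cancellation on every document. The build
    phase checks only when check_in_build is True. The cancel request is
    simulated as arriving at index cancel_at of the build phase.
    """
    collected = []
    for doc in docs:
        task.check()
        collected.append(doc)

    buckets = []
    for i, doc in enumerate(collected):
        if i == cancel_at:
            task.cancel()  # external cancel arrives in the middle of the build
        if check_in_build:
            task.check()
        buckets.append(doc * 2)
    return buckets


if __name__ == "__main__":
    # Without a check in the build phase, the cancel has no visible effect.
    print(len(run(ToyTask(), range(10), cancel_at=3, check_in_build=False)))

    # With a check, the same cancel stops the task right away.
    try:
        run(ToyTask(), range(10), cancel_at=3, check_in_build=True)
    except TaskCancelled:
        print("cancelled during build")
```

In this model the cancel mechanism itself is identical in both runs; only the placement of the checks decides whether it takes effect.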

As for tinkering with the backpressure values, I've also tried:

...
  "transient": {
    "search": {
       ...
    },
    "search_backpressure": {
      "mode": "enforced",
      "node_duress": {
        "cpu_threshold": "0.1",
        "heap_threshold": "0.3",
        "num_successive_breaches": "1"
      }
    }
  }
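
For cross-checking against the documentation, the nested block above corresponds to dotted setting names such as `search_backpressure.node_duress.heap_threshold`. A minimal Python sketch of that flattening, covering only the `search_backpressure` part shown above (the elided `search` section is left out); `flatten` is a hypothetical helper, not an OpenSearch API.

```python
# The search_backpressure values from the transient block above.
TRANSIENT = {
    "search_backpressure": {
        "mode": "enforced",
        "node_duress": {
            "cpu_threshold": "0.1",
            "heap_threshold": "0.3",
            "num_successive_breaches": "1",
        },
    }
}


def flatten(settings, prefix=""):
    """Turn nested settings into a flat {dotted.name: value} dict."""
    flat = {}
    for key, value in settings.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


if __name__ == "__main__":
    for name, value in sorted(flatten(TRANSIENT).items()):
        print(f"{name} = {value}")
```

The same flat view should also be obtainable from the cluster directly by adding `flat_settings=true` to the cluster settings request.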

And I don't even see the message about the search backpressure service trying to kill a task (with those values the node should be under duress for about 30-40 seconds, and the task should be killed for either elapsed time or heap usage).

Could you capture and share the histogram dumps??

This one is right before crashing. Does it provide the information you were looking for?

 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:        113733    30414943824  [Ljava.lang.Object; (java.base@21.0.3)
   2:         14525     1464952168  [Ljdk.internal.vm.FillerElement; (java.base@21.0.3)
   3:       4072653      220573688  [B (java.base@21.0.3)
   4:        469391      140360848  [J (java.base@21.0.3)
   5:       1997093       79883720  org.opensearch.search.aggregations.metrics.InternalMax
   6:       2674789       64194936  java.lang.String (java.base@21.0.3)
   7:       1997093       63906976  org.opensearch.search.aggregations.bucket.BucketsAggregator$1
   8:       1885023       60320736  java.util.HashMap$Node (java.base@21.0.3)
   9:       1214525       48581000  java.util.TreeMap$Entry (java.base@21.0.3)
  10:       1997094       47930256  org.opensearch.search.aggregations.InternalAggregations
  11:        718325       28733000  org.opensearch.search.aggregations.bucket.histogram.InternalDateHistogram$Bucket
  12:        343770       24751440  org.apache.lucene.index.FieldInfo
  13:        383223       24526272  org.apache.lucene.store.MemorySegmentIndexInput$SingleSegmentImpl
  14:        880327       21127848  org.apache.lucene.util.BytesRef
  15:        134288       19757208  [Ljava.util.HashMap$Node; (java.base@21.0.3)
  16:        216532       15590304  org.apache.lucene.codecs.lucene90.blocktree.FieldReader
  17:        383223       15328920  jdk.internal.foreign.MappedMemorySegmentImpl (java.base@21.0.3)
  18:        279573       13419504  java.util.HashMap (java.base@21.0.3)
  19:        257387       12354576  java.util.TreeMap (java.base@21.0.3)
  20:        109995       12319440  org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$SortedNumericEntry
  21:        106744       11101376  org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$TermsDictEntry
  22:        119676       10531488  org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$NumericEntry
  23:        216535       10393680  org.apache.lucene.util.fst.FST$FSTMetadata
  24:        321251       10280032  java.util.Collections$UnmodifiableMap (java.base@21.0.3)
  25:        383223        9197352  [Ljava.lang.foreign.MemorySegment; (java.base@21.0.3)
  26:        222402        8896080  org.apache.lucene.util.packed.DirectMonotonicReader$Meta
  27:             2        8032936  [Lorg.opensearch.search.aggregations.InternalAggregation;
  28:             1        7988392  [Lorg.opensearch.search.aggregations.InternalAggregations;
  29:        112958        7229312  org.apache.lucene.util.bkd.BKDReader
  30:        221655        7092960  java.util.concurrent.atomic.LongAdder (java.base@21.0.3)
  31:        216532        6929024  org.apache.lucene.util.fst.OffHeapFSTStore
  32:         20819        6242176  [I (java.base@21.0.3)
  33:        222410        5575304  [F (java.base@21.0.3)
  34:        216535        5196840  org.apache.lucene.util.fst.FST
  35:        104606        5021088  org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$NormsEntry
  36:        112958        4518320  org.apache.lucene.util.bkd.BKDConfig
  37:         29772        3520144  java.lang.Class (java.base@21.0.3)
  38:        138383        3321192  org.opensearch.common.util.concurrent.ReleasableLock
  39:         69060        3314880  java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync (java.base@21.0.3)
  40:        102032        3265024  java.util.concurrent.ConcurrentHashMap$Node (java.base@21.0.3)
  41:          8742        2912880  [Lorg.apache.lucene.index.FieldInfo;
  42:         26206        2725424  org.apache.lucene.index.SegmentCommitInfo
  43:         32220        2595776  [Ljava.util.WeakHashMap$Entry; (java.base@21.0.3)
  44:        106744        2561856  org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$SortedSetEntry
  45:        106542        2557008  org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$SortedEntry
  46:         68352        2187264  org.opensearch.common.cache.Cache$CacheSegment
  47:         62112        1987584  org.apache.lucene.codecs.lucene90.Lucene90CompoundReader$FileEntry
  48:         14350        1758416  [C (java.base@21.0.3)
  49:         52263        1672416  java.util.concurrent.locks.ReentrantLock$NonfairSync (java.base@21.0.3)
  50:         68829        1651896  java.util.concurrent.locks.ReentrantReadWriteLock (java.base@21.0.3)
  51:         68352        1640448  org.opensearch.common.cache.Cache$CacheSegment$SegmentStats
  52:         28412        1591072  org.apache.lucene.document.FieldType
  53:         32170        1544160  java.util.WeakHashMap (java.base@21.0.3)
  54:         86110        1377760  java.lang.Object (java.base@21.0.3)
  55:         32209        1288360  java.util.LinkedHashMap$Entry (java.base@21.0.3)
  56:         50491        1211784  java.util.ArrayList (java.base@21.0.3)
  57:         70371        1125936  java.lang.ThreadLocal (java.base@21.0.3)
  58:         69064        1105024  java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock (java.base@21.0.3)
  59:         69064        1105024  java.util.concurrent.locks.ReentrantReadWriteLock$Sync$ThreadLocalHoldCounter (java.base@21.0.3)
  60:         69064        1105024  java.util.concurrent.locks.ReentrantReadWriteLock$WriteLock (java.base@21.0.3)
  61:         32264        1032448  java.lang.ref.ReferenceQueue (java.base@21.0.3)
  62:          2361        1002640  [Ljava.util.concurrent.ConcurrentHashMap$Node; (java.base@21.0.3)
  63:         19197         921456  java.lang.invoke.MemberName (java.base@21.0.3)
  64:         56745         907920  java.util.concurrent.atomic.AtomicInteger (java.base@21.0.3)
  65:         27571         882272  org.apache.lucene.util.Version
  66:         13251         848064  java.util.LinkedHashMap (java.base@21.0.3)
  67:         13062         835968  org.apache.lucene.index.SegmentInfo
  68:         52219         835504  java.util.concurrent.locks.ReentrantLock (java.base@21.0.3)
  69:         32869         788856  java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject (java.base@21.0.3)
  70:         48434         774944  java.util.HashSet (java.base@21.0.3)
  71:         10499         755928  io.netty.buffer.PoolSubpage
  72:         18755         750200  java.util.WeakHashMap$Entry (java.base@21.0.3)
  73:         29772         714528  java.util.Collections$UnmodifiableRandomAccessList (java.base@21.0.3)
  74:          8000         704000  java.lang.reflect.Method (java.base@21.0.3)
  75:         12553         702968  org.opensearch.search.aggregations.bucket.terms.StringTerms$Bucket
  76:         20610         659520  org.opensearch.common.settings.Setting$Updater
  77:         13355         641040  org.apache.lucene.index.LeafReaderContext
  78:         38732         619712  java.util.HashMap$Values (java.base@21.0.3)
  79:         25612         614688  org.apache.lucene.util.packed.DirectReader$DirectPackedReader8
  80:         35411         566576  java.util.HashMap$KeySet (java.base@21.0.3)
  81:         22697         544728  org.apache.lucene.util.packed.DirectReader$DirectPackedReader20
  82:         33719         539504  java.util.Collections$UnmodifiableCollection (java.base@21.0.3)
  83:         12386         495440  java.lang.invoke.MethodType (java.base@21.0.3)
  84:         30159         482544  java.util.TreeMap$EntrySet (java.base@21.0.3)
  85:         19851         476424  org.apache.lucene.util.FileDeleter$RefCount
  86:         11404         456160  org.opensearch.index.analysis.NamedAnalyzer
  87:          9468         454464  org.opensearch.painless.lookup.PainlessClass
  88:         13998         447936  java.util.ImmutableCollections$Map1 (java.base@21.0.3)
  89:          8779         444256  [Lorg.apache.lucene.util.LongValues;
  90:          8770         420960  org.apache.lucene.util.packed.DirectMonotonicReader
  91:         13104         419328  java.util.ImmutableCollections$MapN (java.base@21.0.3)
  92:         10329         413160  java.io.FileDescriptor (java.base@21.0.3)
  93:         12809         409888  java.lang.invoke.MethodType$ConcurrentWeakInternSet$WeakEntry (java.base@21.0.3)
  94:         16546         397104  org.apache.lucene.index.FieldInfos$FieldDimensions
  95:         16546         397104  org.apache.lucene.index.FieldInfos$FieldVectorProperties
  96:         14231         393152  [Ljava.lang.Class; (java.base@21.0.3)
  97:         16140         387360  org.apache.logging.log4j.message.ReusableMessageFactory
  98:          4356         383328  org.apache.lucene.codecs.lucene90.compressing.FieldsIndexReader
  99:          4356         383328  org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsReader
 100:         11941         382112  java.lang.invoke.LambdaForm$Name (java.base@21.0.3)
 101:          5186         373392  org.opensearch.index.mapper.TextFieldMapper
 102:          2733         371688  org.opensearch.cluster.metadata.IndexMetadata
 103:         15016         360384  java.util.Collections$SingletonList (java.base@21.0.3)
 104:          5610         359040  java.util.concurrent.ConcurrentHashMap (java.base@21.0.3)
 105:          4437         354960  org.apache.lucene.index.SegmentReader
 106:          5487         351168  org.opensearch.cluster.routing.ShardRouting
 107:           640         348160  io.netty.util.internal.shaded.org.jctools.queues.atomic.MpscAtomicArrayQueue
 108:          7149         343152  sun.nio.ch.FileChannelImpl$DefaultUnmapper (java.base@21.0.3)
 109:          5263         336832  org.opensearch.index.mapper.KeywordFieldMapper
 110:         10452         334464  org.opensearch.index.mapper.TextSearchInfo
 111:          1253         331312  [Z (java.base@21.0.3)
 112:         13800         331200  java.util.Collections$SynchronizedSet (java.base@21.0.3)
 113:          6894         316544  [Ljava.lang.String; (java.base@21.0.3)
 114:         13180         316320  org.apache.lucene.util.CloseableThreadLocal
 115:         13164         315936  java.lang.invoke.ResolvedMethodName (java.base@21.0.3)
 116:          4356         313632  org.apache.lucene.index.SegmentCoreReaders
 117:         12579         301896  org.apache.lucene.analysis.DelegatingAnalyzerWrapper$DelegatingReuseStrategy
 118:         12539         300936  java.util.concurrent.atomic.AtomicLong (java.base@21.0.3)
 119:         12525         300600  org.opensearch.common.Explicit
 120:          9367         299744  org.opensearch.common.collect.CopyOnWriteHashMap$InnerNode
 121:         17951         287216  org.opensearch.index.mapper.FieldMapper$MultiFields
 122:           267         277680  [Lorg.opensearch.common.cache.Cache$CacheSegment;
 123:         11256         270144  java.util.Arrays$ArrayList (java.base@21.0.3)
 124:          4814         269584  org.opensearch.index.mapper.NumberFieldMapper
 125:         11132         267168  java.util.Collections$SetFromMap (java.base@21.0.3)
 126:         16621         265936  java.util.HashMap$EntrySet (java.base@21.0.3)
 127:         10821         259704  org.apache.lucene.util.packed.DirectReader$DirectPackedReader16
 128:          5263         252624  org.opensearch.index.mapper.KeywordFieldMapper$KeywordFieldType
 129:          5186         248928  org.opensearch.index.mapper.TextFieldMapper$TextFieldType
 130:          4356         243936  org.apache.lucene.codecs.lucene90.compressing.Lucene90CompressingStoredFieldsReader$BlockState
 131:          4356         243936  org.apache.lucene.index.ReadersAndUpdates
 132:          3805         243520  org.opensearch.search.aggregations.bucket.histogram.InternalDateHistogram
 133:          6063         242520  java.lang.invoke.DirectMethodHandle (java.base@21.0.3)
 134:            55         234392  [S (java.base@21.0.3)
 135:           633         233440  [[C (java.base@21.0.3)
 136:          4814         231072  org.opensearch.index.mapper.NumberFieldMapper$NumberFieldType
 137:          2740         219200  org.opensearch.cluster.routing.IndexShardRoutingTable
 138:          4371         209808  org.apache.lucene.index.FieldInfos
 139:          4368         209664  java.lang.invoke.DirectMethodHandle$Constructor (java.base@21.0.3)
 140:          4357         209136  org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer
 141:          4350         208800  org.apache.lucene.index.PendingSoftDeletes
 142:          3246         207744  java.security.Provider$Service (java.base@21.0.3)
 143:          6350         203200  org.opensearch.common.logging.PrefixLogger
 144:         12640         202240  java.util.Collections$UnmodifiableSet (java.base@21.0.3)
 145:           458         199032  [Ljava.nio.ByteBuffer; (java.base@21.0.3)
 146:          8109         194616  org.apache.logging.log4j.message.DefaultFlowMessageFactory
 147:          4485         179400  java.lang.invoke.BoundMethodHandle$Species_L (java.base@21.0.3)
 148:          4356         174240  org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsReader
 149:          4350         174000  org.opensearch.common.lucene.index.OpenSearchLeafReader
 150:          4328         173120  java.lang.ref.SoftReference (java.base@21.0.3)
 151:          4322         172880  org.apache.lucene.codecs.lucene90.Lucene90NormsProducer
 152:          3581         171888  jdk.internal.ref.CleanerImpl$PhantomCleanableRef (java.base@21.0.3)
 153:          7149         171576  jdk.internal.foreign.SharedSession (java.base@21.0.3)
 154:          7149         171576  sun.nio.ch.FileChannelImpl$1 (java.base@21.0.3)
 155:          7050         169200  org.apache.lucene.util.packed.DirectReader$DirectPackedReader12
 156:          5276         168832  java.util.Hashtable$Entry (java.base@21.0.3)
 157:          7007         168168  java.util.LinkedList$Node (java.base@21.0.3)
 158:          6997         167928  java.security.Provider$ServiceKey (java.base@21.0.3)
 159:          4183         167320  java.lang.invoke.BoundMethodHandle$Species_LL (java.base@21.0.3)
 160:         10150         162400  java.util.WeakHashMap$KeySet (java.base@21.0.3)
 161:          9947         159152  java.util.LinkedHashSet (java.base@21.0.3)
 162:          2173         156456  java.lang.reflect.Field (java.base@21.0.3)
 163:          6443         154632  org.opensearch.core.index.Index
 164:          6154         147696  java.util.concurrent.CopyOnWriteArrayList (java.base@21.0.3)
 165:          9074         145184  java.util.concurrent.atomic.AtomicReference (java.base@21.0.3)
 166:          9028         144448  org.apache.lucene.index.IndexReader$CacheKey
 167:          5988         143712  org.opensearch.common.recycler.DequeRecycler$DV
 168:          5988         143712  org.opensearch.common.recycler.Recyclers$1$1
 169:          8856         141696  org.opensearch.common.metrics.CounterMetric
 170:          2198         140672  java.net.URL (java.base@21.0.3)
 171:          1941         139752  java.lang.reflect.Constructor (java.base@21.0.3)
 172:          4357         139424  org.apache.lucene.index.SegmentDocValues$1
 173:          4356         139392  org.apache.lucene.index.LeafMetaData
 174:          4356         139392  org.apache.lucene.index.PendingDeletes
 175:          4356         139392  org.apache.lucene.index.SegmentCoreReaders$1
 176:          4356         139392  org.apache.lucene.index.SegmentCoreReaders$2
 177:          4350         139200  org.apache.lucene.codecs.lucene90.Lucene90PointsReader
 178:          4350         139200  org.apache.lucene.index.SegmentReadState
 179:          4324         138368  org.apache.lucene.backward_codecs.lucene90.Lucene90PostingsReader
 180:          2112         135168  sun.nio.ch.FileChannelImpl (java.base@21.0.3)
 181:          2380         133280  org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput
 182:          5487         131688  org.opensearch.cluster.routing.AllocationId
 183:          8211         131376  org.opensearch.cluster.routing.RotationShardShuffler
 184:          2301         128856  java.nio.HeapByteBuffer (java.base@21.0.3)
 185:          3889         124448  org.apache.lucene.codecs.lucene90.Lucene90CompoundReader
 186:          3860         123520  java.lang.ThreadLocal$ThreadLocalMap$Entry (java.base@21.0.3)
 187:          1603         121784  [Ljava.lang.ref.SoftReference; (java.base@21.0.3)
 188:          3043         121720  jdk.nio.zipfs.ZipFileSystem$IndexNode (jdk.zipfs@21.0.3)
 189:          1374         120912  java.util.regex.Pattern (java.base@21.0.3)
 190:          4911         117864  org.opensearch.common.inject.Key
 191:          7248         115968  org.opensearch.common.SetOnce
 192:          7149         114384  jdk.internal.foreign.MemorySessionImpl$1 (java.base@21.0.3)
 193:          7149         114384  jdk.internal.foreign.SharedSession$SharedResourceList (java.base@21.0.3)
 194:          2818         112720  org.opensearch.painless.lookup.PainlessMethod
 195:          2811         112440  org.opensearch.painless.spi.WhitelistMethod
 196:          3357         107424  sun.nio.fs.UnixPath (java.base@21.0.3)
 197:          4437         106488  org.apache.lucene.index.SegmentReader$1
 198:          4357         104568  org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsReader
 199:          4356         104544  org.apache.lucene.codecs.lucene90.LZ4WithPresetDictCompressionMode$LZ4WithPresetDictDecompressor
 200:          4356         104544  org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader
 201:          4356         104544  org.apache.lucene.index.SegmentCoreReaders$3
 202:          3260         104320  org.opensearch.common.inject.spi.Dependency
 203:          3253         104096  java.util.Collections$UnmodifiableSortedMap (java.base@21.0.3)
 204:          3250         104000  org.opensearch.common.settings.Settings
 205:          4168         100032  java.util.regex.Pattern$Slice (java.base@21.0.3)
 206:          4063          97512  org.opensearch.common.inject.TypeLiteral
 207:          3969          95256  java.lang.invoke.LambdaForm$NamedFunction (java.base@21.0.3)
 208:          2913          93216  org.tartarus.snowball.Among
 209:          1212          91840  [Ljava.lang.invoke.LambdaForm$Name; (java.base@21.0.3)
 210:           603          91656  sun.security.ssl.SSLSessionImpl (java.base@21.0.3)
 211:          1888          90624  org.apache.logging.log4j.message.ReusableParameterizedMessage
 212:          3776          90624  org.apache.lucene.index.ApproximatePriorityQueue
 213:          1250          90000  java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask (java.base@21.0.3)
 214:           656          89216  sun.nio.fs.UnixFileAttributes (java.base@21.0.3)
 215:          2731          87392  org.opensearch.cluster.routing.IndexRoutingTable
 216:          5377          86032  java.util.concurrent.atomic.AtomicBoolean (java.base@21.0.3)
 217:          3535          84840  org.opensearch.common.metrics.MeanMetric
 218:          3526          84624  java.util.regex.Pattern$GroupTail (java.base@21.0.3)
 219:          3519          84456  java.util.regex.Pattern$GroupHead (java.base@21.0.3)
 220:          2590          82880  java.util.LinkedList (java.base@21.0.3)
 221:           940          82720  org.opensearch.index.codec.PerFieldMappingPostingFormatCodec
 222:          1641          78768  org.opensearch.common.inject.internal.InstanceBindingImpl
 223:          1396          78176  io.netty.channel.DefaultChannelHandlerContext
 224:          3218          77232  org.opensearch.common.compress.CompressedXContent
 225:          1423          76552  [Lorg.opensearch.search.aggregations.bucket.terms.StringTerms$Bucket;
 226:          4778          76448  java.lang.Integer (java.base@21.0.3)
 227:          3180          76320  org.opensearch.common.unit.TimeValue
 228:          3167          76008  sun.reflect.generics.tree.SimpleClassTypeSignature (java.base@21.0.3)
 229:          3140          75360  java.util.regex.Pattern$BmpCharProperty (java.base@21.0.3)
 230:          4676          74816  java.util.concurrent.CopyOnWriteArraySet (java.base@21.0.3)
 231:          4651          74416  java.util.Collections$UnmodifiableMap$UnmodifiableEntrySet (java.base@21.0.3)
 232:          1536          73728  io.netty.buffer.PoolChunkList
 233:           237          72048  org.opensearch.index.IndexSettings
 234:          2985          71640  org.opensearch.core.index.shard.ShardId
 235:          4437          70992  org.apache.lucene.index.SegmentReader$2
 236:          1771          70840  java.io.FilePermission (java.base@21.0.3)
 237:           679          70616  java.util.jar.JarFile$JarFileEntry (java.base@21.0.3)
 238:          4356          69696  org.apache.lucene.index.SegmentDocValues
 239:           235          69560  org.opensearch.index.shard.IndexShard
 240:          1708          68320  org.apache.logging.log4j.core.Logger$PrivateConfig
 241:          2827          67848  org.opensearch.common.inject.SingleParameterInjector
 242:           942          67824  org.apache.lucene.index.SegmentInfos
 243:          2768          66432  java.util.ArrayDeque (java.base@21.0.3)
 244:          2732          65568  org.opensearch.cluster.metadata.MappingMetadata
 245:          2731          65544  org.opensearch.cluster.metadata.IndexAbstraction$Index
 246:          3167          61256  [Lsun.reflect.generics.tree.TypeArgument; (java.base@21.0.3)
 247:          3709          59344  org.opensearch.common.SetOnce$Wrapper
 248:          1476          59040  org.opensearch.common.settings.Setting
 249:           281          58256  [Lio.netty.buffer.PoolSubpage;
 250:          1820          58240  java.util.RegularEnumSet (java.base@21.0.3)
 251:          1207          57936  java.lang.invoke.LambdaForm (java.base@21.0.3)
 252:          1798          57536  java.lang.Package (java.base@21.0.3)
 253:          1410          56400  [Lorg.opensearch.index.mapper.MetadataFieldMapper;
 254:          1145          54960  java.lang.StackTraceElement (java.base@21.0.3)
 255:          1708          54656  org.apache.logging.log4j.core.Logger
 256:          1337          53480  java.lang.Package$VersionInfo (java.base@21.0.3)
 257:          2200          52800  javax.crypto.spec.SecretKeySpec (java.base@21.0.3)
 258:          3290          52640  java.util.TreeMap$KeySet (java.base@21.0.3)
 259:           656          52480  java.util.zip.ZipFile$Source (java.base@21.0.3)
 260:           798          51072  javax.crypto.Cipher (java.base@21.0.3)
 261:           236          50976  org.apache.lucene.index.IndexWriter
 262:           235          50760  org.opensearch.index.engine.InternalEngine
 263:          2112          50688  sun.nio.ch.NativeThreadSet (java.base@21.0.3)
 264:          3154          50464  sun.reflect.generics.tree.ClassTypeSignature (java.base@21.0.3)
 265:          1205          48200  java.security.CodeSource (java.base@21.0.3)
 266:          1484          47488  org.opensearch.common.collect.CopyOnWriteHashMap
 267:          1965          47160  java.util.regex.Pattern$BmpCharPropertyGreedy (java.base@21.0.3)
 268:           235          47000  org.opensearch.index.IndexService
 269:          1894          45456  org.apache.logging.log4j.message.ParameterFormatter$MessagePatternAnalysis
 270:           705          45120  java.util.zip.Inflater (java.base@21.0.3)
 271:           399          44688  io.netty.handler.ssl.SslHandler
 272:           399          44688  org.opensearch.transport.CopyBytesSocketChannel
 273:           399          44688  sun.nio.ch.SocketChannelImpl (java.base@21.0.3)
 274:          1103          44120  org.opensearch.ingest.useragent.UserAgentParser$UserAgentSubpattern
 275:           787          44072  java.lang.invoke.DirectMethodHandle$StaticAccessor (java.base@21.0.3)
 276:          1372          43904  java.util.regex.Pattern$Branch (java.base@21.0.3)
 277:          1372          42888  [Ljava.util.regex.Pattern$Node; (java.base@21.0.3)
 278:          2677          42832  java.util.regex.Pattern$$Lambda/0x80000002a (java.base@21.0.3)
 279:          1725          41400  java.security.Provider$UString (java.base@21.0.3)
 280:          1289          41248  org.opensearch.common.util.concurrent.ThreadContext$$Lambda/0x00007f2150542d40
 281:           107          39856  [Ljava.lang.ThreadLocal$ThreadLocalMap$Entry; (java.base@21.0.3)
 282:          1635          39240  java.lang.RuntimePermission (java.base@21.0.3)
 283:           187          39224  [Ljava.lang.invoke.MethodHandle; (java.base@21.0.3)
 284:          1215          38880  java.lang.ref.WeakReference (java.base@21.0.3)
 285:           960          38400  org.apache.lucene.codecs.lucene90.Lucene90TermVectorsFormat
 286:           663          37128  java.io.FileCleanable (java.base@21.0.3)
 287:           656          36736  java.util.jar.JarFile (java.base@21.0.3)
 288:            22          36224  [Ljava.util.Hashtable$Entry; (java.base@21.0.3)
 289:          1468          35232  org.apache.lucene.util.packed.DirectReader$DirectPackedReader4
 290:           399          35112  sun.security.ssl.TransportContext (java.base@21.0.3)
 291:           871          34840  com.fasterxml.jackson.databind.introspect.AnnotatedMethod
 292:           859          34360  org.jcodings.unicode.UnicodeCodeRange
 293:           857          34280  org.opensearch.common.path.PathTrie$TrieNode
 294:           475          34200  org.apache.lucene.index.TieredMergePolicy
 295:           235          33840  org.opensearch.index.translog.TranslogWriter
 296:          2112          33792  sun.nio.ch.FileChannelImpl$Closer (java.base@21.0.3)
 297:          1051          33632  io.netty.util.internal.LongAdderCounter
 298:          1389          33336  org.opensearch.core.common.unit.ByteSizeValue
 299:             1          32792  [Lkotlinx.coroutines.scheduling.CoroutineScheduler$Worker;
 300:          1014          32448  org.apache.lucene.search.TermQuery
 301:          1003          32096  java.lang.invoke.MethodTypeForm (java.base@21.0.3)
 302:           236          32096  org.apache.lucene.index.DocumentsWriterFlushControl
 303:           235          31960  org.opensearch.index.seqno.ReplicationTracker
 304:           798          31920  io.netty.handler.ssl.SslHandler$LazyChannelPromise
 305:           399          31920  sun.security.ssl.SSLConfiguration (java.base@21.0.3)
 306:           656          31488  jdk.internal.loader.URLClassPath$JarLoader (java.base@21.0.3)
 307:          1301          31224  java.util.concurrent.CompletableFuture (java.base@21.0.3)
 308:          1293          31032  java.util.concurrent.Executors$RunnableAdapter (java.base@21.0.3)
 309:          1901          30416  org.apache.lucene.codecs.lucene90.Lucene90DocValuesFormat
 310:          1267          30408  java.util.concurrent.ConcurrentLinkedQueue$Node (java.base@21.0.3)
 311:          1265          30360  org.opensearch.cluster.node.DiscoveryNodeFilters
 312:           947          30304  org.apache.lucene.codecs.lucene99.Lucene99HnswVectorsFormat
 313:           754          30160  java.math.BigInteger (java.base@21.0.3)
 314:           628          30144  com.sun.crypto.provider.GaloisCounterMode$AESGCM (java.base@21.0.3)
 315:           235          30080  org.opensearch.index.engine.EngineConfig
 316:           470          30080  org.opensearch.index.mapper.RootObjectMapper
 317:          1877          30032  java.nio.channels.spi.AbstractInterruptibleChannel$1 (java.base@21.0.3)
 318:           748          29920  java.lang.invoke.DirectMethodHandle$Special (java.base@21.0.3)
 319:          1236          29664  org.opensearch.threadpool.ThreadPool$ThreadedRunnable
 320:          1222          29328  java.util.LinkedHashMap$LinkedValues (java.base@21.0.3)
 321:          1204          28896  org.apache.lucene.util.packed.DirectReader$DirectPackedReader24
 322:           399          28728  sun.security.ssl.SSLEngineOutputRecord (java.base@21.0.3)
 323:          1080          28464  [Ljava.lang.reflect.Type; (java.base@21.0.3)
 324:           236          28320  org.apache.lucene.index.IndexWriterConfig
 325:           235          28200  org.opensearch.index.engine.InternalEngine$EngineMergeScheduler
 326:           583          27984  sun.security.util.MemoryCache$SoftCacheEntry (java.base@21.0.3)
 327:           685          27400  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
 328:           558          26784  org.apache.lucene.analysis.tokenattributes.PackedTokenAttributeImpl
 329:           256          26624  io.netty.buffer.PoolArena$HeapArena
 330:          1642          26272  org.opensearch.common.inject.util.Providers$1
 331:          1641          26256  org.opensearch.common.inject.internal.InternalFactory$Instance
 332:            13          26240  [[J (java.base@21.0.3)
 333:           656          26240  java.io.RandomAccessFile (java.base@21.0.3)
 334:           544          26112  org.opensearch.index.mapper.ObjectMapper
 335:             8          26056  [Lorg.opensearch.common.recycler.Recycler$V;
 336:          1070          25680  org.opensearch.security.filter.SecurityRestFilter$AuthczRestHandler
 337:           401          25664  io.netty.channel.ChannelOutboundBuffer
 338:           401          25664  io.netty.channel.DefaultChannelPipeline$HeadContext
 339:           596          25624  [Ljava.security.ProtectionDomain; (java.base@21.0.3)
 340:           399          25536  io.netty.channel.socket.nio.NioSocketChannel$NioSocketChannelConfig
 341:           634          25360  java.lang.invoke.DirectMethodHandle$Interface (java.base@21.0.3)
 342:          1582          25312  com.fasterxml.jackson.databind.introspect.AnnotationMap
 343:           790          25280  org.opensearch.painless.lookup.PainlessField
 344:           790          25280  org.opensearch.painless.spi.WhitelistField
 345:          1042          25008  org.bouncycastle.asn1.ASN1ObjectIdentifier
 346:           623          24920  org.opensearch.transport.RequestHandlerRegistry
 347:           614          24560  java.security.AccessControlContext (java.base@21.0.3)
 348:          1017          24408  org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable
 349:          1523          24368  org.opensearch.common.logging.DeprecationLogger
 350:          1015          24360  org.apache.lucene.index.Term
 351:          1521          24336  org.opensearch.common.settings.Setting$SimpleKey
 352:           727          23264  java.util.concurrent.Semaphore$NonfairSync (java.base@21.0.3)
 353:           573          22920  java.util.IdentityHashMap (java.base@21.0.3)
 354:           954          22896  sun.reflect.annotation.AnnotationInvocationHandler (java.base@21.0.3)
 355:           475          22800  org.opensearch.common.util.MovingAverage
 356:           948          22752  org.apache.lucene.codecs.lucene99.Lucene99PostingsFormat
 357:           947          22728  org.apache.lucene.codecs.lucene99.Lucene99Codec$1
 358:           947          22728  org.apache.lucene.codecs.lucene99.Lucene99Codec$2
 359:           947          22728  org.apache.lucene.codecs.lucene99.Lucene99Codec$3
 360:           940          22560  org.opensearch.index.codec.fuzzy.FuzzySetParameters
 361:           470          22560  org.opensearch.index.mapper.DocumentMapper
 362:           705          22560  org.opensearch.index.mapper.MapperService$MapperAnalyzerWrapper
 363:           401          22456  io.netty.channel.DefaultChannelPipeline$TailContext
 364:           400          22400  io.netty.channel.FixedRecvByteBufAllocator$HandleImpl
 365:           399          22344  org.opensearch.transport.InboundPipeline
 366:           399          22344  sun.security.ssl.SSLCipher$T13GcmReadCipherGenerator$GcmReadCipher (java.base@21.0.3)
 367:           399          22344  sun.security.ssl.SSLCipher$T13GcmWriteCipherGenerator$GcmWriteCipher (java.base@21.0.3)
 368:           309          22248  org.apache.lucene.analysis.standard.StandardTokenizerImpl
 369:           238          22096  [Lorg.apache.lucene.index.LeafReader;
 370:           920          22080  java.util.concurrent.ConcurrentLinkedQueue (java.base@21.0.3)
 371:           237          21976  [Lorg.apache.lucene.index.IndexReaderContext;
 372:          1372          21952  java.util.regex.Pattern$BranchConn (java.base@21.0.3)
 373:           684          21888  javax.crypto.Cipher$Transform (java.base@21.0.3)
 374:           235          21736  [Lorg.apache.lucene.index.SegmentReader;
 375:           679          21728  java.util.zip.ZipFile$CleanableResource (java.base@21.0.3)
 376:           905          21720  org.opensearch.sql.expression.function.FunctionSignature
 377:           338          21632  sun.security.ssl.CipherSuite (java.base@21.0.3)
 378:           267          21360  org.opensearch.common.cache.Cache
 379:           890          21360  org.opensearch.common.inject.InternalFactoryToProviderAdapter
 380:           890          21360  org.opensearch.common.inject.ProviderToInternalFactoryAdapter
 381:           890          21360  org.opensearch.common.inject.Scopes$1$1
 382:           667          21344  java.io.File (java.base@21.0.3)
 383:           879          21096  java.util.regex.Pattern$SliceI (java.base@21.0.3)
 384:           876          21024  sun.security.provider.PolicyFile$PolicyEntry (java.base@21.0.3)
 385:           653          20896  sun.reflect.generics.repository.ClassRepository (java.base@21.0.3)
 386:           870          20880  java.lang.Class$AnnotationData (java.base@21.0.3)
 387:           652          20864  java.util.PropertyPermission (java.base@21.0.3)
 388:           651          20832  java.net.InetAddress$InetAddressHolder (java.base@21.0.3)
 389:           866          20784  org.opensearch.common.inject.multibindings.RealElement
 390:           235          20680  org.opensearch.index.SearchSlowLog
 391:           235          20680  org.opensearch.security.configuration.SecurityFlsDlsIndexSearcherWrapper
 392:           849          20376  sun.reflect.generics.factory.CoreReflectionFactory (java.base@21.0.3)
 393:           636          20352  org.opensearch.core.ParseField
 394:           628          20096  com.sun.crypto.provider.AESCrypt (java.base@21.0.3)
 395:           624          19968  io.netty.buffer.PoolThreadCache$SubPageMemoryRegionCache
 396:           831          19944  sun.reflect.generics.reflectiveObjects.ParameterizedTypeImpl (java.base@21.0.3)
 397:           475          19800  [Lorg.opensearch.common.inject.SingleParameterInjector;
 398:           309          19776  org.apache.lucene.analysis.standard.StandardTokenizer
 399:           810          19440  org.apache.logging.log4j.MarkerManager$Log4jMarker
 400:           809          19400  [Ljava.security.cert.X509Certificate; (java.base@21.0.3)
 401:          1210          19360  java.util.regex.Pattern$BitClass (java.base@21.0.3)
 402:           401          19248  io.netty.channel.AbstractChannel$CloseFuture
 403:           401          19248  io.netty.channel.DefaultChannelPipeline
 404:           802          19248  io.netty.channel.VoidChannelPromise
 405:           401          19248  sun.nio.ch.SelectionKeyImpl (java.base@21.0.3)
 406:           798          19152  io.netty.handler.ssl.SslHandler$SslTasksRunner
 407:           399          19152  sun.nio.ch.SocketAdaptor (java.base@21.0.3)
 408:           399          19152  sun.security.ssl.SSLEngineInputRecord (java.base@21.0.3)
 409:           236          18880  [Ljava.util.concurrent.locks.Lock; (java.base@21.0.3)
 410:           236          18880  [Lorg.apache.lucene.index.ApproximatePriorityQueue;
 411:           470          18800  java.util.concurrent.CompletableFuture$UniWhenComplete (java.base@21.0.3)
 412:           470          18800  org.opensearch.index.mapper.Mapping
 413:           470          18800  org.opensearch.index.mapper.MappingLookup
 414:           470          18800  org.opensearch.index.seqno.RetentionLease
 415:           235          18800  org.opensearch.index.translog.LocalTranslog
 416:            32          18432  io.netty.util.internal.shaded.org.jctools.queues.atomic.MpscUnboundedAtomicArrayQueue
 417:           768          18432  java.util.concurrent.ConcurrentHashMap$KeySetView (java.base@21.0.3)
 418:           715          17160  java.time.format.DateTimeFormatterBuilder$DefaultValueParser (java.base@21.0.3)
 419:           715          17160  jdk.internal.reflect.DirectConstructorHandleAccessor (java.base@21.0.3)
 420:           714          17136  java.util.regex.Pattern$Start (java.base@21.0.3)
 421:          1066          17056  org.opensearch.common.inject.Initializables$1
 422:           710          17040  sun.reflect.generics.scope.ClassScope (java.base@21.0.3)
 423:           236          16992  org.apache.lucene.index.DocumentsWriterDeleteQueue
 424:           705          16920  java.util.zip.Inflater$InflaterZStreamRef (java.base@21.0.3)
 425:           235          16920  org.apache.lucene.analysis.miscellaneous.FingerprintFilter
 426:           235          16920  org.apache.lucene.index.StandardDirectoryReader
 427:           235          16920  org.opensearch.common.lucene.index.OpenSearchDirectoryReader
 428:           235          16920  org.opensearch.index.IndexModule
 429:           235          16920  org.opensearch.index.search.stats.ShardSearchStats$StatsHolder
 430:           235          16920  org.opensearch.index.translog.Checkpoint
 431:           139          16680  java.lang.Thread (java.base@21.0.3)
 432:           689          16536  sun.security.jca.ServiceId (java.base@21.0.3)
 433:           685          16440  com.fasterxml.jackson.databind.introspect.MemberKey
 434:           683          16392  org.opensearch.rest.RestMethodHandlers
 435:          1018          16288  java.util.AbstractMap$2 (java.base@21.0.3)
 436:           339          16272  java.lang.invoke.DirectMethodHandle$Accessor (java.base@21.0.3)
 437:           399          15960  io.netty.channel.socket.nio.NioSocketChannel$NioSocketChannelUnsafe
 438:           399          15960  org.opensearch.transport.InboundAggregator
 439:           399          15960  org.opensearch.transport.InboundDecoder
 440:           399          15960  org.opensearch.transport.netty4.Netty4TcpChannel
 441:           249          15936  java.lang.invoke.BoundMethodHandle$Species_LLLLLLL (java.base@21.0.3)
 442:           656          15744  java.util.zip.ZipFile$Source$Key (java.base@21.0.3)
 443:           653          15672  sun.reflect.generics.tree.ClassSignature (java.base@21.0.3)
 444:           632          15600  [[I (java.base@21.0.3)
 445:           973          15568  org.opensearch.threadpool.ScheduledCancellableAdapter
 446:           486          15552  org.opensearch.common.inject.ConstructorInjector
 447:           486          15552  org.opensearch.common.inject.MembersInjectorImpl
 448:           961          15376  java.util.concurrent.Semaphore (java.base@21.0.3)
 449:           320          15360  org.apache.lucene.analysis.StopFilter
 450:           960          15360  org.apache.lucene.codecs.lucene90.Lucene90CompoundFormat
 451:           960          15360  org.apache.lucene.codecs.lucene90.Lucene90LiveDocsFormat
 452:           960          15360  org.apache.lucene.codecs.lucene90.Lucene90NormsFormat
 453:           960          15360  org.apache.lucene.codecs.lucene90.Lucene90StoredFieldsFormat
 454:           638          15312  java.net.Inet4Address (java.base@21.0.3)
 455:           638          15312  java.net.InetSocketAddress$InetSocketAddressHolder (java.base@21.0.3)
 456:           955          15280  org.apache.lucene.codecs.lucene94.Lucene94FieldInfosFormat
 457:           237          15168  org.apache.lucene.index.LogByteSizeMergePolicy
 458:           947          15152  org.apache.lucene.codecs.lucene99.Lucene99SegmentInfoFormat
 459:           236          15104  org.apache.lucene.index.FieldInfos$FieldNumbers
 460:           471          15072  org.opensearch.index.engine.LiveVersionMap$VersionLookup
 461:           235          15040  org.opensearch.index.IndexingSlowLog
 462:           940          15040  org.opensearch.index.codec.PerFieldMappingPostingFormatCodec$$Lambda/0x00007f21512a8a80
 463:           940          15040  org.opensearch.index.codec.fuzzy.FuzzySetFactory
 464:           470          15040  org.opensearch.index.mapper.FieldTypeLookup
 465:           235          15040  org.opensearch.index.mapper.MapperService
 466:           235          15040  org.opensearch.index.mapper.SourceFieldMapper
 467:           235          15040  org.opensearch.index.shard.RefreshListeners
 468:           235          15040  org.opensearch.index.translog.ReplicationTranslogDeletionPolicy
 469:           470          15040  org.opensearch.index.translog.Translog$Location
 470:           235          15040  org.opensearch.indices.recovery.RecoveryState$Translog
 471:           235          15040  org.opensearch.indices.replication.common.ReplicationLuceneIndex
 472:           624          14976  org.apache.lucene.util.AttributeSource$State
 473:           825          14960  [Lsun.reflect.generics.tree.FormalTypeParameter; (java.base@21.0.3)
 474:           623          14952  org.opensearch.indexmanagement.rollup.interceptor.RollupInterceptor$interceptHandler$1
 475:           623          14952  org.opensearch.performanceanalyzer.transport.PerformanceAnalyzerTransportRequestHandler
 476:           623          14952  org.opensearch.security.OpenSearchSecurityPlugin$6$1
 477:           417          14816  [Ljava.time.format.DateTimeFormatterBuilder$DateTimePrinterParser; (java.base@21.0.3)
 478:           240          14808  [[B (java.base@21.0.3)
 479:           461          14752  com.fasterxml.jackson.databind.introspect.AnnotatedField
 480:           365          14600  org.opensearch.painless.spi.WhitelistClass
 481:           259          14504  java.lang.invoke.BoundMethodHandle$Species_LLLLL (java.base@21.0.3)
 482:           600          14400  io.netty.buffer.IntPriorityQueue
 483:           297          14256  java.lang.invoke.MethodHandleImpl$AsVarargsCollector (java.base@21.0.3)
 484:           296          14208  org.apache.lucene.analysis.CharArrayMap
 485:           352          14080  java.time.format.DateTimeFormatter (java.base@21.0.3)
 486:           586          14064  java.util.regex.Pattern$CharPropertyGreedy (java.base@21.0.3)
 487:           561          13904  [Ljava.security.cert.Certificate; (java.base@21.0.3)
 488:           433          13856  org.opensearch.common.inject.FactoryProxy
 489:           866          13856  org.opensearch.common.inject.Key$AnnotationInstanceStrategy
 490:           433          13856  org.opensearch.core.xcontent.ObjectParser$FieldParser
 491:           286          13728  java.lang.invoke.LambdaFormEditor$Transform (java.base@21.0.3)
 492:           118          13704  [Lorg.tartarus.snowball.Among;
 493:           338          13520  java.security.ProtectionDomain (java.base@21.0.3)
 494:           563          13512  java.lang.Long (java.base@21.0.3)
 495:           559          13416  java.security.BasicPermissionCollection (java.base@21.0.3)
 496:           557          13368  java.security.SecurityPermission (java.base@21.0.3)
 497:           236          13216  org.apache.lucene.index.DocumentsWriter
 498:           236          13216  org.apache.lucene.index.IndexFileDeleter
 499:           236          13216  org.apache.lucene.index.ReaderPool
 500:           235          13160  org.apache.lucene.analysis.miscellaneous.ASCIIFoldingFilter
 501:           235          13160  org.opensearch.index.engine.CombinedDeletionPolicy
 502:           235          13160  org.opensearch.index.store.Store
 503:           235          13160  org.opensearch.indices.recovery.RecoveryState
 504:           235          13160  org.opensearch.indices.recovery.RecoveryState$VerifyIndex
 505:           205          13120  io.netty.util.concurrent.ScheduledFutureTask
 506:           817          13072  org.opensearch.common.concurrent.CompletableContext
 507:           813          13008  org.opensearch.security.support.WildcardMatcher$Exact
 508:           540          12960  java.util.ImmutableCollections$List12 (java.base@21.0.3)
 509:           401          12832  io.netty.channel.DefaultChannelId
 510:           401          12832  sun.nio.ch.DummySocketImpl (java.base@21.0.3)
 511:           399          12768  io.netty.handler.ssl.SslHandler$SslHandlerCoalescingBufferQueue
 512:           798          12768  io.netty.handler.ssl.SslHandler$SslTasksRunner$1
 513:           399          12768  java.util.regex.Pattern$Curly (java.base@21.0.3)
 514:           399          12768  org.opensearch.transport.nativeprotocol.NativeInboundBytesHandler
 515:           399          12768  org.opensearch.transport.netty4.Netty4MessageChannelHandler
 516:           399          12768  org.opensearch.transport.netty4.OpenSearchLoggingHandler
 517:           798          12768  sun.security.ssl.Authenticator$TLS13Authenticator (java.base@21.0.3)
 518:           399          12768  sun.security.ssl.SSLEngineImpl (java.base@21.0.3)
 519:           394          12608  sun.security.util.ObjectIdentifier (java.base@21.0.3)
 520:           314          12560  org.apache.lucene.analysis.LowerCaseFilter
 521:           782          12512  java.util.concurrent.atomic.AtomicReferenceArray (java.base@21.0.3)
 522:           758          12128  org.opensearch.security.support.WildcardMatcher$SimpleMatcher
 523:           500          12000  org.opensearch.cluster.metadata.IndexGraveyard$Tombstone
 524:           490          11760  org.opensearch.core.action.ActionListener$1
 525:           488          11712  org.opensearch.securityanalytics.model.LogType$Mapping
 526:           486          11664  org.opensearch.common.inject.DefaultConstructionProxyFactory$1
 527:           486          11664  org.opensearch.common.inject.spi.InjectionPoint
 528:           653          11576  [Lsun.reflect.generics.tree.ClassTypeSignature; (java.base@21.0.3)
 529:           481          11544  java.util.regex.Pattern$$Lambda/0x80000002b (java.base@21.0.3)
 530:           239          11472  org.opensearch.common.settings.IndexScopedSettings
 531:           237          11376  java.util.concurrent.ArrayBlockingQueue (java.base@21.0.3)
 532:           237          11376  org.apache.lucene.index.CompositeReaderContext
 533:           531          11360  [Lsun.reflect.generics.tree.FieldTypeSignature; (java.base@21.0.3)
 534:           236          11328  org.apache.lucene.index.BufferedUpdates
 535:           236          11328  org.apache.lucene.index.DocumentsWriter$$Lambda/0x00007f21510cae38
 536:           236          11328  org.apache.lucene.index.IndexFileDeleter$CommitPoint
 537:           472          11328  org.opensearch.core.common.text.Text
 538:           235          11280  org.apache.lucene.index.SoftDeletesRetentionMergePolicy
 539:           235          11280  org.apache.lucene.store.MMapDirectory
 540:           470          11280  org.opensearch.common.util.concurrent.KeyedLock
 541:           235          11280  org.opensearch.index.IndexService$AsyncGlobalCheckpointTask
 542:           235          11280  org.opensearch.index.IndexService$AsyncRefreshTask
 543:           235          11280  org.opensearch.index.IndexService$AsyncRetentionLeaseSyncTask
 544:           235          11280  org.opensearch.index.IndexService$AsyncTrimTranslogTask
 545:           470          11280  org.opensearch.index.analysis.FieldNameAnalyzer
 546:           470          11280  org.opensearch.index.engine.LiveVersionMap$Maps
 547:           235          11280  org.opensearch.index.engine.SoftDeletesPolicy
 548:           235          11280  org.opensearch.index.fielddata.IndexFieldDataService
 549:           235          11280  org.opensearch.index.get.ShardGetService
 550:           235          11280  org.opensearch.index.mapper.DocumentMapperParser
 551:           470          11280  org.opensearch.index.mapper.DocumentParser
 552:           470          11280  org.opensearch.index.mapper.DynamicKeyFieldTypeLookup
 553:           235          11280  org.opensearch.index.store.DirectoryFileTransferTracker
 554:           235          11280  org.opensearch.index.store.FsDirectoryFactory$HybridDirectory
 555:           235          11280  org.opensearch.index.translog.InternalTranslogManager
 556:           235          11280  org.opensearch.indices.replication.common.ReplicationTimer
 557:           343          10976  com.fasterxml.jackson.databind.util.internal.PrivateMaxEntriesMap$Node
 558:           274          10960  java.lang.ref.Finalizer (java.base@21.0.3)
 559:           341          10912  jdk.internal.math.FDBigInteger (java.base@21.0.3)
 560:           341          10912  org.opensearch.common.settings.Setting$IntegerParser
 561:           135          10800  java.net.URI (java.base@21.0.3)
 562:           270          10800  org.opensearch.common.util.concurrent.BaseFuture$Sync
 563:           446          10704  io.netty.util.internal.logging.LocationAwareSlf4JLogger
 564:           446          10704  java.util.concurrent.ConcurrentSkipListMap$Node (java.base@21.0.3)
 565:           264          10560  sun.security.util.KnownOIDs (java.base@21.0.3)
 566:           656          10496  sun.nio.fs.UnixFileAttributes$UnixAsBasicFileAttributes (java.base@21.0.3)
 567:           187          10472  java.lang.invoke.BoundMethodHandle$Species_LLLLLL (java.base@21.0.3)
 568:           327          10464  sun.reflect.generics.reflectiveObjects.TypeVariableImpl (java.base@21.0.3)
 569:           433          10392  org.opensearch.common.inject.InjectorImpl$4
 570:           433          10392  org.opensearch.common.inject.multibindings.MapBinder$RealMapBinder$MapEntry
 571:           433          10392  org.opensearch.common.inject.spi.ProviderLookup
 572:           433          10392  org.opensearch.plugins.ActionPlugin$ActionHandler
 573:           323          10336  org.apache.lucene.analysis.ReusableStringReader
 574:           430          10320  org.apache.lucene.util.packed.DirectReader$DirectPackedReader2
 575:           416          10248  [Lorg.opensearch.sql.data.type.ExprType;
 576:           638          10208  java.net.InetSocketAddress (java.base@21.0.3)
 577:           423          10152  org.opensearch.core.xcontent.ObjectParser$$Lambda/0x00007f2150487068
 578:           408          10120  [Lorg.opensearch.common.settings.Setting$Property;
 579:           253          10120  org.opensearch.indices.replication.common.ReplicationCollection$ReplicationMonitor
 580:           417          10008  java.time.format.DateTimeFormatterBuilder$CompositePrinterParser (java.base@21.0.3)
 581:           250          10000  org.opensearch.common.settings.Setting$1
 582:           415           9960  [Lio.netty.util.concurrent.GenericFutureListener;
 583:           415           9960  io.netty.util.concurrent.DefaultFutureListeners
 584:           414           9936  org.opensearch.sql.expression.function.FunctionDSL$$Lambda/0x00007f2150b36428
 585:           412           9888  java.util.ImmutableCollections$SetN (java.base@21.0.3)
 586:           411           9864  [Ljava.nio.channels.SelectionKey; (java.base@21.0.3)
 587:           615           9840  sun.security.ssl.SessionId (java.base@21.0.3)
 588:           408           9792  java.io.ByteArrayOutputStream (java.base@21.0.3)
 589:           402           9640  [Lio.netty.util.DefaultAttributeMap$DefaultAttribute;
 590:           401           9624  io.netty.channel.SucceededChannelFuture
 591:           401           9624  io.netty.util.DefaultAttributeMap$DefaultAttribute
 592:           401           9624  java.util.regex.Pattern$$Lambda/0x800000033 (java.base@21.0.3)
 593:           399           9576  io.netty.channel.PendingBytesTracker$DefaultChannelPipelinePendingBytesTracker
 594:           399           9576  org.opensearch.transport.CopyBytesSocketChannel$WriteConfig
 595:           399           9576  org.opensearch.transport.TcpChannel$ChannelStats
 596:           171           9576  sun.security.jca.ProviderList$ServiceList (java.base@21.0.3)
 597:           399           9576  sun.security.ssl.HandshakeHash (java.base@21.0.3)
 598:           399           9576  sun.security.ssl.SSLEngineOutputRecord$HandshakeFragment (java.base@21.0.3)
 599:           239           9560  org.apache.lucene.util.packed.Packed64
 600:           396           9504  java.security.Permissions (java.base@21.0.3)
 601:           237           9480  org.opensearch.index.OpenSearchTieredMergePolicy
 602:           237           9480  sun.nio.ch.FileLockImpl (java.base@21.0.3)
 603:           169           9464  java.lang.Module (java.base@21.0.3)
 604:           236           9440  org.apache.lucene.index.BufferedUpdatesStream
 605:           236           9440  org.apache.lucene.util.ByteBlockPool
 606:           235           9400  org.opensearch.index.cache.bitset.BitsetFilterCache
 607:           235           9400  org.opensearch.index.engine.InternalEngine$ExternalReaderManager
 608:           235           9400  org.opensearch.index.engine.PrunePostingsMergePolicy
 609:           235           9400  org.opensearch.index.engine.RecoverySourcePruneMergePolicy
 610:           235           9400  org.opensearch.index.mapper.DataStreamFieldMapper
 611:           235           9400  org.opensearch.index.mapper.FieldNamesFieldMapper
 612:           235           9400  org.opensearch.index.mapper.FieldNamesFieldMapper$FieldNamesFieldType
 613:           235           9400  org.opensearch.index.mapper.IdFieldMapper$IdFieldType
 614:           235           9400  org.opensearch.index.mapper.Mapper$TypeParser$ParserContext
 615:           235           9400  org.opensearch.index.mapper.RoutingFieldMapper
 616:           235           9400  org.opensearch.index.mapper.SourceFieldMapper$SourceFieldType
 617:           235           9400  org.opensearch.index.shard.GlobalCheckpointListeners
 618:           235           9400  org.opensearch.index.shard.IndexShard$7
 619:           235           9400  org.opensearch.index.shard.IndexShardOperationPermits
 620:           235           9400  org.opensearch.index.shard.InternalIndexingStats$StatsHolder
 621:           235           9400  org.opensearch.index.shard.OpenSearchMergePolicy
 622:           235           9400  org.opensearch.index.store.ByteSizeCachingDirectory
 623:           235           9400  org.opensearch.index.store.ByteSizeCachingDirectory$1
 624:           235           9400  org.opensearch.index.translog.TranslogConfig
 625:            16           9216  io.netty.util.internal.shaded.org.jctools.queues.atomic.MpscUnboundedAtomicArrayQueue
 626:           383           9192  org.opensearch.sql.expression.function.FunctionDSL$$Lambda/0x00007f2150b24690
 627:           287           9184  java.util.regex.Pattern$BnM (java.base@21.0.3)
 628:           104           9152  com.fasterxml.jackson.databind.ser.BeanPropertyWriter
 629:           570           9120  org.apache.lucene.util.BytesRefBuilder
 630:           190           9120  org.jcodings.util.CaseInsensitiveBytesHash$CaseInsensitiveBytesHashEntry
 631:           283           9056  java.nio.file.attribute.FileTime (java.base@21.0.3)
 632:           377           9048  java.lang.module.ModuleDescriptor$Exports (java.base@21.0.3)
 633:           112           8960  org.apache.logging.log4j.core.util.datetime.FixedDateFormat
 634:           272           8704  org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable
 635:           542           8672  sun.reflect.generics.tree.TypeVariableSignature (java.base@21.0.3)
 636:            83           8632  org.opensearch.security.configuration.DlsFlsFilterLeafReader
 637:           267           8544  java.time.format.DateTimeFormatterBuilder$NumberPrinterParser (java.base@21.0.3)
 638:           265           8480  com.google.common.collect.RegularImmutableSet
 639:           265           8480  java.util.concurrent.CountDownLatch$Sync (java.base@21.0.3)
 640:           353           8472  java.io.FilePermissionCollection (java.base@21.0.3)
 641:           526           8416  org.apache.logging.log4j.message.ReusableSimpleMessage
 642:           208           8320  sun.security.pkcs11.SunPKCS11$Descriptor (jdk.crypto.cryptoki@21.0.3)
 643:           343           8232  com.fasterxml.jackson.databind.util.internal.PrivateMaxEntriesMap$WeightedValue
 644:           256           8192  io.netty.handler.codec.CodecOutputList
 645:           341           8184  java.util.regex.Pattern$Ques (java.base@21.0.3)
 646:           332           7968  sun.reflect.generics.tree.FormalTypeParameter (java.base@21.0.3)
 647:           329           7896  org.opensearch.common.collect.Tuple
 648:           488           7808  java.util.regex.Pattern$$Lambda/0x800000032 (java.base@21.0.3)
 649:           486           7776  org.opensearch.common.inject.ConstructorBindingImpl$Factory
 650:           486           7776  org.opensearch.common.inject.DefaultConstructionProxyFactory
 651:           323           7752  [Lorg.apache.lucene.util.AttributeSource$State;
 652:           323           7752  org.apache.lucene.analysis.Analyzer$TokenStreamComponents
 653:           322           7728  java.util.regex.Pattern$CharProperty (java.base@21.0.3)
 654:           192           7680  java.lang.Thread$FieldHolder (java.base@21.0.3)
 655:           160           7680  java.lang.invoke.BoundMethodHandle$Species_LLL (java.base@21.0.3)
 656:           160           7680  java.net.URLPermission (java.base@21.0.3)
 657:           318           7632  java.util.ImmutableCollections$Set12 (java.base@21.0.3)
 658:           237           7584  java.util.concurrent.Semaphore$FairSync (java.base@21.0.3)
 659:           237           7584  org.apache.lucene.store.NativeFSLockFactory$NativeFSLock
 660:           237           7584  sun.nio.ch.FileKey (java.base@21.0.3)
 661:           237           7584  sun.nio.ch.FileLockTable$FileLockReference (java.base@21.0.3)
 662:           470           7552  [Lorg.opensearch.index.mapper.DynamicTemplate;
 663:           118           7552  java.lang.invoke.BoundMethodHandle$Species_LLLLLLLL (java.base@21.0.3)
 664:           472           7552  jdk.proxy2.$Proxy32 (jdk.proxy2)
 665:           236           7552  org.apache.lucene.index.BufferedUpdates$DeletedTerms
 666:           236           7552  org.apache.lucene.index.BufferedUpdatesStream$FinishedSegments
 667:           236           7552  org.apache.lucene.index.DocumentsWriterPerThreadPool
 668:           236           7552  org.apache.lucene.index.IndexWriter$EventQueue
 669:           472           7552  org.apache.lucene.util.Counter$AtomicCounter
 670:           236           7552  org.apache.lucene.util.FrequencyTrackingRingBuffer
 671:           236           7552  org.opensearch.index.seqno.RetentionLeases
 672:           235           7520  org.apache.lucene.document.NumericDocValuesField
 673:           235           7520  org.apache.lucene.index.ShuffleForcedMergePolicy
 674:           235           7520  org.opensearch.action.support.replication.PendingReplicationActions
 675:           235           7520  org.opensearch.analysis.common.FingerprintAnalyzer
 676:           470           7520  org.opensearch.common.concurrent.CompletableContext$$Lambda/0x00007f21510fe008
 677:           470           7520  org.opensearch.common.settings.Setting$$Lambda/0x00007f21500be090
 678:           470           7520  org.opensearch.core.action.ActionListener$$Lambda/0x00007f21510fd518
 679:           235           7520  org.opensearch.env.NodeEnvironment$1
 680:           235           7520  org.opensearch.env.NodeEnvironment$InternalShardLock
 681:           235           7520  org.opensearch.index.cache.IndexCache
 682:           235           7520  org.opensearch.index.cache.bitset.ShardBitsetFilterCache
 683:           235           7520  org.opensearch.index.cache.request.ShardRequestCache
 684:           235           7520  org.opensearch.index.engine.Engine$IndexThrottle
 685:           235           7520  org.opensearch.index.engine.LiveVersionMap
 686:           235           7520  org.opensearch.index.mapper.DocCountFieldMapper
 687:           235           7520  org.opensearch.index.mapper.IdFieldMapper
 688:           235           7520  org.opensearch.index.mapper.IgnoredFieldMapper
 689:           235           7520  org.opensearch.index.mapper.IndexFieldMapper
 690:           235           7520  org.opensearch.index.mapper.RankFeatureMetaFieldMapper
 691:           235           7520  org.opensearch.index.mapper.SeqNoFieldMapper
 692:           235           7520  org.opensearch.index.mapper.VersionFieldMapper
 693:           235           7520  org.opensearch.index.seqno.LocalCheckpointTracker
 694:           235           7520  org.opensearch.index.shard.IndexShard$RefreshMetricUpdater
 695:           235           7520  org.opensearch.index.shard.ShardPath
 696:           235           7520  org.opensearch.index.similarity.SimilarityService
 697:           235           7520  org.opensearch.index.store.ByteSizeCachingDirectory$SizeAndModCount
 698:           235           7520  org.opensearch.index.translog.TranslogHeader
 699:           235           7520  org.opensearch.index.warmer.ShardIndexWarmerService
 700:           233           7456  org.apache.logging.log4j.core.config.plugins.processor.PluginEntry
 701:           154           7392  org.apache.lucene.analysis.CharArrayMap$UnmodifiableCharArrayMap
 702:             2           7320  [Ljava.lang.Character$UnicodeScript; (java.base@21.0.3)
 703:           130           7280  sun.util.calendar.ZoneInfo (java.base@21.0.3)
 704:           452           7232  org.opensearch.core.action.ActionListener$$Lambda/0x00007f21502941f8
 705:           452           7232  org.opensearch.core.action.ActionListener$$Lambda/0x00007f2150294418
 706:           450           7200  org.apache.lucene.analysis.CharArraySet
 707:           297           7128  javax.management.ImmutableDescriptor (java.management@21.0.3)
 708:           295           7080  java.util.regex.Pattern$StartS (java.base@21.0.3)
 709:           433           6928  org.opensearch.common.inject.spi.ProviderLookup$ProviderImpl
 710:             2           6912  [Lorg.jcodings.unicode.UnicodeCodeRange;
 711:           123           6888  java.net.SocketPermission (java.base@21.0.3)
 712:           215           6880  sun.nio.fs.UnixFileKey (java.base@21.0.3)
 713:           428           6848  org.opensearch.action.support.HandledTransportAction$TransportHandler
 714:            48           6792  [Ljava.lang.ClassValue$Entry; (java.base@21.0.3)
 715:           283           6792  org.opensearch.common.settings.Setting$$Lambda/0x00007f2150090238
 716:           277           6752  [Lcom.fasterxml.jackson.databind.JavaType;
 717:           204           6528  org.opensearch.sql.expression.function.BuiltinFunctionName
 718:           401           6416  io.netty.channel.VoidChannelPromise$1
 719:           401           6416  io.netty.channel.nio.AbstractNioChannel$1
 720:           401           6416  org.opensearch.transport.netty4.Netty4TcpChannel$$Lambda/0x00007f2151081a00
 721:           400           6400  io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle$1
 722:           399           6384  io.netty.channel.nio.AbstractNioByteChannel$1
 723:           266           6384  org.opensearch.core.common.io.stream.NamedWriteableRegistry$Entry
 724:           399           6384  org.opensearch.transport.InboundAggregator$$Lambda/0x00007f2151141420
 725:           399           6384  org.opensearch.transport.TcpTransport$$Lambda/0x00007f21510ffc50
 726:           399           6384  org.opensearch.transport.netty4.Netty4MessageChannelHandler$$Lambda/0x00007f21510fa760
 727:           399           6384  org.opensearch.transport.netty4.Netty4MessageChannelHandler$$Lambda/0x00007f21510fb420
 728:           399           6384  org.opensearch.transport.netty4.Netty4MessageChannelHandler$$Lambda/0x00007f2151144240
 729:           399           6384  org.opensearch.transport.netty4.Netty4Transport$$Lambda/0x00007f21510f8240
 730:           399           6384  sun.security.ssl.HandshakeHash$CacheOnlyHash (java.base@21.0.3)
 731:           397           6352  org.opensearch.sql.expression.function.FunctionDSL$$Lambda/0x00007f2150b353f8
 732:           158           6320  sun.reflect.generics.repository.MethodRepository (java.base@21.0.3)
 733:           195           6240  com.sun.jmx.mbeanserver.ConvertingMethod (java.management@21.0.3)
 734:           254           6096  org.opensearch.security.support.WildcardMatcher$MatcherCombiner
 735:           253           6072  com.google.common.collect.ImmutableMapEntry
 736:           249           5976  org.opensearch.index.fielddata.plain.SortedSetBytesLeafFieldData
 737:           106           5936  java.nio.HeapCharBuffer (java.base@21.0.3)
 738:           185           5920  org.opensearch.painless.lookup.PainlessConstructor
 739:            41           5904  org.opensearch.repositories.s3.async.SizeBasedBlockingQ$Consumer
 740:             3           5744  [[Lorg.opensearch.search.aggregations.bucket.terms.StringTerms$Bucket;
 741:           237           5688  org.apache.lucene.search.similarities.BM25Similarity
 742:           237           5688  org.opensearch.index.LogByteSizeMergePolicyProvider
 743:           237           5688  org.opensearch.index.MergeSchedulerConfig
 744:           237           5688  org.opensearch.index.TieredMergePolicyProvider
 745:           237           5688  org.opensearch.index.remote.RemoteStorePathStrategy
 746:           237           5688  sun.nio.ch.FileLockTable (java.base@21.0.3)
 747:           236           5664  org.apache.lucene.index.ConcurrentApproximatePriorityQueue
 748:           236           5664  org.apache.lucene.index.DocumentsWriterDeleteQueue$DeleteSlice
...
Total      26089884    33026508368

Also, I'm assuming you are not running any painless scripts.

No, the only thing running in the cluster is this query, which is basically a nested terms aggregation (with huge sizes) plus a date histogram.

kkhatua commented 4 days ago

@Pigueiras, we'll need to dig deeper. This is a good start, but we'll need multiple histogram dumps at regular intervals to see what is growing rapidly. Looking at the metrics shared here (https://github.com/opensearch-project/OpenSearch/issues/15413#issuecomment-2323055937), my guess is that the backpressure module has a very narrow window of time to detect this. Adding @kaushalmahi12 and @sgup432 to see if they have some tuning suggestions after they're done with the 2.17 release items. In the meantime, it might be a good idea to take histogram and hot_thread dumps at a fixed interval, so that between successive samples we can see which new objects are rapidly accumulating and which operations were actively running.
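
A minimal sketch of such a fixed-interval collector, in Python. It assumes `jmap` is on the PATH and is run as the same user as the OpenSearch JVM, and that the node answers `GET _nodes/hot_threads` over plain HTTP without authentication; `pid`, `base_url`, `out_dir` and the `.histo` / `.hot_threads` file suffixes are hypothetical names chosen for illustration. A secured cluster would need credentials and TLS added to the HTTP call.

```python
import subprocess
import time
import urllib.request
from datetime import datetime
from pathlib import Path


def stamp(now: datetime) -> str:
    """Format a timestamp as HHMMSS.sss so both dumps of a sample share a name."""
    return now.strftime("%H%M%S.") + f"{now.microsecond // 1000:03d}"


def collect_once(pid: int, base_url: str, out_dir: Path) -> None:
    """Write one class histogram and one hot_threads dump."""
    name = stamp(datetime.now())
    # Plain `-histo` (without `:live`) does not force a full GC before counting.
    histo = subprocess.run(
        ["jmap", "-histo", str(pid)], capture_output=True, text=True, check=False
    )
    (out_dir / f"{name}.histo").write_text(histo.stdout)
    # The hot threads API returns plain text, not JSON.
    with urllib.request.urlopen(f"{base_url}/_nodes/hot_threads", timeout=5) as resp:
        text = resp.read().decode("utf-8", "replace")
    (out_dir / f"{name}.hot_threads").write_text(text)


def collect_forever(pid: int, base_url: str, out_dir: Path, interval_s: float = 1.0) -> None:
    """Sample until interrupted, keeping roughly `interval_s` between samples."""
    out_dir.mkdir(parents=True, exist_ok=True)
    while True:
        started = time.monotonic()
        try:
            collect_once(pid, base_url, out_dir)
        except Exception as exc:  # a node close to OOM may stop answering
            print(f"sample failed: {exc}")
        time.sleep(max(0.0, interval_s - (time.monotonic() - started)))


# Hypothetical usage:
# collect_forever(12345, "http://localhost:9200", Path("dumps"))
```

Failures are logged rather than raised so that sampling continues for as long as possible while the node degrades.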

Pigueiras commented 4 days ago

@kkhatua I created this repository with a dump of the histogram and hot_thread approximately every second, from the beginning of a query until the OOME. The files are named in HHMMSS.sss format, and the OOME occurs at 22:13:44.

kkhatua commented 4 days ago

This will take some time, @Pigueiras. Looking at the metrics by just raw Java objects, there is clear growth in allocated bytes but no proportional growth in the number of instances:

===221250.998===
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:       9645603      555071536  [B (java.base@21.0.3)
   2:         20673      239992792  [Ljdk.internal.vm.FillerElement; (java.base@21.0.3)
   3:        778204      165436520  [J (java.base@21.0.3)
===221310.010===
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:        167791     8060421456  [Ljava.lang.Object; (java.base@21.0.3)
   2:         13020      807425672  [Ljdk.internal.vm.FillerElement; (java.base@21.0.3)
   3:       5477994      284541896  [B (java.base@21.0.3)
===221320.062===
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:        163015    16527807568  [Ljava.lang.Object; (java.base@21.0.3)
   2:         20182     1195993792  [Ljdk.internal.vm.FillerElement; (java.base@21.0.3)
   3:       5474736      282744968  [B (java.base@21.0.3)
===221330.131===
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:        164115    24883669456  [Ljava.lang.Object; (java.base@21.0.3)
   2:         24545     1569997016  [Ljdk.internal.vm.FillerElement; (java.base@21.0.3)
   3:       5478403      282984872  [B (java.base@21.0.3)
===221340.938===
 num     #instances         #bytes  class name (module)
-------------------------------------------------------
   1:        148297    30195499744  [Ljava.lang.Object; (java.base@21.0.3)
   2:       5448453      280126496  [B (java.base@21.0.3)
   3:        631937      148145208  [J (java.base@21.0.3)
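
To put numbers on that growth, here is a small Python sketch that takes the `[Ljava.lang.Object;` rows from the four samples above and derives the allocation rate between samples and the average array size. The values are copied from the histograms; the only assumption is that the sample names are wall-clock times in HHMMSS.sss within the same day.

```python
from datetime import datetime

# (sample name, #instances, #bytes) for [Ljava.lang.Object; in each histogram
SAMPLES = [
    ("221310.010", 167791, 8060421456),
    ("221320.062", 163015, 16527807568),
    ("221330.131", 164115, 24883669456),
    ("221340.938", 148297, 30195499744),
]


def seconds_of_day(name: str) -> float:
    """Convert an HHMMSS.sss sample name to seconds since midnight."""
    t = datetime.strptime(name, "%H%M%S.%f")
    return t.hour * 3600 + t.minute * 60 + t.second + t.microsecond / 1e6


def growth(samples):
    """Return (sample, MB allocated per second, average bytes per array) per interval."""
    rows = []
    for (n0, _, b0), (n1, i1, b1) in zip(samples, samples[1:]):
        dt = seconds_of_day(n1) - seconds_of_day(n0)
        rows.append((n1, (b1 - b0) / dt / 1e6, b1 / i1))
    return rows


for name, mb_per_s, avg in growth(SAMPLES):
    print(f"{name}: +{mb_per_s:,.0f} MB/s, {avg:,.0f} bytes per array on average")
```

With these figures the Object[] footprint grows by roughly 0.8 GB/s over the first two intervals while the instance count stays flat or drops, so it is the arrays themselves that keep getting larger.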

From the few usable hot_threads dumps, the stack shows sub-aggregations being built in nested calls:

...
       app//org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForBuckets(BucketsAggregator.java:220)
       app//org.opensearch.search.aggregations.bucket.BucketsAggregator.buildSubAggsForAllBuckets(BucketsAggregator.java:286)
...

It is still unclear why the Search Backpressure (SBP) module isn't detecting this, given that the allocations climb rapidly from 221310.010 to the last sample at 221340.938 (~30 sec). It might have to do with the sampling rate SBP uses, that is, how far it looks back before deciding that the node is in duress.

I believe you've already set this search_backpressure.node_duress.num_successive_breaches and can consider lowering search_backpressure.node_duress.heap_threshold. Will wait for others to chime in.

Pigueiras commented 3 days ago

This will take some time

No problem, I completely understand that I have only one issue and you have to handle many. Don’t feel obligated to answer quickly if I do 👍

I believe you've already set this search_backpressure.node_duress.num_successive_breaches and can consider lowering search_backpressure.node_duress.heap_threshold. Will wait for others to chime in.

I have tried with

PUT _cluster/settings
{
  "transient": {
    "search_backpressure": {
      "node_duress": {
        "heap_threshold": "0.0001"
      }
    }
  }
}

With this, the nodes should always be considered under duress, yet I cannot see any logs about backpressure. The only SBP-related log lines are the ones emitted when the data node starts and the default values are changed:

image
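
Since the logs show nothing, one way to cross-check is the node stats API, `GET _nodes/stats/search_backpressure`. The Python sketch below fetches it and prints the reported mode and shard-task cancellation count per node. The response field names used here (`mode`, `search_shard_task`, `cancellation_stats`, `cancellation_count`) are assumptions based on the documented response shape and may differ between versions; `base_url` is hypothetical, and a secured cluster would need authentication added.

```python
import json
import urllib.request


def fetch_sbp_stats(base_url: str) -> dict:
    """Fetch search backpressure stats for all nodes."""
    url = f"{base_url}/_nodes/stats/search_backpressure"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize(stats: dict) -> list:
    """Reduce the stats response to mode and cancellation count per node."""
    rows = []
    for node_id, node in stats.get("nodes", {}).items():
        sbp = node.get("search_backpressure", {})
        # Field names below are assumed; missing keys simply yield None.
        shard = sbp.get("search_shard_task", {}).get("cancellation_stats", {})
        rows.append({
            "node": node.get("name", node_id),
            "mode": sbp.get("mode"),
            "shard_task_cancellations": shard.get("cancellation_count"),
        })
    return rows


# Hypothetical usage:
# for row in summarize(fetch_sbp_stats("http://localhost:9200")):
#     print(row)
```

If a node reports `monitor_only` as its mode, SBP would be tracking resource usage without cancelling anything, so it is worth confirming that `search_backpressure.mode` is set to `enforced` before tuning thresholds further.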