Open jfreden opened 6 months ago
Pinging @elastic/es-core-infra (Team:Core/Infra)
This test is failing because `search.threads.queue.size` differs between the threadpool stats and the metric.
[0006-05-03T15:16:51,033][INFO ][o.e.t.SimpleThreadPoolIT ] [testThreadPoolMetrics] Stats of `search`: {search.threads.active.current=0, search.threads.completed.total=475, search.threads.count.current=7, search.threads.largest.current=7, search.threads.queue.size=1}
[0006-05-03T15:16:51,033][INFO ][o.e.t.SimpleThreadPoolIT ] [testThreadPoolMetrics] Measurements of `search`: {search.threads.active.current=[0], search.threads.completed.total=[476], search.threads.count.current=[7], search.threads.largest.current=[7], search.threads.queue.size=[0]}
We wait for the threadpool stats to report that there is no active thread. A line later we collect the metric measurements. I suspect that the pool's state can still change in that moment (for example, a newly submitted task being picked up), hence the threadpool stat reports a queue size of 1 while the metric measurement reports 0, as in the logs above.
@mosche wdyt? you worked on hardening this test before
I wonder if there is a way to reliably and gently shut down a threadpool or EsIntegTest so that we have a 'frozen' ES node that we can assert on.
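One way to approximate a 'frozen' reading without shutting anything down is to read the stats, then the metrics, then the stats again, and only accept the result when the pool is idle and all three readings agree. Below is a minimal sketch of that idea, assuming a plain `java.util.concurrent.ThreadPoolExecutor` as a stand-in for the `search` pool. It does not use Elasticsearch's `ThreadPool` or the APM metrics API, and `StableSnapshotSketch`, `Snapshot`, `awaitConsistent` and `runAndObserve` are hypothetical names made up for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

class StableSnapshotSketch {

    /** Values read from the pool at one moment; stands in for one stats or metrics reading. */
    record Snapshot(int active, int queueSize, long completed) {
        boolean idle() {
            return active == 0 && queueSize == 0;
        }
    }

    static Snapshot read(ThreadPoolExecutor pool) {
        return new Snapshot(pool.getActiveCount(), pool.getQueue().size(), pool.getCompletedTaskCount());
    }

    /**
     * Reads stats, then metrics, then stats again. Accepts the result only if the pool
     * was idle and all three readings are equal; otherwise sleeps briefly and retries.
     */
    static Snapshot awaitConsistent(Supplier<Snapshot> stats, Supplier<Snapshot> metrics, long timeoutMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (true) {
            Snapshot before = stats.get();
            Snapshot measured = metrics.get();
            Snapshot after = stats.get();
            if (before.idle() && before.equals(measured) && before.equals(after)) {
                return before;
            }
            if (System.nanoTime() - deadline > 0) {
                throw new AssertionError("no consistent reading: " + before + " / " + measured + " / " + after);
            }
            sleepQuietly(5);
        }
    }

    static void sleepQuietly(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
    }

    /** Submits short tasks, waits until all of them have run, then takes a consistent reading. */
    static Snapshot runAndObserve(int tasks) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(2, 2, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        try {
            CountDownLatch done = new CountDownLatch(tasks);
            for (int i = 0; i < tasks; i++) {
                pool.execute(done::countDown);
            }
            try {
                if (!done.await(5, TimeUnit.SECONDS)) {
                    throw new AssertionError("tasks did not finish in time");
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
            return awaitConsistent(() -> read(pool), () -> read(pool), 5_000);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndObserve(20));
    }
}
```

The double read only narrows the window in which the two sources can diverge; it does not freeze the node. In a real integration test, background activity could still submit work between retries, so the timeout and the retry interval here are arbitrary choices, not recommendations.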
I see a few options:
The test is failing due to a discrepancy between the threadpool stats and the APM metrics. This is most likely a timing issue, so assessing the risk as low.
This has been muted on branch main
Mute Reasons:
Build Scans:
This has been muted on branch 8.x
Mute Reasons:
Build Scans:
This issue has been closed because it has been open for too long with no activity.
Any muted tests that were associated with this issue have been unmuted.
If the tests begin failing again, a new issue will be opened, and they may be muted again.
This has been muted on branch main
Mute Reasons:
Build Scans:
This has been muted on branch 8.x
Mute Reasons:
Build Scans:
This has been muted on branch 8.16
Mute Reasons:
Build Scans:
Build scan: https://gradle-enterprise.elastic.co/s/yebknro47qsu4/tests/:server:internalClusterTest/org.elasticsearch.threadpool.SimpleThreadPoolIT/testThreadPoolMetrics
Reproduction line:
Applicable branches: main
Reproduces locally?: Didn't try
Failure history: Failure dashboard for org.elasticsearch.threadpool.SimpleThreadPoolIT#testThreadPoolMetrics
Failure excerpt: