Closed gwbrown closed 6 months ago
Pinging @elastic/es-data-management (Team:Data Management)
I see this in the log:
1> [2024-01-17T18:16:52,768][INFO ][o.e.s.WatcherYamlRestIT ] [test] expected 0 active watches, but got [1], deleting watcher indices again
WatcherYamlRestIT attempts to delete all watches before the tests start, so $watch_count_active ought to be 0. We can see from the failure message, though, that it is 1. The delete request blocks, and there's no message that it failed. So I'm wondering if the problem is just that we need a refresh in here.
That was a dead end -- TransportDeleteWatchAction already sets refresh to IMMEDIATE.
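For reference, the cleanup check behind that log line amounts to a delete, verify, retry loop. A minimal Java sketch, assuming hypothetical names (`WatchCleanupSketch`, `cleanUntilEmpty`) and stand-in callbacks rather than the real test code:

```java
import java.util.function.IntSupplier;

/** Hypothetical sketch of a "delete, then verify, then retry" cleanup; not the actual test code. */
final class WatchCleanupSketch {

    /**
     * Runs deleteAll, then reads activeCount, repeating up to maxAttempts times.
     * Returns the number of attempts used, or -1 if the count never reached 0.
     */
    static int cleanUntilEmpty(Runnable deleteAll, IntSupplier activeCount, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            deleteAll.run();
            int active = activeCount.getAsInt();
            if (active == 0) {
                return attempt;
            }
            // Mirrors the shape of the log message seen in the failure.
            System.out.printf("expected 0 active watches, but got [%d], deleting again%n", active);
        }
        return -1;
    }

    public static void main(String[] args) {
        // Fake backend (assumption): each delete call removes one watch from the count.
        int[] active = {2};
        int attempts = cleanUntilEmpty(() -> active[0] = Math.max(0, active[0] - 1), () -> active[0], 5);
        System.out.println("attempts used: " + attempts);
    }
}
```

The point of the sketch is only that a successful, blocking delete followed by a non-zero count means the count is coming from somewhere other than the thing that was deleted.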
Looking a little more, I don't think that the refresh policy matters anyway -- watch_count_active is set from the xpack usage API. And from what I can tell, this comes from TriggerService, which is notified to remove the watch when the delete is made:
org.elasticsearch.xpack.watcher.trigger.TriggerService.remove(TriggerService.java:159)
org.elasticsearch.xpack.watcher.WatcherIndexingListener.preDelete(WatcherIndexingListener.java:183)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.index.shard.IndexingOperationListener$CompositeListener.preDelete(IndexingOperationListener.java:119)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.index.shard.IndexShard.applyDeleteOperation(IndexShard.java:1195)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.index.shard.IndexShard.applyDeleteOperationOnPrimary(IndexShard.java:1153)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.action.bulk.TransportShardBulkAction.executeBulkItemRequest(TransportShardBulkAction.java:345)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.action.bulk.TransportShardBulkAction$2.doRun(TransportShardBulkAction.java:226)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.action.bulk.TransportShardBulkAction.performOnPrimary(TransportShardBulkAction.java:293)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:144)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.action.bulk.TransportShardBulkAction.dispatchedShardOperationOnPrimary(TransportShardBulkAction.java:76)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.action.support.replication.TransportWriteAction$1.doRun(TransportWriteAction.java:216)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:33)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:984)
org.elasticsearch.server@8.13.0-SNAPSHOT/org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
java.base/java.lang.Thread.run(Thread.java:1583)
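To make the reasoning above concrete: if the active count is read from in-memory trigger state that a pre-delete hook updates, then index refresh timing cannot affect it. The toy model below is an illustration under that assumption, not Elasticsearch code; `ActiveWatchModel` and all of its methods are hypothetical stand-ins.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Toy model only: names and behavior are assumptions for illustration. */
final class ActiveWatchModel {
    // Stand-in for the in-memory set of scheduled watches (the role TriggerService plays in the trace).
    private final Set<String> activeTriggers = ConcurrentHashMap.newKeySet();
    // Stand-in for documents visible to search; deletes only show up here after refresh().
    private final Set<String> searchableDocs = ConcurrentHashMap.newKeySet();
    private final Set<String> pendingDeletes = ConcurrentHashMap.newKeySet();

    void putWatch(String id) {
        activeTriggers.add(id);
        searchableDocs.add(id);
    }

    /** Delete path: the pre-delete hook runs before the delete is recorded against the index. */
    void deleteWatch(String id) {
        preDelete(id);
        pendingDeletes.add(id);
    }

    // Updates in-memory trigger state immediately, independent of any refresh.
    private void preDelete(String id) {
        activeTriggers.remove(id);
    }

    void refresh() {
        searchableDocs.removeAll(pendingDeletes);
        pendingDeletes.clear();
    }

    int activeCount() { return activeTriggers.size(); }

    int searchableCount() { return searchableDocs.size(); }

    public static void main(String[] args) {
        ActiveWatchModel model = new ActiveWatchModel();
        model.putWatch("w1");
        model.deleteWatch("w1");
        // The active count drops before any refresh; the searchable view lags until refresh().
        System.out.println("active=" + model.activeCount() + " searchable=" + model.searchableCount());
        model.refresh();
        System.out.println("active=" + model.activeCount() + " searchable=" + model.searchableCount());
    }
}
```

In this model the usage-style count is already 0 right after the delete returns, so a stray count of 1 would have to come from some other path; the sketch does not say which.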
Duplicate of #65547.
Test has failed a few times in the past 30d, most concerning message is:
Build scan: https://gradle-enterprise.elastic.co/s/keww7jvdbenyg/tests/:x-pack:plugin:watcher:qa:rest:yamlRestTestV7CompatTest/org.elasticsearch.smoketest.WatcherYamlRestIT/test%20%7Bp0=watcher%2Fusage%2F10_basic%2FTest%20watcher%20usage%20stats%20output%7D
Reproduction line:
Applicable branches: 8.12, main
Reproduces locally?: No
Failure history: Failure dashboard for org.elasticsearch.smoketest.WatcherYamlRestIT#test {p0=watcher/usage/10_basic/Test watcher usage stats output}
Failure excerpt: