Closed davidkyle closed 6 months ago
Pinging @elastic/es-distributed (Team:Distributed)
This seems like a test issue which may or may not be related to https://github.com/elastic/elasticsearch/pull/102516. But I think this is for delivery team maybe. Feel free to relabel if that's not the case.
Pinging @elastic/es-delivery (Team:Delivery)
I don't have much insight into the internal test cluster fixtures. This is an area with fuzzy ownership to be honest but I think core/infra is the most appropriate.
Pinging @elastic/es-core-infra (Team:Core/Infra)
I investigated this a bit, but could not come to any conclusions. The assertion is cryptic, but it results from a MockLogAppender not being started when a message is appended to it. Yet that check happens within log4j (AppenderControl), and goes to the status logger. So we don't have any control or ability to fail harder when this happens to find where it is actually occurring.
My only thought for a potential fix is to always start MockLogAppender instances within the constructor. I don't see a downside to this, but perhaps I'm missing something.
I think it might just not be threadsafe to remove appenders from loggers and stop the appender while other threads might be logging stuff. The following test pretty reliably fails for me:
public void testConcurrentLogAndLifecycle() throws Exception {
final var keepGoing = new AtomicBoolean(true);
final var logThread = new Thread(() -> {
while (keepGoing.get()) {
logger.info("test");
}
});
logThread.start();
final var appender = new MockLogAppender();
for (int i = 0; i < 1000; i++) {
try (var ignored = appender.capturing(MockLogAppenderTests.class)) {
Thread.yield();
}
}
keepGoing.set(false);
logThread.join();
}
Thanks @DaveCTurner, you are right, it appears modifying appenders is simply not threadsafe. I have an idea which seems to work: adding a single appender during test static init (much like some other appenders we add in ESTestCase). That single appender internally keeps track of mapping between loggers and expections in a threadsafe way. Witht hat change I was able to get your example test above to pass consistently over many runs.
However, there are a lot of places adding the MockLogAppender to appenders directly which needs to be addressed first. I've opened https://github.com/elastic/elasticsearch/pull/108114 to do that.
Build scan: https://gradle-enterprise.elastic.co/s/a5dvvhunctzui/tests/:server:internalClusterTest/org.elasticsearch.discovery.single.SingleNodeDiscoveryIT/testCannotJoinNodeWithSingleNodeDiscovery
Reproduction line:
Applicable branches: main
Reproduces locally?: Didn't try
Failure history: Failure dashboard for
org.elasticsearch.discovery.single.SingleNodeDiscoveryIT#testCannotJoinNodeWithSingleNodeDiscovery
&_a=(controlGroupInput:(chainingSystem:HIERARCHICAL,controlStyle:twoLine,ignoreParentSettings:(ignoreFilters:!f,ignoreQuery:!f,ignoreTimerange:!f,ignoreValidations:!t),panels:('0c0c9cb8-ccd2-45c6-9b13-96bac4abc542':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:task.keyword,grow:!t,id:'0c0c9cb8-ccd2-45c6-9b13-96bac4abc542',searchTechnique:wildcard,selectedOptions:!(),singleSelect:!t,title:'Gradle%20Task',width:medium),grow:!t,order:0,type:optionsListControl,width:small),'144933da-5c1b-4257-a969-7f43455a7901':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:name.keyword,grow:!t,id:'144933da-5c1b-4257-a969-7f43455a7901',searchTechnique:wildcard,selectedOptions:!('testCannotJoinNodeWithSingleNodeDiscovery'),title:Test,width:medium),grow:!t,order:2,type:optionsListControl,width:medium),'4e6ad9d6-6fdc-4fcc-bf1a-aa6ca79e0850':(explicitInput:(dataViewId:fbbdc689-be23-4b3d-8057-aa402e9ed0c5,enhancements:(),fieldName:className.keyword,grow:!t,id:'4e6ad9d6-6fdc-4fcc-bf1a-aa6ca79e0850',searchTechnique:wildcard,selectedOptions:!('org.elasticsearch.discovery.single.SingleNodeDiscoveryIT'),title:Suite,width:medium),grow:!t,order:1,type:optionsListControl,width:medium)))))Failure excerpt: