rfoltyns / log4j2-elasticsearch

Log4j2 Elasticsearch Appender plugins
Apache License 2.0
174 stars 46 forks source link

GenericItemSourcePool.unlatchAndResetResizing might cause an error #96

Open ol-loginov opened 3 months ago

ol-loginov commented 3 months ago

Description I adapt your code to my needs. I have a fork in private repository and mostly trying to use spring-boot-dependcies to manage with jar hell in dependent projects.

So, I've met some tests failing. GenericItemSourcePoolTest.multipleThreadsGetPooledWhenResizePolicyEventuallyCopeWithTheLoad.

After investigation I can report some notes about implementation. I'd change (and did locally) method GenericItemSourcePool.unlatchAndResetResizing to this variant:

    // you cannot count down latch while resizing is true (in progress). see comments in #awaitResize method 
    private final Consumer<Boolean> unlatchAndResetResizing = result -> {
        var latch = this.countDownLatch.get(); 
        resizing.set(false);
        latch.countDown();
    };

And my comments for awaitResize method:

        // Here some notes about unlatchAndResetResizing. It will fail if another thread is superfast. Look at the following use case:
        // [Thread A] starts resizing (set resizing to true and adds latch, for instance named "Latch1"). The process begins
        // [Thread B] got resizing=true, pass to await - take Latch1 and wait for timeout
        // [Thread A] finish the process and begins "unlatchAndResetResizing" - down Latch1, and ...
        // [Thread B] Latch1 is free! go down to resize again. resizing is still true, so take the latch (latch is still Latch1), awaits it. Latch1 is free! go down to resize again, etc.. And get `(depth > maxRetries)` error
        // [Thread A] ... and we are setting `resizing` to false. At last!

        // let's allow only one thread to get in
        if (resizing.compareAndSet(false, true)) {
            countDownLatch.set(new CountDownLatch(1));
            return resize(unlatchAndResetResizing);
        }

Configuration Just development environment. JDK11 and Maven, no more

Runtime (please complete the following information):

rfoltyns commented 3 months ago

Thank you for reporting this! :pray:

You're right, there is a race there.

I'll try to come up with a fix once I replace this test with a more readable and reliable one.