eclipse-aspectj / aspectj

Other
272 stars 82 forks source link

Inheritable thread-local closure stack in JoinPointImpl causing memory leak in spawned threads (e.g. thread pool) #288

Closed pagrawalgit closed 3 months ago

pagrawalgit commented 4 months ago

A few of our company production services saw an increase in HeapAfterGCUse metric after upgrading aspectJ from 1.8.14 to 1.9.9.1. We also tried upgrading to aspectj 1.9.21.1 but we are seeing the same behavior. The increase was significant for one of our services 7GB increase (32GB total heap size).

Upon investigation the team found the increase to be related to the AspectJ commit which causes AjcClosure objects to retain in heap memory in some scenarios.

For the impacted services, following conditions hold true.

  1. You have a method with 2 or more aspect annotations (regardless of what the aspect annotations are).
  2. You spawn a new thread after you enter the method (either from the method itself or somewhere later as part of the call stack).
  3. The spawned thread is long lived (for e.g. a thread in a Threadpool).

The heap dump showed increase in (retained) AjcClosure objects referenced by Thread objects in inheritableThreadLocals. We came across this aspectJ commit which introduced InheritableThreadLocalAroundClosureStack.

We do not see this (heap memory increase) issue with 1.9.8 aspectJ version.

Code imitating our use case (also reproduces the issue):

As you can see below, Demo class spawns a new thread inline. In our use case, joinPoint.proceed() call is NOT async. We use a long living thread in a threadpool which is used to run the inline submitted thread by Demo class.

The Demo$AjcClosure classes (as part of the AroundClosure stack) are copied into inheritableThreadLocal of the spawned thread but never released (as seen in the attached heap snapshot)

package com.demoapp;

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class Main {
    public static void main(String[] args) throws Exception {
        //Create a Threadpool
        final ExecutorService executorService = Executors.newFixedThreadPool(1);

        // Go through advice.
        final Demo demo = new Demo(executorService);
        demo.doSomething();

        System.gc(); // Best effort GC
        Thread.sleep(500000); // Sleep to take a Heap dump
    }
}
package com.demoapp;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/**
 * Demo class which has a method with 2 aspect annotations. This method also spawns a new Thread.
 * This Thread will copy inheritableThreadLocal value and leak memory.
 */
public class Demo {
    final ExecutorService taskManager;

    /**
     * Constructor.
     *
     * @param executorService - ExecutorService to spawn a new Thread.
     */
    public Demo(final ExecutorService executorService) {
        this.taskManager = executorService;
    }

    /**
     * Method surrounded to spown a new Thread.
     */
    @DemoAspect1.DemoAnnotation1
    @DemoAspect2.DemoAnnotation2
    public void doSomething() throws ExecutionException, InterruptedException {
        final Future future = taskManager.submit(() -> {
            System.out.println("Hello");
        });
        future.get();
    }
}
package com.demoapp;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class DemoAspect1 {

    public DemoAspect1() {
    }

    @Around("execution(* *(..)) && @annotation(DemoAspect1.DemoAnnotation1)")
    public Object doAround(ProceedingJoinPoint jointPoint) throws Throwable {
        System.out.println("DemoAspect1.doAround");
        return jointPoint.proceed();
    }

    public @interface DemoAnnotation1 {
    }
}
package com.demoapp;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

@Aspect
public class DemoAspect2 {
    public DemoAspect2() {
    }

    @Around("execution(* *(..)) && @annotation(DemoAspect2.DemoAnnotation2)")
    public Object doAround(ProceedingJoinPoint jointPoint) throws Throwable {
        System.out.println("DemoAspect2.doAround");
        return jointPoint.proceed();
    }

    public @interface DemoAnnotation2 {
    }
}

Heap Dump (after Full GC)

Screenshot 2024-02-27 at 3 35 43 PM
kriegaex commented 4 months ago

I made an experimental fix, but I did not find a good way to test this automatically yet. Even for manual tests, there is no easy way. Therefore, I need feedback from you. Please try the latest 1.9.22-SNAPSHOT (even though probably it is rather going to be in a 1.9.21.2 release, if it works) from this repository:

<repositories>
    <repository>
        <id>ossrh-snapshots</id>
        <url>https://oss.sonatype.org/content/repositories/snapshots</url>
        <releases>
            <enabled>false</enabled>
        </releases>
        <snapshots>
            <enabled>true</enabled>
            <updatePolicy>always</updatePolicy>
        </snapshots>
    </repository>
</repositories>

If you or your colleagues can provide a better way to test this, please let me know. I do not have enough free cycles to play with this as long as I would like to.

pagrawalgit commented 4 months ago

Thanks @kriegaex for such a quick response and fix. I'll test it today and get back to you. Thanks.

pagrawalgit commented 4 months ago

@kriegaex I quickly tested the fix but it does not fix our issue.

Screenshot 2024-03-02 at 10 53 53 AM

The change that you made would only clear the Stack for the spawning thread but the spawned thread would still hold the copied stack right?

kriegaex commented 4 months ago

You said, the problem occurs independent of number of threads. I did not have much time for looking into this. I would really appreciate a failing test which would pass with a correct fix in place. So, please provide one or multiple automated tests. Thanks in advance.

pagrawalgit commented 4 months ago

@kriegaex I'll try to write one. Just to be clear, the problem for us is the spawned (spawned from the threadpool) thread holding up the memory in inhertiableThreadLocal.

kriegaex commented 4 months ago

Thank you, @pagrawalgit. Whatever I do when running your program with AspectJ 1.9.21.1 on JDK 21 or 1.9.9.1 on JDK 8, in my memory dumps the threads all have inheritableThreadLocals values of null.

kriegaex commented 4 months ago

One more question: Are you using compile-time or load-time weaving? I am trying to reproduce your problem with CTW.

kriegaex commented 4 months ago

Thank you, @pagrawalgit. Whatever I do when running your program with AspectJ 1.9.21.1 on JDK 21 or 1.9.9.1 on JDK 8, in my memory dumps the threads all have inheritableThreadLocals values of null.

OK, the around advices were inlined automatically in your simple example with default compiler options. I had to use -XnoInline to get the relevant method JoinPointImpl::stack$AroundClosure called at all. But now, at least I can see the effect you described in the memory dumps. That is a start.

pagrawalgit commented 4 months ago

Thanks @kriegaex . I should have mentioned this earlier, we are using load time weaving. I'll post code with automated tests this afternoon EST.

pagrawalgit commented 4 months ago

@kriegaex

I am using JDK17. I ran tests using IntelliJ with junit 5. I am assuming <=8GB (xmx) of java heap memory allocated in one of the tests.

Command used for LTW:

-ea
-Dorg.aspectj.weaver.loadtime.configuration=META-INF/aop.xml
--add-opens=java.base/java.lang=ALL-UNNAMED
-javaagent:[PATH_TO_ASPECTJWEAVER]

Note: These are a couple of quick and dirty tests I wrote. Both of these pass with 1.9.8 and fail with later versions. If it makes it easier I can push the code to a git repo.

Class DemoAspect1 and DemoAspect2 are unchanged. I deleted the Main class.

package com.demoapp;

import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;

/**
 * Demo class which has a method with 2 aspect annotations. This method also spawns a new Thread.
 * This Thread will copy inheritableThreadLocal value and leak memory.
 */
public class Demo {
    final ExecutorService taskManager;
    final byte[] someBigData = new byte[1024 * 1024 *1024]; //1GB worth of data.

    /**
     * Constructor.
     *
     * @param executorService - ExecutorService to spawn a new Thread.
     */
    public Demo(final ExecutorService executorService) {
        this.taskManager = executorService;
    }

    /**
     * Method surrounded to spown a new Thread.
     */
    @DemoAspect1.DemoAnnotation1
    @DemoAspect2.DemoAnnotation2
    public void doSomething() throws ExecutionException, InterruptedException {
        final Future future = taskManager.submit(() -> {
            System.out.println("Hello");
        });
        future.get();
    }
}
import com.demoapp.Demo;
import org.junit.jupiter.api.Assertions;

import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class TestMemoryLeak {

    /**
     *
     * @param executorServiceList - list of threadpools.
     *
     */
    private void testAspectJLeak(final List<ExecutorService> executorServiceList) throws Exception {
        //Create a Threadpool list to keep the Threads alive until completion.

        for (int i = 0; i < executorServiceList.size() ; i++) {
            final Demo demo = new Demo(executorServiceList.get(i));
            demo.doSomething();
        }

        System.gc(); // Best effort GC
        //Thread.sleep(500000); // Sleep to take a Heap dump
        System.out.println("Done");
    }

    /**
     *
     * Tests that the inheritableThreadLocals of the spawned threads are either null or contain all null elements.
     */
    @org.junit.jupiter.api.Test
    public void testNoMemoryLeak_InheritableThreadLocalCleared() throws Exception {
        final int numThreadPools = 1;
        // Create a new threadpool for every thread to be spawned, to make sure that the threads gets
        // initialized for the first time. This allows the spawned thread to copy data into inheritableThreadLocals.
        final List<ExecutorService> executorServiceList = getThreadPools(numThreadPools);
        //run test
        testAspectJLeak(executorServiceList);

        // Verify if the spawned threads have null inheritableThreadLocals after completion.
        final Set<Thread> threads  = Thread.getAllStackTraces().keySet();

        threads.stream()
                // Could further disambiguate by adding custom thread naming.
                .filter(thread -> thread.getName().contains("pool"))
                .forEach(thread -> {
                    try {
                        final Field mapField = Thread.class.getDeclaredField("inheritableThreadLocals");
                        mapField.setAccessible(true);
                        final Object o = mapField.get(thread);

                        if (o != null) {
                            final Field tableField = o.getClass().getDeclaredField("table");
                            tableField.setAccessible(true);

                            final Object[] inheritableThreadLocalTable = (Object[]) tableField.get(o);
                            if (inheritableThreadLocalTable != null) {
                                for (Object entry : inheritableThreadLocalTable) {
                                    // every entry should be null.
                                    Assertions.assertNull(entry);
                                }
                            }
                        }
                    } catch (Exception e) {
                        throw new RuntimeException(e);
                    }
                });

        System.out.println("Test passed");
    }

    /**
     *
     * Spawns 10 threads in separate threadpools to make sure each thread when spawned gets initialized for
     * the first time and allocates new inheritableThreadLocals memory.
     *
     * Since each thread will hold up memory (memory leak case), the program will run of memory.
     *
     * The fix would eliminate OOM.
     */
    @org.junit.jupiter.api.Test
    public void testNoMemoryLeak_SystemShouldNotRunOutOfMemory() throws Exception{
        final int numThreadPools = 10;
        final List<ExecutorService> executorServiceList = getThreadPools(numThreadPools);
        testAspectJLeak(executorServiceList);
        System.out.println("Test passed");
    }

    /**
     * Creates 1 thread per Threadpool.
     *
     * @param numThreadPools - Number of threadpools to create.
     *
     */
    private List<ExecutorService> getThreadPools(int numThreadPools) {
        final List<ExecutorService> executorServiceList = new ArrayList<>();
        for (int i = 0 ; i < numThreadPools; i++) {
            // Could further disambiguate by adding custom thread naming.
            executorServiceList.add(Executors.newFixedThreadPool(1));
        }
        return executorServiceList;
    }
}
kriegaex commented 4 months ago
final Field mapField = Thread.class.getDeclaredField("inheritableThreadLocals");
mapField.setAccessible(true);
final Object o = mapField.get(thread);

This looks quasi identical to the code I was using in my own impromptu test code. 😉 Like I said, where I was stuck until yesterday was that I could not reproduce the problem with CTW, but now I can with the additional compiler switch.

Your test code comes in handy, and I guess that at least part of it might end up in the AspectJ test bed, after I have an idea how to fix the problem. Currently, I do not, but I am also busy with non-AspectJ matters.

pagrawalgit commented 4 months ago

Thanks @kriegaex

kriegaex commented 4 months ago

I still do not have any good idea how to fix this, because non-inlined annotation-style aspects do need to keep around-closure information due to the possibility of multiple proceed calls in the same advice or proceed happening in a spawned thread. For native syntax aspects, it works differently, because the AspectJ compiler generates advice methods that directly take closures as parameters, while annotation-style advices either have no parameters or a proceeding join point instance. I.e., in this case the around closures must be bound and unbound manually. As AspectJ has no way of knowing if a long-lived thread might eventually issue a proceed call and therefore need the closure information, we cannot just discard it.

I could revert 3c80a36527, but then asynchronous nested proceeds would fail to work again. Currently, it looks like a choice between the plague and cholera, as we say in German.

The suggested solution for now is to simply use native syntax aspects and the AspectJ compiler to compile them. Something like this:

public aspect DemoAspect2Native {
  Object around() : execution(* *(..)) && @annotation(DemoAnnotation2) {
    System.out.println("DemoAspect2Native around");
    return proceed();
  }

  public @interface DemoAnnotation2 {}
}

Then, your problem should go away. I tried with your out-of-memory test, and it passes.

I am keeping this issue open. BTW, it is related to #141. If I have an idea how to solve that one, probably this one would also be solved.

pagrawalgit commented 4 months ago

I'll read up on native syntax aspects vs non-inlined annotation-style aspects.

kriegaex commented 4 months ago

I have just deployed an alternative experimental fix to the snapshot repository. It makes your tests pass and also seems to fix #141 in one shot. The strategy is to use a list instead of a stack and never to pop any around closure elements from it. But at least, it is not thread-local. Instead, what is thread-local is an index for the current element, i.e. what leaks now are only integer values and the stack of around closures, until the joinpoint instance goes out of scope, which I guess is fine. It is not perfect, but hopefully better than the status quo. Please re-test. Thank you.

kriegaex commented 4 months ago

Force-pushed regression tests without fix. I want to see them fail in this CI build first before pushing the fix on top of them.

Thanks to @pagrawalgit for the code contribution. I adapted the test to instead of a JUnit test, because that better fits the AspectJ integration test structure, but both test cases are basically what you suggested.


Update: The two regression tests expected to fail actually do fail. Fine. 👍

pagrawalgit commented 4 months ago

Thanks @kriegaex. I'll test this today.

kriegaex commented 4 months ago

This CI build contains the fix. All tests should pass for it. If they do, I will deploy another snapshot, and if it also works for @pagrawalgit in his application, I can merge it.


Update: That build passed, including the reactivated test for #141. The fix seems to be a substantial improvement and fix two problems in one shot.

kriegaex commented 4 months ago

If you want to test, I just deployed commit 4414b42e3d with the polished fix.

kriegaex commented 4 months ago

@pagrawalgit, sorry for pushing this, but my time window for creating an AspectJ maintenance release is closing. Soon, I will be busy with non-AspectJ stuff again. I would like this to be in, but not without your team's re-test feedback.

pagrawalgit commented 4 months ago

@kriegaex I tested it locally with the demo application I have. The tests pass.

Testing with the actual service will take me a day or two. Feel free to push the commit. Thanks for getting this out so quickly.

kriegaex commented 3 months ago

@pagrawalgit: ping!

pagrawalgit commented 3 months ago

It works!!

pagrawalgit commented 3 months ago

@kriegaex Thanks for the commit. So, the fix would be part of 1.9.21.2 release? When are you planning to release it?

kriegaex commented 3 months ago

@pagrawalgit, AspectJ 1.9.21.2 has been released today.